ocaml / dune

A composable build system for OCaml.
https://dune.build/
MIT License
1.6k stars 395 forks source link

Displaying full filenames in dune-site before load #8231

Open maroneze opened 1 year ago

maroneze commented 1 year ago

Desired Behavior

I'd like dune-site to have more fine-grained ways to obtain debugging/logging information about which files it is loading.

More specifically, I'm trying to debug why Frama-C (WP plug-in in particular) fails to load when using a Cygwin-compiled OCaml (without MinGW), while it works when a MinGW-based installation is present. Currently, all I have is the message:

Dynlink.Error (Dynlink.Cannot_open_dll "Dynlink.Error (Dynlink.Cannot_open_dll \"Failure(\\\"Permission denied\\\")\")")

But I have no idea which file is causing this, and I don't see an easy way to get this information. I tried using strace, and it seems to indicate something related to .opam\4.14.1\lib\ocaml\nums.cmxs, but it's hard to understand the log.

And since dune-site is the one actually responsible for defining the actual filename (after lookup and possibly some conversions, including concatenation of the directory name), when I try to debug it from Frama-C, I get no useful feedback (only that the plug-in that failed to load is WP, which I already knew).

Ideally, some environment variable, or some other setting (e.g. an optional argument) should allow outputting the specific information about which file(s) and directory(ies) are being loaded, so that if something fails later, the user can more easily understand what happened.

Example

Assuming e.g. a DUNE_SITE_VERBOSE=1 environment variable, when Dune_site_plugins.V1.load is called, it would print (to stderr or somewhere else) phrases such as dune-site: Loading '<file>', so that any subsequent errors would be easier to debug.

If there is already something like that, please let me know so I can try it.

Alizter commented 1 year ago

You can try --always-show-command-line which will show a shell command approximation of the action that raised the error message. You can then inspect the path to see where the error might be coming from.

maroneze commented 1 year ago

Sorry, I'm a bit confused about where to add such option. Dune-site is used in Frama-C to dynamically load files during execution, so the command-line is not Dune's, but Frama-C's; it's the OCaml code that calls dune-site during execution. Even if I try dune exec --always-show-command-line -- frama-c, it doesn't seem to do anything.

For me, either an API change (adding optional verbose or debug arguments to the load function) or some environment variables would be required to make it work.

Alizter commented 1 year ago

I guess the next thing I would try is to try and print any backtrace that is attached to the dynamic linking code. I'm guessing at some point in Frama-C you are calling the dune site libraries? You can try and wrap those calls in a try-catch. We have an example of this kind of debugging in the tests for dune-site actually:

Passing --always-show-command-line to dune exec most likely isn't helpful since the failure appears during execution rather than during build time.

maroneze commented 1 year ago

To be honest, the underlying problem seems to lie in OCaml's, Dynlink module, since the stack trace points to it (Frama-C already prints a stack trace in the case of such errors):

  Raised at Dynlink_common.Make.load in file "otherlibs/dynlink/dynlink_common.ml", line 330, characters 23-73
  Called from Stdlib__List.iter in file "list.ml", line 110, characters 12-15
  Called from Stdlib__List.iter in file "list.ml", line 110, characters 12-15
  Called from Dune_site_plugins__Plugins.load_gen in file "otherlibs/dune-site/src/plugins/plugins.ml", line 255, characters 4-36
  Called from Stdlib__List.iter in file "list.ml", line 110, characters 12-15
  Called from Dune_site_plugins__Plugins.load_gen in file "otherlibs/dune-site/src/plugins/plugins.ml", line 255, characters 4-36
  Called from Stdlib__List.iter in file "list.ml", line 110, characters 12-15
  Called from Dune_site_plugins__Plugins.load_gen in file "otherlibs/dune-site/src/plugins/plugins.ml", line 255, characters 4-36
  Called from Stdlib__List.iter in file "list.ml", line 110, characters 12-15
  Called from Stdlib__List.iter in file "list.ml", line 110, characters 12-15
  Called from Frama_c_kernel__Kernel.bootstrap_loader in file "src/kernel_services/plugin_entry_points/kernel.ml", line 933, characters 35-62

The exception is supposed to contain a message which would ultimately display the filename, but for some reason I still cannot understand, some error printer seems to have erased it with a useless message repeating the exception (as if the error printer itself failed to get the filename?):

Dynlink.Error (Dynlink.Cannot_open_dll "Dynlink.Error (Dynlink.Cannot_open_dll \"Failure(\\\"Permission denied\\\")\")")

So the actual useful content is not there.

Back in the days before Frama-C used Dune, I could relatively easily modify Frama-C's code to output the information I needed, but now that dune-site sits between Frama-C and Dynlink, it's harder to recompile it, reinstall it, then recompile Frama-C, and see what's happening. Hence my feature request concerning a possible "debugging mode" for dune-site, which would possibly help in this case and in others, in the future.

What's really annoying is that, since it only happens on Cygwin, I cannot reproduce it anywhere else. And Cygwin is painfully slow to recompile and test things...

Alizter commented 1 year ago

Some more questions:

I'm open to adding some debugging flags that would help users of the sitelib API debug any issues with Findlib.

Alizter commented 1 year ago

also cc @bobot

maroneze commented 1 year ago

I'm using the public master branch of Frama-C.

I simply installed Cygwin, then the minimal packages (as indicated in opam-wix's README, ocaml, opam, gcc-g++, but also ocaml-ocamldoc (required by zarith, one of Frama-C's dependencies), autoconf, libgmp-devel, zlib-devel), then installed most of Frama-C's dependencies (opam install alt-ergo-free.2.2.0 dune.3.5.0 dune-configurator.3.5.0 dune-site.3.5.0 menhir.20211128 ocamlfind.1.9.6 ocamlgraph.2.0.0 ppx_deriving_yaml.0.2.1 ppx_deriving_yojson.3.7.0 ppx_import.1.10.0 unionFind.20220122 why3.1.6.0 yojson.2.0.2 zarith.1.12), then built Frama-C (make), and then ran it (dune exec -- frama-c, or I could also have installed it first with make install and then just run frama-c).

Just for comparison, I did a chmod u-x of my Linux' ocaml/nums.cmxs, and the error message did contain the file name:

Dynlink.Error (Dynlink.Cannot_open_dll 
  "Dynlink.Error (Dynlink.Cannot_open_dll 
    \"Failure(\\\"/home/user/.opam/4.13.1/lib/ocaml/nums.cmxs: 
      cannot open shared object file: Permission denied\\\")\")")

So it's really a Cygwin- (or Windows-) specific issue.

But I still think that, in some scenarios, tracking the loading of such files with debugging features would be useful, especially the indirect dependencies (e.g. loading the WP plug-in causes why3 to be loaded, which causes num to be loaded, etc).

Alizter commented 1 year ago

OK, so the takeaways are:

Alizter commented 1 year ago

By the way, this might be of interest:

Alizter commented 1 year ago

I am able to reproduce the permission denied error on linux, and it appears as though the error message is not being scrubbed. I would therefore think that cygwin might be doing something. It could also be that the C implementation of dynlink does a stat in order to get the file name which might be failing and it isn't actually able to report the file name.

We could still add the debugging information, but I fail to see how it would be useful to anybody but cygwin users who cannot use dynlink anyway.

maroneze commented 1 year ago

I agree with your takeaways (except for the fact that I'm on the Frama-C team and I'm trying to improve support for Cygwin).

Well, after reinstalling Cygwin on a new computer, and retrying strace, this time it shows that the actual issue is with why3 (its .cmxs file is not executable, at least on Cygwin).

This solves the first issue, but not the missing filename.

If you think the debugging information is really not going to serve anyone else, I'll close this issue, and if in the future a similar situation arises, I'll open it again.

Alizter commented 1 year ago

Let's leave this issue open for the time being, I'm currently having a look at adding debug info.

maroneze commented 1 year ago

Ok thanks, feel free to close it whichever way you decide.

Alizter commented 1 year ago

I've added a DUNE_SITE_DEBUG flag which will report which files it is trying to load. https://github.com/ocaml/dune/pull/8236

maroneze commented 1 year ago

This is really cool (at least for me)!

I went back to my first Cygwin installation, and tried applying the why3 fix as I mentioned above, but it didn't work: I still had the Permission denied error.

So I got your branch, recompiled Dune and all of its packages (took me more than two hours, my test machine is not the fastest), and then re-ran Frama-C with DUNE_SITE_DEBUG=1.

Instantly, it showed me that the last file it was trying to load was nums.cmxs.

I have no idea why, but in my previous Cygwin installation, that file was also installed without execution permission. So my previous strace attempt was actually correct. But when I fixed it the first time (I believe I did try chmod +x on it), it failed again (this time, due to the why3.cmxs issue, but I didn't know). Because I had disabled strace by then (otherwise its output was too verbose and I couldn't get a clear reading), I assumed it was my fix that was incorrect, and went off on a tangent.

With DUNE_SITE_DEBUG, the output being much shorter, it was immediate which file was causing the problem, so I just chmod'ed it as well, and voilà, Frama-C now works on Cygwin.

Besides this debugging, there is also another situation (which can arrive on Linux as well) where this can be useful: when reinstalling Frama-C over an existing installation, leftovers (e.g. extra plugins) can cause dynlink errors. I believe DUNE_SITE_DEBUG could also help quickly understand if that's the issue, or something else. So I believe other Frama-C users (and developers) might benefit from this in the future.

Alizter commented 1 year ago

That's wonderful to hear! Hopefully this will be merged before we release 3.10.