ocaml / dune

A composable build system for OCaml.
https://dune.build/
MIT License

bytecode + c stubs not working as expected #108

Open rgrinberg opened 7 years ago

rgrinberg commented 7 years ago

See this repo:

https://github.com/rgrinberg/jbuilder-c-stubs

[rgrinberg:~/tmp/c-ocaml] 1 % jbuilder build hello_world.exe
[rgrinberg:~/tmp/c-ocaml] % _build/default/hello_world.exe
Hello
[rgrinberg:~/tmp/c-ocaml] % _build/default/hello_world.bc
Fatal error: cannot load shared library dllc_lib_stubs
Reason: dllc_lib_stubs.so: cannot open shared object file: No such file or directory

Am I doing something wrong here?

ghost commented 7 years ago

ocamlrun looks up shared libraries for C stubs using CAML_LD_LIBRARY_PATH, so you'd need to set it. jbuilder exec ... does that for you.

rgrinberg commented 7 years ago

Hm, but CAML_LD_LIBRARY_PATH seems correct to me:

[rgrinberg:~/tmp/c-ocaml] % echo $CAML_LD_LIBRARY_PATH
/home/rgrinberg/.opam/4.04.1/lib/stublibs:/home/rgrinberg/.opam/4.04.1/lib/ocaml/stublibs:/home/rgrinberg/.opam/4.04.1/lib/ocaml

opam switch is supposed to set it, right?

But regardless, my executable isn't public, can I still somehow run it? I can make it public, but it seems annoying to have to make executables just to run them.

ghost commented 7 years ago

Well, it only points to things that are installed. In your example the dll for the stubs is in _build/...

As long as the library with the stubs is public you'll be able to run it with jbuilder exec .... It's a limitation of shared libraries really, jbuilder can't do much about it. If you want to call the executable inside a rule, you should just use the .exe.
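As a stopgap for a private library, the variable can be pointed at the build tree by hand. This is a minimal sketch, assuming the stub DLL ends up directly under _build/default (the actual directory depends on where the library is defined). prepend_stub_dir is a hypothetical helper, not part of jbuilder.

```shell
# Print CAML_LD_LIBRARY_PATH with one extra directory in front,
# keeping any entries that are already set.
prepend_stub_dir() {
  dir=$1
  if [ -n "${CAML_LD_LIBRARY_PATH:-}" ]; then
    printf '%s:%s\n' "$dir" "$CAML_LD_LIBRARY_PATH"
  else
    printf '%s\n' "$dir"
  fi
}

# Possible usage (assumes the DLL sits in _build/default):
#   CAML_LD_LIBRARY_PATH=$(prepend_stub_dir "$PWD/_build/default") _build/default/hello_world.bc
```

The variable is only changed for that one invocation, so the value set by opam stays untouched for everything else.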

rgrinberg commented 7 years ago

Hmm, I'm still unable to run the bytecode binary this way. Even after setting CAML_LD_LIBRARY_PATH to what jbuilder exec would. See:

[rgrinberg:~/tmp/c-ocaml] master(+2/-1)* 2 ± jbuilder exec -- env sh -c 'echo $CAML_LD_LIBRARY_PATH'
/home/rgrinberg/tmp/c-ocaml/_build/install/default/lib/stublibs:/home/rgrinberg/.opam/4.04.1/lib/stublibs:/home/rgrinberg/.opam/4.04.1/lib/ocaml/stublibs:/home/rgrinberg/.opam/4.04.1/lib/ocaml
[rgrinberg:~/tmp/c-ocaml] master(+2/-1)* ± jbuilder exec -- env _build/default/hello_world.bc
Fatal error: cannot load shared library dllc_lib_stubs
Reason: dllc_lib_stubs.so: cannot open shared object file: No such file or directory

Seems like CAML_LD_LIBRARY_PATH is set correctly here but the bytecode isn't running.

ghost commented 7 years ago

Does the dllc_lib_stubs.so exist under _build/.../stublibs? Note that if the library is not public, it won't be present in _build/install/...
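One rough way to check this from the shell is to walk the entries of CAML_LD_LIBRARY_PATH and look for the file. find_stub below is a hypothetical helper. It only inspects the variable, whereas ocamlrun also reads ld.conf, so treat it as an approximation of the real lookup.

```shell
# Print the first CAML_LD_LIBRARY_PATH entry that contains the given stub DLL.
# Returns non-zero if no entry contains it.
find_stub() {
  name=$1            # e.g. dllc_lib_stubs.so
  old_ifs=$IFS
  IFS=:              # split the variable on colons
  for dir in ${CAML_LD_LIBRARY_PATH:-}; do
    if [ -e "$dir/$name" ]; then
      IFS=$old_ifs
      printf '%s\n' "$dir"
      return 0
    fi
  done
  IFS=$old_ifs
  return 1
}

# Possible usage, inside the environment that jbuilder exec sets up:
#   find_stub dllc_lib_stubs.so || echo "stub DLL not on CAML_LD_LIBRARY_PATH"
```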

rgrinberg commented 7 years ago

It didn't exist, thanks for the hint. My library was actually public, but I still ran into that error. Looks like I still needed to build the native target to run the bytecode executable. Not sure if that's expected.

[rgrinberg:~/tmp/c-ocaml] master(+3/-1)* 2 ± rm -rf _build
[rgrinberg:~/tmp/c-ocaml] master(+2/-1)* ± jbuilder build hello_world.bc
    ocamldep hello_world.depends.ocamldep-output
    ocamldep c_lib.depends.ocamldep-output
      ocamlc c_lib.{cmi,cmo,cmt}
      ocamlc c_lib__Hello_world.{cmi,cmo,cmt}
      ocamlc c_lib.cma
      ocamlc hello_world.{cmi,cmo,cmt}
      ocamlc hw.o
  ocamlmklib dllc_lib_stubs.so,libc_lib_stubs.a
      ocamlc hello_world.bc
[rgrinberg:~/tmp/c-ocaml] master(+2/-1)* ± jbuilder exec -- env _build/default/hello_world.bc
Fatal error: cannot load shared library dllc_lib_stubs
Reason: dllc_lib_stubs.so: cannot open shared object file: No such file or directory
[rgrinberg:~/tmp/c-ocaml] master(+2/-1)* 2 ± jbuilder build
    ocamlopt c_lib.{cmx,o}
    ocamlopt c_lib__Hello_world.{cmx,o}
    ocamlopt c_lib.{a,cmxa}
    ocamlopt c_lib.cmxs
[rgrinberg:~/tmp/c-ocaml] master(+2/-1)* ± jbuilder exec -- env _build/default/hello_world.bc
Hello

ghost commented 7 years ago

Hmm, I see. Technically, to build the byte-code executable you don't need the stubs. You need them to run it, though. We could make the .bc depend on all the stubs. However, you might then want to use it in a custom rule, and that's where it starts to become complicated, as jbuilder must set CAML_LD_LIBRARY_PATH properly.

I wonder if something needs to be done. I would rather encourage people to just use the native version, especially since it is available even when there is no ocamlopt. Maybe a future improvement.

rgrinberg commented 7 years ago

OK, then I think leaving a note about the current situation is enough. This isn't an issue in practice anyway.

ghost commented 7 years ago

I added a note about byte-code executables in the manual

didier-wenzek commented 7 years ago

I experienced a related issue.

I fail to load from utop a library defined with C stubs. Utop says "Cannot load required shared library. Cannot open shared object file: No such file or directory. ".

Indeed, the shared library is installed by jbuilder/opam in a directory named stubslibs instead of stublibs.

$ jbuilder install

# opam-version    1.2.2
# os              linux
...
_build/install/default/lib/stublibs/dllkyoto_stubs.so => /home/didier/.opam2/4.04.2+flambda/lib/stubslibs/dllkyoto_stubs.so
...

trefis commented 7 years ago

I ran into the same issue a few days ago. I believe this might be related to https://github.com/ocaml/opam/commit/718d6198e338069852dcdde067f56df236235370 .

Which version of opam/opam-installer are you using?

(ping @altgr)

dra27 commented 7 years ago

As @trefis notes, it's entirely to do with a very long-standing bug in opam-installer. You can fix it locally by updating to opam2 beta 4, but I think this makes a very strong case for having jbuilder install process the .install itself.

Note that installs via opam should be unaffected - there is (intentional) code duplication between opam and opam-installer, so as long as your opam file doesn't incorrectly try to run jbuilder install then opam packages are fine.

didier-wenzek commented 7 years ago

Thanks to @trefis & @dra27.

I have the issue both with opam 1.2 and 2.0.0~beta3. I didn't try with beta 4.

I don't understand what you mean by "having jbuilder install process the .install itself." How can the opam file incorrectly run jbuilder install ?

dra27 commented 7 years ago

I'm suggesting that jbuilder install should not use opam-installer because we know there are buggy versions out there (and there will be for a while, because of LTS distros) but instead replicate the functionality of opam-installer.

An opam file should never include a jbuilder install command - .install files are Opam files (that's why jbuilder at the moment calls opam-installer) and Opam will automatically process it itself at the end of the build. This is the direction all build systems are supposed to head in - generate a .install file which Opam processes, rather than running commands.

didier-wenzek commented 7 years ago

Thanks for this explanation. It makes me notice that the projects ported to jbuilder removed the install stanza from their opam file.

Indeed, in my case, I left a line install: [make "install"] in the opam file, indirectly triggering a jbuilder install.

Khady commented 5 years ago

I'm facing this issue too.

The stubs are defined like this in a package pa:

(library
 (name stubs)
 (public_name stubs)
 (modules ())
 (c_names foo_stubs foo)
 (c_flags -g -O2 -Wextra -Wstrict-overflow=5))

In the same package there is another library la depending on these stubs. Both the stubs and la are defined in the same directory.

In a package pb I have a library lb depending on those stubs. And an executable eb that depends on lb. When I run dune exec eb.bc, the stubs are not in path and I get an error like this:

Fatal error: cannot load shared library dllstubs_stubs
Reason: dlopen(dllstubs_stubs.so, 138): image not found

I can run like this and it works

CAML_LD_LIBRARY_PATH=_build/default/pa:$CAML_LD_LIBRARY_PATH _build/default/pb/src/eb.bc

If pa.la is built using dune build -p pa, the stubs are "installed" and then dune exec eb.bc works. I think dune exec should be able to know about the dependency on the stubs and make them available in the path.

rgrinberg commented 5 years ago

@diml I think this issue has enough negative consequences that it should be addressed - loading stubs in utop is quite useful. Here's another odd situation where this bug can break things: a ppx that uses a C stub will not work on a bytecode only switch. Quite unlikely this will ever happen, but we'd like to make sure bytecode works in all circumstances.

rgrinberg commented 5 years ago

Anyways, building the DLLs for bytecode executables is simple enough, but it's just part of the story. We also need to add them to CAML_LD_LIBRARY_PATH somehow. Would an action that appends to a PATH-like variable make sense? This seems similar enough to the case of (bin ..) dependencies where we'd like the binary to be in PATH. Here we'd like the shared object to be in CAML_LD_LIBRARY_PATH.

ghost commented 5 years ago

Note that when native compilation is not available or when the user sets the modes to just byte, we still setup a rule to build prog.exe as a custom bytecode program: i.e. ocamlrun+all stubs linked statically+bytecode. So it is always possible to execute a bytecode program.

To support running non-custom bytecode programs, we need to add a notion of runtime dependencies. Essentially, whenever we execute prog.bc we must remember to implicitly add all the .so as dependencies. Basically this would be the same as writing this everywhere:

(rule
 (deps <deps> ... .../dllstub1.so .../dllstub2.so)
 (action (run .../prog.bc ...)))

Regarding CAML_LD_LIBRARY_PATH, we already extend it with _build/install/default/lib/stublibs. This is enough for public libraries but not for private ones. We could either extend it with the various paths of the private libraries, or extend it with a single load path per project where we would symlink all the private .so files.
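Until something like this is built in, the second option can be approximated by hand. This is a sketch, assuming an absolute build directory and a hypothetical collection directory outside of it. collect_stubs is an illustrative name, not a dune feature.

```shell
# Symlink every dll*.so found under a build tree into one directory, so that
# a single CAML_LD_LIBRARY_PATH entry covers all private stubs.
collect_stubs() {
  build_dir=$1   # absolute path, e.g. "$PWD/_build/default"
  dest=$2        # hypothetical collection directory, outside build_dir
  mkdir -p "$dest"
  # -type f skips symlinks; file names containing newlines are not handled.
  find "$build_dir" -name 'dll*.so' -type f | while IFS= read -r so; do
    ln -sf "$so" "$dest/$(basename "$so")"
  done
}

# Possible usage:
#   collect_stubs "$PWD/_build/default" "$PWD/_stubs"
#   CAML_LD_LIBRARY_PATH="$PWD/_stubs:$CAML_LD_LIBRARY_PATH" _build/default/pb/src/eb.bc
```

Two stub DLLs with the same file name would overwrite each other in the collection directory, which is one reason a per-library path may be the safer of the two options.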

yakobowski commented 5 years ago

I'm having the same problem(s) trying to drop Infer's dependency on Core. I am adding a small stubs.c file, but then using the bytecode executables (infer.bc + a custom toplevel) becomes really difficult. In particular, as @Khady mentioned above, I need to run dune build -p <myLib> in order for the symlink to dll<MyLib>_stubs.so to be created. Otherwise, dune exec infer.bc does not work at all, because dll<MyLib>_stubs.so does not get "installed" into _build/install/<profile>/lib/stublibs/.

Is this the expected behavior? If so, is there a less "aggressive" target that I could run to force the creation of the symlink? dune build -p <myLib> seems to have an adverse effect on other parts of the build process.

ghost commented 5 years ago

-p is only for release builds, you shouldn't use it during development. If you want to build all the files of a package (including creating the symlinks), you can do: dune build <pkg>.install.

ejgallego commented 5 years ago

We have the same problem in Coq; there is a bytecode version of coqtop that links to the OCaml toplevel for debug purposes. Doing dune exec coqtop.byte doesn't work.

We will be able to work around that with a (fragile) custom build rule, but it'd be great if dune exec would properly install the stubs or set the path right.

yakobowski commented 5 years ago

Can you link to the commit that implements this custom rule? I would like to compare it with the "brittleness" of my hack for Infer.

ejgallego commented 5 years ago

@yakobowski this is what we use:

 (deps
  %{bin:coqtop.byte}
  %{lib:coq.kernel:../../stublibs/dllbyterun_stubs.so}
  %{project_root}/theories/Init/Prelude.vo)

Can you point me to the Infer hack? Thanks!

yakobowski commented 5 years ago

It is in https://github.com/yakobowski/infer/commit/5fb35485b02418e682f93c2555f91593def5f3c7, referenced above. But it is a Makefile-based solution, I would much prefer a Dune-based one.

ejgallego commented 5 years ago

I guess you can do as we do and have an alias that pulls in the stubs and the binary; but indeed it is a gross hack, as you can see.

https://github.com/coq/coq/pull/9748/files

vsiles commented 5 years ago

Hi! I just had the same issue (I baked a toy example in https://github.com/vsiles/dune-debug-test). The two workarounds I got were:

thautwarm commented 4 years ago

@vsiles Thank you! I searched everywhere for a thousand years and you saved my life!

ghost commented 4 years ago

Just a tweak to @vsiles solution: rather than passing -custom explicitly you should use (modes byte_complete). For these two reasons:

vsiles commented 4 years ago

Thanks for this info. Do you know in which version -custom has been deprecated? (We are lagging quite a bit, so I'll have to update at some point :D)

ghost commented 4 years ago

The PR to deprecate it is not merged yet: https://github.com/ocaml/ocaml/pull/9236

The new option was introduced in 4.10.

frejsoya commented 1 year ago

I hit this issue using ctypes + mdx. The mdx stanza creates mdx-gen.ml, but mdx-gen.bc fails to run. It dlopens the stub fine, but the dynamically linked library itself is not found (missing symbol).

Mac OSX

ELLIOTTCABLE commented 9 months ago

We're still running into this (see @Khady's post above; we have a pretty pervasive hack involving CAML_LD_LIBRARY_PATH-setting in a bunch of our internal binaries ...)

Unfortunately, that doesn't seem to be helpful in the case mentioned by @frejsoya — I can't feed CAML_LD_LIBRARY_PATH to the mdx_gen.bc invocation:

$ export CAML_LD_LIBRARY_PATH="$scriptdir/../../_build/default/backend/ahrefskit:$CAML_LD_LIBRARY_PATH"

$ dune runtest --verbose
... SNIP ...

Running[1]: (cd _build/default/backend/COMPONENT/client && ./mdx_gen.bc using_ppx.mld) > _build/default/backend/COMPONENT/client/.mdx/using_ppx.mld.corrected
File "backend/COMPONENT/client/dune", line 36, characters 0-67:
36 | (mdx
37 |  (files using_ppx.mld)
38 |  (libraries COMPONENT.ppx_COMPONENT))

Command [1] got signal ABRT:
 $ (cd _build/default/backend/COMPONENT/client && ./mdx_gen.bc using_ppx.mld) > _build/default/backend/COMPONENT/client/.mdx/using_ppx.mld.corrected
Fatal error: cannot load shared library dllahrefskit_stubs_stubs
Reason: dllahrefskit_stubs_stubs.so: cannot open shared object file: No such file or directory

Unfortunately, the thing-under-test is a ppx; which doesn't accept (modes byte_complete); and (ocamlc_flags (-custom)) is no more helpful:

$ dune runtest
... SNIP ...
File "backend/COMPONENT/client/dune", line 36, characters 0-67:
36 | (mdx
37 |  (files using_ppx.mld)
38 |  (libraries COMPONENT.ppx_COMPONENT))
/usr/bin/ld: cannot find -lmysql_stubs
/usr/bin/ld: cannot find -lmurmur3_stubs
/usr/bin/ld: cannot find -lmariadb_stubs
/usr/bin/ld: cannot find -lctypes_stubs
... SNIP ...
/usr/bin/ld: cannot find -lcurl_stubs
collect2: error: ld returned 1 exit status
File "_none_", line 1:
Error: Error while building custom runtime system

Is there any workaround to this that will function in a ppx-rewriter binary, when using (mdx …)? /=

frejsoya commented 9 months ago

@ELLIOTTCABLE I made this PR some time ago, which might be related: https://github.com/ocaml/dune/pull/8784. I should have followed up on it :)

For non-installed shared libraries, at build time and while avoiding global env flags, it is possible to use paths relative to the binary with rpath (runtime search paths). For reference, this is how CMake handles it: https://gitlab.kitware.com/cmake/community/-/wikis/doc/cmake/RPATH-handling