ocaml / dune

A composable build system for OCaml.
https://dune.build/
MIT License
1.6k stars 395 forks source link

does dune enforce complete cmxs files? #3908

Open nilsbecker opened 3 years ago

nilsbecker commented 3 years ago

This is a follow-up to #3889 .

After some iterations it turned out that the file eigen_cpp_stub.cmxs in the Eigen library does not contain the necessary C stubs. This may be due to automatic dune build rules being insufficient for generating complete .cmxs files. if so, that could be considered a bug in dune.

Expected Behavior

Eigen can be loaded in the native ocaml toplevel via Omod and can be dynamically linked to native code via Dynlink.

Actual Behavior

Loading via Omod fails, as reported in https://github.com/ocaml/dune/issues/3889#issuecomment-716199900

Reproduction

  1. install the native ocaml toplevel.
  2. opam install eigen omod
  3. in ocamlnat, #use "omod.nattop";; Omod.load "Eigen";;

alternatively, the same issue should also appear when attempting to link eigen in native code via Dynlink. that may be easier to reproduce since it does not require installing ocamlnat. however, i have not tried.

Specifications

see also these issue comments:

https://github.com/owlbarn/eigen/issues/27#issuecomment-720554965 https://github.com/dbuenzli/omod/issues/12#issuecomment-720537270

nojb commented 3 years ago

When you have a shared library (under Unix, a .cmxs is essentially a shared library), it is typically assumed that its dependent libraries will be linked into the main program that will be loading the shared library (this is how gcc works). If one where to link the dependent libraries into the shared object, dynamic linking of a second, similar, shared object would fail due to the duplicated symbols.

Moreover, linking a static library into a shared object is somewhat tricky and system- and platform-dependent, so it is not a very common operation.

Having said that, you should be able to dynlink eigen_cpp_stubs.cmxs if you are able to load dlleigen_cpp_stubs_stubs.so beforehand (but am not very familiar with ctypes, omod, etc...).

dbuenzli commented 3 years ago

This is not a problem that requires ctypes or omod familiarity.

Simply said trying to use that cmxs file with the native Dynlink API fails. Here's a repro for trying to load eigen and its deps (ctypes and eigen.cpp):

> opam install eigen
> cat load.ml
let () =
  let root = Sys.argv.(1) in
  let load cmxs = Dynlink.loadfile (root ^ "/" ^ cmxs) in
  Dynlink.allow_unsafe_modules true;
  try
    load "integers/integers.cmxs";
    load "ctypes/ctypes.cmxs";
    load "eigen/cpp/eigen_cpp_stubs.cmxs";
    load "eigen/eigen.cmxs";
    print_endline "Okay!";
    exit 0
  with
  | Dynlink.Error e -> print_endline (Dynlink.error_message e); exit 2
> ocamlopt -linkall dynlink.cmxa load.ml
> ./a.out $(opam var lib)
error loading shared library: Dynlink.Error (Dynlink.Cannot_open_dll "Failure(\"dlopen(/Users/dbuenzli/.opam/4.09.0/lib/eigen/eigen.cmxs, 10): Symbol not found: _c_eigen_dsmat_c_cols\\n  Referenced from: /Users/dbuenzli/.opam/4.09.0/lib/eigen/eigen.cmxs\\n  Expected in: flat namespace\\n in /Users/dbuenzli/.opam/4.09.0/lib/eigen/eigen.cmxs\")")

The missing symbol can be found in $(opam var lib)/eigen/cpp/libegen_cpp_stubs_stubs.a or in $(opam var stublibs)/dlleigen_cpp_stubs_stubs.so.

So you need to either:

  1. Register the .so stubs with the C dynlinker for that eigen_cpp_stubs.cmxs so that they get loaded by the C dynlinker before the cmxs file does on Dynlink.loadfile
  2. Put the stubs directly in the eigen_cpp_stubs.cmxs file itself by compiling with:
    ocamlopt -shared -o eigen_cpp_stubs.cmxs eigen_cpp_stubs.cmxa libeigen_cpp_stubs_stubs.a 

@nilsbecker confirmed 2. is not being done and on macOS here an otool -L on the cmxs indicates that 1. is also not being done.

> otool -L $(opam var lib)/eigen/cpp/eigen_cpp_stubs.cmxs
/Users/dbuenzli/.opam/4.09.0/lib/eigen/cpp/eigen_cpp_stubs.cmxs:
    eigen_cpp/eigen_cpp_stubs.cmxs (compatibility version 0.0.0, current version 0.0.0)
    /usr/lib/libc++.1.dylib (compatibility version 1.0.0, current version 902.1.0)
    /usr/lib/libSystem.B.dylib (compatibility version 1.0.0, current version 1281.100.1)

If one where to link the dependent libraries into the shared object, dynamic linking of a second, similar, shared object would fail due to the duplicated symbols.

Note that in the particular case of stubs, it's not really a library to be used by many different entities, usually the stubs are "owned" by a corresponding OCaml library they are not meant to be used separately so I think I'd rather put the stubs in the .cmxs.

That being said maybe @bobot who certainly has more experience about these things may know what the best course of action between 1. and 2. above is.

nojb commented 3 years ago

Hi Daniel,

So you need to either:

  1. Register the .so stubs with the C dynlinker for that eigen_cpp_stubs.cmxs so that they get loaded by the C dynlinker before the cmxs file does on Dynlink.loadfile
  2. Put the stubs directly in the eigen_cpp_stubs.cmxs file itself by compiling with:
ocamlopt -shared -o eigen_cpp_stubs.cmxs eigen_cpp_stubs.cmxa libeigen_cpp_stubs_stubs.a 

@nilsbecker confirmed 2.

I don't think 2 works: if you try that command-line, the static library will just be ignored. One issue is that there is no portable way to perform that operation (link a static library into a shared object).

Do you know how to do 1.?

dbuenzli commented 3 years ago

I don't think 2 works: if you try that command-line, the static library will just be ignored.

I think this should work (it doesn't) since that is the way the cmxs itself is produced (the ocaml library's .a + the plugin startup .o glue). If you look at the linker invocation with -verbose you can see it gets on the cli.

More interestingly however, looking at the following invocation. It is in fact already mentioned on the cli via -leigen_cpp_stubs_stubs by virtue of the cmxa C metadata. It just seems to do nothing about it:

ocamlopt -verbose -shared -o /tmp/bla.o eigen_cpp_stubs.cmxa
[...]
+ cc -shared -flat_namespace -undefined suppress  
-Wl,-no_compact_unwind -o '/tmp/bla.o'  '-L/Users/dbuenzli/.opam/4.09.0/lib/ocaml'  '/tmp/bla.o.startup.o' 
'/Users/dbuenzli/.opam/4.09.0/lib/eigen/cpp/eigen_cpp_stubs.a' 
'-leigen_cpp_stubs_stubs' '-lstdc++' 

I'm staring to wonder if this is not an macos issue. Could someone nm that $(opam var lib)/eigen/cpp/eigen_stubs_stubs.cmxs on linux to see if it has the symbols of libeigen_cpp_stubs_stubs.a.

Do you know how to do 1.?

Not sure but I think you should simply mention the libraries via -l on the cli when linking in -shared mode (e.g. in our example that's the way the final .cmxs gets it's dependency on the C++ runtime reported by otool).

But let's not do that, these things in $(opam var stublibs) were supposed to be for bytecode. For one thing there's a naming scheme that does not correspond to the libraries mentioned in the cmxa's C metadata (the dll prefix). I suspect we'd introduce more mess than solving problems by following that route.

dbuenzli commented 3 years ago

Could someone nm that $(opam var lib)/eigen/cpp/eigen_stubs_stubs.cmxs on linux to see if it has the symbols of libeigen_cpp_stubs_stubs.a.

I'd be really curious about that now.

Reading a bit man ld didn't bring much light about how macos' ld really works :-). However that option caught my eye:

  -all_load   Loads all members of static archive libraries.

Adding that option:

ocamlopt -linkall -shared -o /tmp/bla.o eigen_cpp_stubs.cmxa -cclib '-Wl,-all_load'

the symbols seems to be all in (by the simple virtue of the cmxa's C metadata which adds, -leigen_cpp_stubs_stubs on the cli, which gets resolved to libeigen_cpp_stubs_stubs.a by ld).

If this exactly what happens on linux by default then maybe we should work something upstream.

nilsbecker commented 3 years ago

ok, i was able to try this on a linux system,

uname -a
Linux odcf-worker01 3.10.0-693.21.1.el7.x86_64 #1 SMP Wed Mar 7 19:03:37 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux

i installed a fresh 4.11.1 switch, did opam install eigen and then

ocamlopt -linkall dynlink.cmxa load.ml
./a.out $(opam var lib)

gives

error loading shared library: Dynlink.Error (Dynlink.Cannot_open_dll "Failure(\"/home/beckern/.opam/411/lib/eigen/eigen.cmxs: undefined symbol: c_eigen_dsmat_s_data\")")

i can try with other linker options if someone tells me which ones! -all_load does not seem to exist on that system

dbuenzli commented 3 years ago

@nilsbecker I'm sorry I don't have the time to look into this myself on a linux box at the moment.

But I'd like to know the C linker invocation and what happens on linux with the resulting symbols in the cmxs (use nm(1)) on linux if you cd to the appropriate place and do:

ocamlopt -verbose -linkall -shared -o /tmp/nolib.cmxs eigen_cpp_stubs.cmxa
ocamlopt -verbose -linkall -shared -o /tmp/withlib.cmxs eigen_cpp_stubs.cmxa libeigen_cpp_stubs_stubs.a
nilsbecker commented 3 years ago

ok, i went into the lib install directory of the opam switch on the linux box.

-bash-4.2$ ocamlopt -verbose -linkall -shared -o /tmp/nolib.cmxs eigen_cpp_stubs.cmxa
+ as  -o '/tmp/nolib.cmxs.startup.o' '/tmp/camlstartup652959.s'
+ gcc -shared -o '/tmp/nolib.cmxs'  '-L/home/beckern/.opam/411/lib/ocaml'  '/tmp/nolib.cmxs.startup.o' 'eigen_cpp_stubs.a' '-leigen_cpp_stubs_stubs' '-lstdc++'
/usr/bin/ld: cannot find -leigen_cpp_stubs_stubs
collect2: error: ld returned 1 exit status
File "caml_startup", line 1:
Error: Error during linking (exit code 1)

somehow '_stubs' is auto-appended. i tried without the suffix '_stubs' but that cmxa file does not exist.

i also tried your second invocation. this produced the same error

nilsbecker commented 3 years ago

this is the complete 'lib' subdirectory:

-bash-4.2$ ls -R
.:
cpp                       eigen__Eigen_dsmat_z.cmi   eigen__Eigen_utils.cmt
dune-package              eigen__Eigen_dsmat_z.cmt   eigen__Eigen_utils.cmx
eigen.a                   eigen__Eigen_dsmat_z.cmx   eigen__Ffi_eigen_bindings.cmi
eigen.cma                 eigen__Eigen_spmat_c.cmi   eigen__Ffi_eigen_bindings.cmt
eigen__.cmi               eigen__Eigen_spmat_c.cmt   eigen__Ffi_eigen_bindings.cmx
eigen.cmi                 eigen__Eigen_spmat_c.cmx   eigen__Ffi_eigen_generated.cmi
eigen__.cmt               eigen__Eigen_spmat_d.cmi   eigen__Ffi_eigen_generated.cmt
eigen.cmt                 eigen__Eigen_spmat_d.cmt   eigen__Ffi_eigen_generated.cmx
eigen__.cmx               eigen__Eigen_spmat_d.cmx   eigen__.ml
eigen.cmx                 eigen__Eigen_spmat_s.cmi   eigen.ml
eigen.cmxa                eigen__Eigen_spmat_s.cmt   eigen_spmat_c.ml
eigen.cmxs                eigen__Eigen_spmat_s.cmx   eigen_spmat_d.ml
eigen_dsmat_c.ml          eigen__Eigen_spmat_z.cmi   eigen_spmat_s.ml
eigen_dsmat_d.ml          eigen__Eigen_spmat_z.cmt   eigen_spmat_z.ml
eigen_dsmat_s.ml          eigen__Eigen_spmat_z.cmx   eigen_tensor_d.ml
eigen_dsmat_z.ml          eigen__Eigen_tensor_d.cmi  eigen_tensor_s.ml
eigen__Eigen_dsmat_c.cmi  eigen__Eigen_tensor_d.cmt  eigen_types.ml
eigen__Eigen_dsmat_c.cmt  eigen__Eigen_tensor_d.cmx  eigen_utils.ml
eigen__Eigen_dsmat_c.cmx  eigen__Eigen_tensor_s.cmi  ffi_eigen_bindings.ml
eigen__Eigen_dsmat_d.cmi  eigen__Eigen_tensor_s.cmt  ffi_eigen_generated.ml
eigen__Eigen_dsmat_d.cmt  eigen__Eigen_tensor_s.cmx  libeigen_stubs.a
eigen__Eigen_dsmat_d.cmx  eigen__Eigen_types.cmi     META
eigen__Eigen_dsmat_s.cmi  eigen__Eigen_types.cmt     opam
eigen__Eigen_dsmat_s.cmt  eigen__Eigen_types.cmx
eigen__Eigen_dsmat_s.cmx  eigen__Eigen_utils.cmi

./cpp:
eigen_cpp.ml         eigen_cpp_stubs.cmt   eigen_cpp_stubs__Eigen_cpp.cmi  libeigen_cpp_stubs_stubs.a
eigen_cpp_stubs.a    eigen_cpp_stubs.cmx   eigen_cpp_stubs__Eigen_cpp.cmt
eigen_cpp_stubs.cma  eigen_cpp_stubs.cmxa  eigen_cpp_stubs__Eigen_cpp.cmx
eigen_cpp_stubs.cmi  eigen_cpp_stubs.cmxs  eigen_cpp_stubs.ml
dbuenzli commented 3 years ago

i also tried your second invocation. this produced the same error

No -L to the library directory is provided to the C linker so the -leigen_cpp_stubs_stubs fails.

I likely gave you the wrong invocation you need to include the dir (cf. what dune does)

So try something like:

cd $(opam var lib)/eigen 
ocamlopt -verbose -shared -linkall -I eigen_cpp -o /tmp/nolib.cmxs eigen_cpp/eigen_cpp_stubs.cmxa
ocamlopt -verbose -shared -linkall -I eigen_cpp -o /tmp/withlib.cmxs eigen_cpp/eigen_cpp_stubs.cmxa eigen_cpp/libeigen_cpp_stubs_stubs.a

If that doesn't work force the library directory with -cclib -L$(opam var lib)/eigen/eigen_cpp/.

nilsbecker commented 3 years ago

ok that's better. i changed the subdirectory name: s/eigen_cpp/cpp/, and then ran

-bash-4.2$ ocamlopt -verbose -shared -linkall -I cpp -o /tmp/nolib.cmxs cpp/eigen_cpp_stubs.cmxa
+ as  -o '/tmp/nolib.cmxs.startup.o' '/tmp/camlstartup09fc9d.s'
+ gcc -shared -o '/tmp/nolib.cmxs'  '-Lcpp' '-L/home/beckern/.opam/411/lib/ocaml'  '/tmp/nolib.cmxs.startup.o' 'cpp/eigen_cpp_stubs.a' '-leigen_cpp_stubs_stubs' '-lstdc++'

next i ran

-bash-4.2$ ocamlopt -verbose -shared -linkall -I cpp -o /tmp/withlib.cmxs cpp/eigen_cpp_stubs.cmxa cpp/libeigen_cpp_stubs_stubs.a
+ as  -o '/tmp/withlib.cmxs.startup.o' '/tmp/camlstartupf839e6.s'
+ gcc -shared -o '/tmp/withlib.cmxs'  '-Lcpp' '-L/home/beckern/.opam/411/lib/ocaml'  '/tmp/withlib.cmxs.startup.o' 'cpp/eigen_cpp_stubs.a' '-leigen_cpp_stubs_stubs' '-lstdc++' 'cpp/libeigen_cpp_stubs_stubs.a'

i then ran nm on these two ouput files

-bash-4.2$ nm nolib.cmxs
00000000002011f8 B __bss_start
0000000000000c00 r caml_absf_mask
0000000000000c20 r caml_absf_mask
0000000000000c40 r caml_absf_mask
00000000002011b0 D camlEigen_cpp_stubs
0000000000000bc0 T camlEigen_cpp_stubs__code_begin
0000000000000bc6 T camlEigen_cpp_stubs__code_end
00000000002011a0 D camlEigen_cpp_stubs__data_begin
00000000002011b8 D camlEigen_cpp_stubs__data_end
00000000002011d8 D camlEigen_cpp_stubs__Eigen_cpp
0000000000000bd0 T camlEigen_cpp_stubs__Eigen_cpp__code_begin
0000000000000bd6 T camlEigen_cpp_stubs__Eigen_cpp__code_end
00000000002011c8 D camlEigen_cpp_stubs__Eigen_cpp__data_begin
00000000002011e8 D camlEigen_cpp_stubs__Eigen_cpp__data_end
0000000000000bd0 T camlEigen_cpp_stubs__Eigen_cpp__entry
00000000002011f0 D camlEigen_cpp_stubs__Eigen_cpp__frametable
00000000002011c8 D camlEigen_cpp_stubs__Eigen_cpp__gc_roots
0000000000000bc0 T camlEigen_cpp_stubs__entry
00000000002011c0 D camlEigen_cpp_stubs__frametable
00000000002011a0 D camlEigen_cpp_stubs__gc_roots
0000000000201170 D caml_globals
0000000000000bf0 r caml_negf_mask
0000000000000c10 r caml_negf_mask
0000000000000c30 r caml_negf_mask
0000000000201038 D caml_plugin_header
0000000000000bba T caml_shared_startup__code_begin
0000000000000bba T caml_shared_startup__code_end
0000000000201030 D caml_shared_startup__data_begin
0000000000201190 D caml_shared_startup__data_end
0000000000201198 D caml_shared_startup__frametable
00000000002011f8 b completed.6929
                 w __cxa_finalize@@GLIBC_2.2.5
0000000000000ae0 t deregister_tm_clones
0000000000000b70 t __do_global_dtors_aux
0000000000200e08 t __do_global_dtors_aux_fini_array_entry
0000000000201028 d __dso_handle
0000000000200e10 d _DYNAMIC
00000000002011f8 D _edata
0000000000201200 B _end
0000000000000bd8 T _fini
0000000000000bb0 t frame_dummy
0000000000200e00 t __frame_dummy_init_array_entry
0000000000000ce8 r __FRAME_END__
0000000000201000 d _GLOBAL_OFFSET_TABLE_
                 w __gmon_start__
0000000000000a88 T _init
                 w _ITM_deregisterTMCloneTable
                 w _ITM_registerTMCloneTable
0000000000000b20 t register_tm_clones
00000000002011f8 d __TMC_END__

and

-bash-4.2$ nm withlib.cmxs
00000000002011f8 B __bss_start
0000000000000c00 r caml_absf_mask
0000000000000c20 r caml_absf_mask
0000000000000c40 r caml_absf_mask
00000000002011b0 D camlEigen_cpp_stubs
0000000000000bc0 T camlEigen_cpp_stubs__code_begin
0000000000000bc6 T camlEigen_cpp_stubs__code_end
00000000002011a0 D camlEigen_cpp_stubs__data_begin
00000000002011b8 D camlEigen_cpp_stubs__data_end
00000000002011d8 D camlEigen_cpp_stubs__Eigen_cpp
0000000000000bd0 T camlEigen_cpp_stubs__Eigen_cpp__code_begin
0000000000000bd6 T camlEigen_cpp_stubs__Eigen_cpp__code_end
00000000002011c8 D camlEigen_cpp_stubs__Eigen_cpp__data_begin
00000000002011e8 D camlEigen_cpp_stubs__Eigen_cpp__data_end
0000000000000bd0 T camlEigen_cpp_stubs__Eigen_cpp__entry
00000000002011f0 D camlEigen_cpp_stubs__Eigen_cpp__frametable
00000000002011c8 D camlEigen_cpp_stubs__Eigen_cpp__gc_roots
0000000000000bc0 T camlEigen_cpp_stubs__entry
00000000002011c0 D camlEigen_cpp_stubs__frametable
00000000002011a0 D camlEigen_cpp_stubs__gc_roots
0000000000201170 D caml_globals
0000000000000bf0 r caml_negf_mask
0000000000000c10 r caml_negf_mask
0000000000000c30 r caml_negf_mask
0000000000201038 D caml_plugin_header
0000000000000bba T caml_shared_startup__code_begin
0000000000000bba T caml_shared_startup__code_end
0000000000201030 D caml_shared_startup__data_begin
0000000000201190 D caml_shared_startup__data_end
0000000000201198 D caml_shared_startup__frametable
00000000002011f8 b completed.6929
                 w __cxa_finalize@@GLIBC_2.2.5
0000000000000ae0 t deregister_tm_clones
0000000000000b70 t __do_global_dtors_aux
0000000000200e08 t __do_global_dtors_aux_fini_array_entry
0000000000201028 d __dso_handle
0000000000200e10 d _DYNAMIC
00000000002011f8 D _edata
0000000000201200 B _end
0000000000000bd8 T _fini
0000000000000bb0 t frame_dummy
0000000000200e00 t __frame_dummy_init_array_entry
0000000000000ce8 r __FRAME_END__
0000000000201000 d _GLOBAL_OFFSET_TABLE_
                 w __gmon_start__
0000000000000a88 T _init
                 w _ITM_deregisterTMCloneTable
                 w _ITM_registerTMCloneTable
0000000000000b20 t register_tm_clones
00000000002011f8 d __TMC_END__

is this what you were looking for? the outputs are actually the same!

dbuenzli commented 3 years ago

Well it seems the behaviour of the linker is consistent with macOS's one but I don't know how to convince it the same way to force the link (I thought maybe that was businness for -no-as-needed, but reading the docs it doesn't seem so). In any case the docs in the manual for ocamlc -shared is quite misleading, since it claims .a files you'll put on the cli will get there.

The problem may be (?) that nothing mentions the stubs as externals in what gets in the .cmxa, because for example unix.cmxs on my platform has the stubs and it is generated the same way, but these modules in unix.cmxa do mention the stubs directly. But given the dune namespacing business overlaid with the ctypes busines it's not evident to me what exactly gets into that .cmxa. Maybe a dune/ctypes expert can look into that.

nojb commented 3 years ago

The problem may be (?) that nothing mentions the stubs as externals in what gets in the .cmxa, because for example unix.cmxs on my platform has the stubs and it is generated the same way, but these modules in unix.cmxa do mention the stubs directly. But given the dune namespacing business overlaid with the ctypes busines it's not evident to me what exactly gets into that .cmxa. Maybe a dune/ctypes expert can look into that.

This is exactly right, thanks for bringing it up -- I had forgotten about it. Indeed if the externals are not mentioned they will typically not be linked in.

For the case at hand, the solution is easy. As mentioned by Daniel, the problem is that you are creating an empty OCaml library (eigen.cpp), just to carry the C bindings. As no binding are mentioned in the source, they end up not being included in the resulting .cmxs. The solution is to create a pure C library instead and link that into the main OCaml library (eigen). Since this one does mention all the bindings, they will get linked in without any issue. I gave it a try (diff below) and it seemed to work. It would be great if you could confirm @nilsbecker.

diff --git a/eigen/dune b/eigen/dune
index 8d7a016..f8e1e56 100644
--- a/eigen/dune
+++ b/eigen/dune
@@ -31,13 +31,14 @@ let () = Printf.ksprintf Jbuild_plugin.V1.send {|
 (library
  (name eigen)
  (public_name eigen)
- (libraries ctypes eigen.cpp)
+ (libraries ctypes)
  (modules :standard ffi_eigen_generated)
+ (foreign_archives ../eigen_cpp/eigen_cpp_stubs)
  (foreign_stubs
   (language c)
   (names eigen_utils_stubs ffi_eigen_generated_stub)
-  (flags :standard %s -I../eigen_cpp/lib)
+  (include_dirs ../eigen_cpp/lib)
+  (flags :standard %s)
   )
- (c_library_flags :standard)
 )
 |} eigen_flags
diff --git a/eigen_cpp/dune b/eigen_cpp/dune
index 17cad9e..6ac18a1 100644
--- a/eigen_cpp/dune
+++ b/eigen_cpp/dune
@@ -16,26 +16,21 @@ let () = Printf.ksprintf Jbuild_plugin.V1.send {|

 (include_subdirs unqualified)

-(library
- (name eigen_cpp_stubs)
- (public_name eigen.cpp)
- (libraries ctypes)
- (foreign_stubs
-  (language cxx)
-  (names eigen_tensor eigen_dsmat eigen_spmat)
-  (include_dirs lib lib/unsupported)
-  (flags
-    :standard
-    -fPIC
-    -ansi
-    -pedantic
-    -O3
-    -std=c++11
-    %s
-    %s
-   )
+(foreign_library
+ (archive_name eigen_cpp_stubs)
+ (language cxx)
+ (names eigen_tensor eigen_dsmat eigen_spmat)
+ (include_dirs lib lib/unsupported)
+ (flags
+   :standard
+   -fPIC
+   -ansi
+   -pedantic
+   -O3
+   -std=c++11
+   %s
+   %s
   )
- (c_library_flags :standard -lstdc++)
  )

 |} devflags optflags
nilsbecker commented 3 years ago

thanks @nojb i was trying to apply your patch by copy and pasting it and then from a local clone of the eigen repo, doing git apply thepatch but that complained about a corrupt patch filed. i must be doing something wrong?

nojb commented 3 years ago

thanks @nojb i was trying to apply your patch by copy and pasting it and then from a local clone of the eigen repo, doing git apply thepatch but that complained about a corrupt patch filed. i must be doing something wrong?

Should work either way. Check out https://github.com/nojb/eigen

nilsbecker commented 3 years ago

ok, checking out @nojb's clone worked. i then removed the stubs line from @dbuenzli load.ml test program:

let () =
  let root = Sys.argv.(1) in
  let load cmxs = Dynlink.loadfile (root ^ "/" ^ cmxs) in
  Dynlink.allow_unsafe_modules true;
  try
    load "integers/integers.cmxs";
    load "ctypes/ctypes.cmxs";
    (*load "eigen/cpp/eigen_cpp_stubs.cmxs";*)
    load "eigen/eigen.cmxs";
    print_endline "Okay!";
    exit 0
  with
  | Dynlink.Error e -> print_endline (Dynlink.error_message e); exit 2

and ran

ocamlopt -linkall dynlink.cmxa load.ml
 ./a.out $(opam var lib)
Okay!

so this seems to be able to load eigen.cmxs. this is on a macos machine without the native toplevel installed, so i did not test that.

nilsbecker commented 3 years ago

i retried the same on the linux machine. cloned into nojb/eigen, opam pinned eigen to that directory, let opam recompile eigen. then

-bash-4.2$ ocamlopt -linkall dynlink.cmxa load.ml
-bash-4.2$ ./a.out $(opam var lib)
error loading shared library: Dynlink.Error (Dynlink.Cannot_open_dll "Failure(\"/home/beckern/.opam/411/lib/eigen/eigen.cmxs: undefined symbol: _ZTVN10__cxxabiv117__class_type_infoE\")")

with the same shortened load.ml file. ?!?!?

nilsbecker commented 3 years ago

i should add, the macos machine has ocaml 4.10, the linux machine 4.11.

nilsbecker commented 3 years ago

i installed a fresh 4.11 switch on the macos machine and tried the same. this also succeeds.

dbuenzli commented 3 years ago

On 10 November 2020 at 17:08:00, nilsbecker (notifications@github.com) wrote:

error loading shared library: Dynlink.Error (Dynlink.Cannot_open_dll "Failure(\"/home/beckern/.opam/411/lib/eigen/eigen.cmxs:
undefined symbol: _ZTVN10cxxabiv117class_type_infoE\")")

This seems to be a libstdc++ thing. Is maybe the c++ lib install structure different on linux and macos ? 

Try to ldd (on macos use otool -L) the cmxs to see if the libstdc++ library is linked also check the cmxa with ocamlobjinfo that it has the -lstdc++. 

Daniel

nilsbecker commented 3 years ago

i'll try. first another data point: on macos everything seems to be working. on another macos machine, this one with a switch that has ocamlnat, it was able to load the patched eigen with Omod. nice! (i guess this was expected)

nilsbecker commented 3 years ago

and even Omod.load "Owl" succeeds (!)

nojb commented 3 years ago

My patch may be responsible, try adding back the (c_library_flags :standard -lstdc++) field back into the library stanza, as in https://github.com/nojb/eigen/commit/8de3f768c43603184f6ec336d6b39475e87217b5

nilsbecker commented 3 years ago

on the linux machine:

-bash-4.2$ ldd eigen.cmxs
ldd: warning: you do not have execution permission for `./eigen.cmxs'
    linux-vdso.so.1 =>  (0x00007ffe58fad000)
    libgcc_s.so.1 => /software/gcc/7.2.0/lib64/libgcc_s.so.1 (0x00007f36ac4fd000)
    libc.so.6 => /lib64/libc.so.6 (0x00007f36ac11d000)
    /lib64/ld-linux-x86-64.so.2 (0x000055b7491ed000)

so no libstdc++ ?

nilsbecker commented 3 years ago

aha! the patched patch now works on linux. and:

-bash-4.2$ ldd eigen.cmxs
ldd: warning: you do not have execution permission for `./eigen.cmxs'
    linux-vdso.so.1 =>  (0x00007ffc9fde5000)
    libstdc++.so.6 => /software/gcc/7.2.0/lib64/libstdc++.so.6 (0x00007f2664e05000)
    libgcc_s.so.1 => /software/gcc/7.2.0/lib64/libgcc_s.so.1 (0x00007f2664bed000)
    libc.so.6 => /lib64/libc.so.6 (0x00007f266480d000)
    libm.so.6 => /lib64/libm.so.6 (0x00007f2664505000)
    /lib64/ld-linux-x86-64.so.2 (0x0000560a34948000)

so this last patch seems to do it.

nilsbecker commented 3 years ago

i'll report this back to the owl folks. but a more general question would be if problems of this kind could be somehow spotted directly when libraries are published? could there be some automatic testing that dynlink works? or some way to help library developers test for that? or can dune automatically do everything right (TM) ?

nilsbecker commented 3 years ago

incidentally, i don't know how to set up ocaml 4.11 to install ocamlnat. it seems the makefile organisation in the distribution has changed. does anybody here know?

dbuenzli commented 3 years ago

@nilsbecker congratulations, you are now an OCaml linker expert.

or can dune automatically do everything right (TM) ?

Honestly I think the OCaml compiler is not helping here.

I think that in the cmxa -> cmxs compiler path should instruct the C linker so that any C library mentioned in the .cmxa either introduces a dynamic dependency (if the library resolves to a .so) or fully statically links the library (if the library resolve to an .a). For plugins linking anything "as needed" is just absurd.

It seems on macOS we can do this, I'm unsure what can done for gcc and the menagerie of windows ports.

I'll let @nojb comment whether this is worth opening an issue upstream.

nojb commented 3 years ago

I'll let @nojb comment whether this is worth opening an issue upstream.

Sure! Having a summary of what the problem is would be useful, since it is something that comes up periodically.

dbuenzli commented 3 years ago

https://github.com/ocaml/ocaml/issues/10018

nojb commented 3 years ago

Thanks!

On that note, I will close this issue, re-open if needed.

mseri commented 3 years ago

Looks like the fix for the cmxs broke the bytecode builds, see https://github.com/owlbarn/eigen/issues/29. Is there a way to have both working?

EDIT: seems to work if I move -lstdc++ into flags instead of c_library_flags. But I am confused now, why also the link flags are going into flags? I thought there was a distinction between the preprocessor/compiler flags and the link flags

mseri commented 3 years ago

Can we reopen this? Moving the flags has broken other things. Looks like the current behaviour of dune with C++ foreign libraries is very fragile.

nojb commented 3 years ago

Can we reopen this? Moving the flags has broken other things. Looks like the current behaviour of dune with C++ foreign libraries is very fragile.

cc @snowleopard

snowleopard commented 3 years ago

Thanks for tagging me @nojb!

This is a long thread, which seems to have explored a couple of dead-ends. Since you've been involved in it from the start, could you summarise what was discovered? Is there a consensus about what needs to be done to make Dune's support fo C++ foreign stubs less fragile?

nojb commented 3 years ago

This is a long thread, which seems to have explored a couple of dead-ends. Since you've been involved in it from the start, could you summarise what was discovered? Is there a consensus about what needs to be done to make Dune's support fo C++ foreign stubs less fragile?

Sure! This is my understanding of the current situation. @nilsbecker / @mseri will correct me if I'm wrong.

The challenge is to build the library eigen of https://github.com/owlbarn/eigen which uses some C++ bindings in a way that can works on bytecode and native and it is possible load it using Dynlink (with the stubs statically linked into the .cmxs) on Linux and macOS and ideally Windows as well.

Currently, there is a partial solution that puts the C++ code in a foreign_library called eigen_cpp_stubs, and the main OCaml library eigen links with it using foreign_archive. However, there seems to be a problem passing of -lstdc++. I believe the following two possibilities were tried:

So a good first step would be to figure out if there is a way to make everything work at once.

mseri commented 3 years ago

In fact, it is a bit worse than that: adding c_library_flags :standard -lstdc++ with the current setup does not work for native compilation either (see the run here: https://github.com/owlbarn/eigen/runs/1417385969). I had to modify the CI to use a "two-level" dependency to find it out.

So the old mechanism of having a dummy eigen.cpp library which includes c_library_flags :standard -lstdc++ works for native but breaks bytecode and cmxs.

The fix for .cmxs/Dynlink breaks the native compilation.