anshuman23 / tensorflex

Tensorflow bindings for the Elixir programming language :muscle:
https://hexdocs.pm/tensorflex/Tensorflex.html
Apache License 2.0
308 stars 14 forks

Failed to load NIF library #30

Closed: whitered closed this issue 6 years ago

whitered commented 6 years ago

I'm trying to get Tensorflex working using the new installation method (through the Hex package). When I first try to use the library in my project, I get this error:

23:06:09.125 [warn]  The on_load function for module Elixir.Tensorflex.NIFs returned:
{:error, {:load_failed, 'Failed to load NIF library: \'dlopen(priv/Tensorflex.so, 2): image not found\''}}

It looks like Elixir expects to find the .so file in my project's priv/ directory, not in deps/tensorflex/priv/. OK, let's copy the file there: cp deps/tensorflex/priv/Tensorflex.so ./priv

Now the error looks like this:

23:15:09.919 [warn]  The on_load function for module Elixir.Tensorflex.NIFs returned:
{:error, {:load_failed, 'Failed to load NIF library: \'dlopen(priv/Tensorflex.so, 2): no suitable image found.  Did find:\n\tpriv/Tensorflex.so: unknown file type, first eight bytes: 0x7F 0x45 0x4C 0x46 0x02 0x01 0x01 0x00\n\t/Users/dmitryzhelnin/projects/pallium/pallium/priv/Tensorflex.so: unknown file type, first eight bytes: 0x7F 0x45 0x4C 0x46 0x02 0x01 0x01 0x00\''}}
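A note on the bytes in that message: 0x7F 0x45 0x4C 0x46 is the ELF magic number (0x7F followed by the ASCII letters E, L, F), whereas the macOS loader expects a Mach-O image. That suggests the copied Tensorflex.so was built for Linux, which would explain the "unknown file type" complaint. Below is a minimal shell sketch for checking the format of any binary. The function name binary_format is made up for illustration, and it recognizes only a few common magic numbers.

```shell
# Classify a binary by its first four bytes (its "magic number").
# binary_format is a hypothetical helper name, not part of any tool.
binary_format() {
  # Read 4 bytes, print them as hex, and strip all whitespace.
  magic=$(head -c 4 "$1" | od -An -tx1 | tr -d '[:space:]')
  case "$magic" in
    7f454c46) echo "ELF" ;;               # Linux and most other Unix systems
    cffaedfe|cefaedfe) echo "Mach-O" ;;   # macOS, 64-bit / 32-bit little-endian
    cafebabe) echo "Mach-O universal" ;;  # macOS fat binary (multiple architectures)
    *) echo "unknown" ;;
  esac
}
```

For example, binary_format priv/Tensorflex.so would print ELF for a file that starts with the bytes shown in the error above.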

Here is another approach I tried. If I rebuild the .so file with make:

rm -rf deps/tensorflex/priv/
rm -rf priv
make -C deps/tensorflex/
cp deps/tensorflex/priv/Tensorflex.so priv

then I get a different error:

2018-07-31 23:19:56.355019: I tensorflow/core/platform/cpu_feature_guard.cc:141] Your CPU supports instructions that this TensorFlow binary was not compiled to use: SSE4.2 AVX AVX2 FMA
Segmentation fault: 11

Also, if I just clone the tensorflex repository and try to run its tests, they fail with the same segmentation fault after a few tests have succeeded:

..........Segmentation fault: 11

I'm using macOS High Sierra 10.13.4. TensorFlow seems to be installed properly and works fine with the Extensor library, but we would very much like to switch to Tensorflex, so any help would be appreciated.

anshuman23 commented 6 years ago

@whitered I am sorry to hear that you are facing problems. I will try and get this to work for you.

Thank you for letting me know. I have reproduced the error, but I have not solved it yet. I will ping you here once that is done. Apologies for the inconvenience. However, for me the error doesn't appear on the first run after the dependencies are fetched, so Tensorflex runs fine at that point.

Also, copying the priv folder over to the current project's directory then fixes everything for me, unlike in your case.

So could you try one more thing? Can you remove the deps, re-run mix deps.get, and then, as soon as that is done, run iex -S mix and, before doing anything else, try running the Tensorflex functions? Maybe try reading a graph in and listing all of its operations? Let me know what output you get.

anshuman23 commented 6 years ago

@whitered What is the version of your Tensorflow C API?

whitered commented 6 years ago

@anshuman23, thank you for the fast response. Here is the output of reading a graph in iex:

rm -rf deps
rm -rf priv
mix deps.get
iex -S mix

iex(1)> Tensorflex.read_graph "./examples/add.pb"

00:48:54.668 [warn]  The on_load function for module Elixir.Tensorflex.NIFs returned:
{:error, {:load_failed, 'Failed to load NIF library: \'dlopen(priv/Tensorflex.so, 2): image not found\''}}

** (UndefinedFunctionError) function Tensorflex.NIFs.read_graph/1 is undefined (module Tensorflex.NIFs is not available)
    (tensorflex) Tensorflex.NIFs.read_graph("./examples/add.pb")
    (tensorflex) lib/tensorflex.ex:110: Tensorflex.read_graph/1

Another attempt with copying Tensorflex.so:

rm -rf deps
rm -rf priv
mix deps.get
mkdir priv
cp deps/tensorflex/priv/Tensorflex.so priv/
iex -S mix

iex(1)> Tensorflex.read_graph "./examples/add.pb"

00:53:29.517 [warn]  The on_load function for module Elixir.Tensorflex.NIFs returned:
{:error, {:load_failed, 'Failed to load NIF library: \'dlopen(priv/Tensorflex.so, 2): no suitable image found.  Did find:\n\tpriv/Tensorflex.so: unknown file type, first eight bytes: 0x7F 0x45 0x4C 0x46 0x02 0x01 0x01 0x00\n\tpriv/Tensorflex.so: stat() failed with errno=35\''}}

** (UndefinedFunctionError) function Tensorflex.NIFs.read_graph/1 is undefined (module Tensorflex.NIFs is not available)
    (tensorflex) Tensorflex.NIFs.read_graph("./examples/add.pb")
    (tensorflex) lib/tensorflex.ex:110: Tensorflex.read_graph/1

whitered commented 6 years ago

Sorry for accidentally closing the issue.

whitered commented 6 years ago

While trying to determine my TensorFlow C API version as described at https://www.tensorflow.org/install/install_c, I found that my installation is probably broken. Running gcc hello_tf.c completes without errors, but the output in a.out looks garbled:

H__PAGEZERO__TEXT__text__TEXT 4 __stubs__TEXTT
                                               __stub_helper__TEXT`$`__cstring__TEXT,__unwind_info__TEXTH__DATA__nl_symbol_ptr__DATA__la_symbol_ptr__DATAH__LINKEDIT  ("    @ 0x  H
                                                                             P
                                                                                /usr/lib/dyld{i.D4{e\$

*(
   0@rpath/libtensorflow.so
                           82/usr/lib/libSystem.B.dylib&p)x UHHEH=GHư1ɉEH]%%LAS%hhHello from TensorFlow C library version %s
 44U4
     pz"R@dyld_stub_binderQrr@_TF_Versionr@_printf__mh_execute_header!main% (0@ __mh_execute_header_main_TF_Version_printfdyld_stub_binder

I will try to solve this tomorrow.

anshuman23 commented 6 years ago

Yes, that is probably why you are getting the segfaults. However, the fix for the issue you were encountering has been made.

Can you please remove the dependencies once again and try now by adding the dependency to your mix.exs this way:

{:tensorflex, github: "anshuman23/tensorflex"}

whitered commented 6 years ago

Now everything works great! Thanks a lot!