TheFoundryVisionmongers / nuke-ML-server

A Nuke client plug-in which connects to a Python server to allow Machine Learning inference in Nuke.
Apache License 2.0

undefined symbol error when loading plugin (Nuke 11.2v3) #15

Open fcole opened 5 years ago

fcole commented 5 years ago

First of all, thanks for this plugin! I managed to build and install it (I think), but when I try to drop the node into Nuke I get the following error:

/usr/local/home/fcole/.nuke/MLClient.so: undefined symbol: _ZNK2DD5Image2Op15input_longlabelB5cxx11Ei

I checked the build paths and it does seem like it is built against the Nuke version I am running (11.2v3). I've had a couple other Nuke installs on this machine, though, so wondering if this error could be caused by finding a stale library somewhere.

ringdk commented 5 years ago

Hello! Glad you're enjoying it.

I think you're correct that it's linked against the wrong library. Running 'ldd' on the MLClient.so might give you a hint at the library it linked against during compilation. You could also try running 'make' in verbose mode (i.e. 'make VERBOSE=1') to print out the linker command to see what paths are being included. Hope that helps!

fcole commented 5 years ago

Hmm, after further inspection it seems to be loading the right library, but the library indeed does not seem to have the symbol it's looking for. If I do

nm -D /usr/local/Nuke11.2v3/libDDImage.so

I get output that looks like:

00000000001f88b0 T _ZN2DD5Image2Op14set_unlicensedEv
0000000000344d20 T _ZN2DD5Image2Op15add_draw_handleEPNS0_13ViewerContextE
00000000003455a0 T _ZN2DD5Image2Op15add_knob_handleEPNS0_4KnobEPNS0_13ViewerContextE
000000000024b630 T _ZN2DD5Image2Op15anyInputHandlesEPNS0_13ViewerContextE
000000000024d230 T _ZN2DD5Image2Op15disallowNoTreesEv
000000000024d240 T _ZN2DD5Image2Op15isTimingEnabledEv
000000000011e8d0 W _ZN2DD5Image2Op15pre_write_knobsEv
000000000024c7c0 T _ZN2DD5Image2Op15progressDismissEv
000000000024c570 T _ZN2DD5Image2Op15progressMessageEPKcz
0000000000a10770 D _ZN2DD5Image2Op15status_callbackE
000000000024a5b0 T _ZN2DD5Image2Op16add_input_handleEiPNS0_13ViewerContextE

So similar symbols are there, but

nm -D /usr/local/Nuke11.2v3/libDDImage.so | grep _ZNK2DD5Image2Op15input_longlabelB5cxx11Ei

returns nothing.

Can you confirm the plugin should work with Nuke 11.2v3?

fcole commented 5 years ago

Ok, I figured out the issue. I guess Nuke 11.2v3 was compiled with the old, non-cxx11 ABI. To fix this error, I had to recompile both protobuf and the client with '-D_GLIBCXX_USE_CXX11_ABI=0'.
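
For anyone hitting the same error: the `B5cxx11` fragment in the missing symbol is the mangled form of the `cxx11` ABI tag, which marks the plugin side as built with the new libstdc++ ABI. Below is a minimal shell sketch of that check. `has_cxx11_tag` is a hypothetical helper name, not part of any tool, and a symbol without the tag is not proof of the old ABI, since only functions involving types such as std::string carry it.

```shell
# has_cxx11_tag: hypothetical helper, succeeds if a mangled C++ symbol
# name contains the cxx11 ABI tag ("B5cxx11").
has_cxx11_tag() {
  case "$1" in
    *B5cxx11*) return 0 ;;
    *) return 1 ;;
  esac
}

# First symbol: the one from the error message (what the plugin asks for).
# Second symbol: one that libDDImage.so exports, taken from the nm output.
for sym in \
  _ZNK2DD5Image2Op15input_longlabelB5cxx11Ei \
  _ZN2DD5Image2Op14set_unlicensedEv
do
  if has_cxx11_tag "$sym"; then
    echo "$sym: carries the cxx11 ABI tag"
  else
    echo "$sym: no cxx11 ABI tag"
  fi
done
```

If the library really was built with the old ABI, running `nm -D /usr/local/Nuke11.2v3/libDDImage.so | grep input_longlabel` would be expected to list the function without the tag, which is a quick way to confirm the mismatch.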

ringdk commented 5 years ago

Ah interesting, thanks for persevering, that makes more sense now!

So it seems the plugin can't be built against Nuke without that macro in your setup, most likely because you aren't building on CentOS or RHEL 6/7? The compilers on those platforms use the old ABI by default, so we don't set the macro explicitly. This is a libstdc++ thing: a new ABI was introduced in GCC 5.x, but the VFX Platform explicitly asks us to keep the old ABI for backward compatibility, and that is the default on the supported Linux platforms. See https://vfxplatform.com/#footnote-gcc6 for more info.

UweMajer commented 4 years ago

Hi, I ran into the same issue on Ubuntu 18.04 LTS. Where exactly do I need to add that flag to fix the gcc problem?

Thank you, Uwe

ringdk commented 4 years ago

Hi @UweMajer ,

I haven't tested it, but you can add the definition in the 'nuke-ML-server/CMakeLists.txt':

add_definitions(-D_GLIBCXX_USE_CXX11_ABI=0)

(from: https://stackoverflow.com/questions/50867365/what-is-difference-between-add-definitions-and-set-in-cmake-file, more info here: https://blog.conan.io/2016/03/22/From-CMake-syntax-to-libstdc++-ABI-incompatibiliy-migrations-are-always-hard.html)

feixels commented 4 years ago

For me, adding the line provided by @ringdk did not work, but I managed to get it running anyway on Ubuntu 18.04.3 LTS. I installed gcc-4.8.5 and used it instead of the newer gcc that was preinstalled, then recompiled protobuf and the mlserver.so, and it worked fine. (Remove the existing protobuf before installing the one compiled with the older gcc.) My post might be useless for experienced users, but as a noob this would have helped me, so I thought I'd post it.

best regards, felix

flipphillips commented 4 years ago

So I did a recompile à la @fcole under 12.2 and continue to see the same problem. One thing I didn't check is whether the definitions 'survived' the configure step, since I notice that it also goes in and does some ABI-related foolishness.

Next I suppose would be to build a docker / roll back my gcc à la @sprnglf ... ugh.
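
On the question of whether the definition survives the configure step, one rough way to check is to capture the verbose build output mentioned earlier in the thread and search it for the define. This is only a sketch: `flag_survived` is a hypothetical helper and `build.log` is an assumed file name, neither comes from the project.

```shell
# flag_survived: hypothetical helper, succeeds if any line of the given
# build log mentions the old-ABI define.
flag_survived() {
  # -F: match the text literally; -e: needed because the pattern starts with '-'
  grep -qF -e '-D_GLIBCXX_USE_CXX11_ABI=0' "$1"
}

# Assumed usage, run from the build directory:
#   make VERBOSE=1 > build.log 2>&1
#   if flag_survived build.log; then
#     echo "define reached the compiler command line"
#   else
#     echo "define is missing from the compile commands"
#   fi
```

A match only shows that the define appears somewhere in the log. If the build system also appends `-D_GLIBCXX_USE_CXX11_ABI=1` later on the same command line, the later value wins, so it is worth reading the matching lines in full.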