rrwick / Unicycler

hybrid assembly pipeline for bacterial genomes
GNU General Public License v3.0

Call to unicycler fails when loading C++ functions #42

Closed flass closed 7 years ago

flass commented 7 years ago

Hi, I just installed Unicycler using the following options:

python3 setup.py install --prefix=$HOME/.local --makeargs "CXX=/share/apps/gcc-6.2.0/bin/g++"

Now when I try a simple unicycler -h, here is what I get (not nice):

Traceback (most recent call last):
  File "/home/ucbtass/bin/unicycler", line 11, in <module>
    load_entry_point('unicycler==0.4.0', 'console_scripts', 'unicycler')()
  File "/share/apps/python-3.4.2/lib/python3.4/site-packages/pkg_resources/__init__.py", line 560, in load_entry_point
    return get_distribution(dist).load_entry_point(group, name)
  File "/share/apps/python-3.4.2/lib/python3.4/site-packages/pkg_resources/__init__.py", line 2648, in load_entry_point
    return ep.load()
  File "/share/apps/python-3.4.2/lib/python3.4/site-packages/pkg_resources/__init__.py", line 2302, in load
    return self.resolve()
  File "/share/apps/python-3.4.2/lib/python3.4/site-packages/pkg_resources/__init__.py", line 2308, in resolve
    module = __import__(self.module_name, fromlist=['__name__'], level=0)
  File "/home/ucbtass/.local/lib/python3.4/site-packages/unicycler/unicycler.py", line 24, in <module>
    from .assembly_graph import AssemblyGraph
  File "/home/ucbtass/.local/lib/python3.4/site-packages/unicycler/assembly_graph.py", line 20, in <module>
    from .assembly_graph_segment import Segment
  File "/home/ucbtass/.local/lib/python3.4/site-packages/unicycler/assembly_graph_segment.py", line 19, in <module>
    from .bridge_long_read import LongReadBridge
  File "/home/ucbtass/.local/lib/python3.4/site-packages/unicycler/bridge_long_read.py", line 27, in <module>
    from .path_finding import get_best_paths_for_seq
  File "/home/ucbtass/.local/lib/python3.4/site-packages/unicycler/path_finding.py", line 23, in <module>
    from .cpp_wrappers import fully_global_alignment, path_alignment
  File "/home/ucbtass/.local/lib/python3.4/site-packages/unicycler/cpp_wrappers.py", line 28, in <module>
    C_LIB = CDLL(SO_FILE_FULL)
  File "/share/apps/python-3.4.2/lib/python3.4/ctypes/__init__.py", line 351, in __init__
    self._handle = _dlopen(self._name, mode)
OSError: /usr/lib64/libstdc++.so.6: version `GLIBCXX_3.4.20' not found (required by /home/ucbtass/.local/lib/python3.4/site-packages/unicycler/cpp_functions.so)

Any idea how to fix this? Many thanks, F

rrwick commented 7 years ago

This feels similar to issue #44 which I just commented on. As I said in that thread, I'm not too savvy with this sort of thing, so I may struggle to troubleshoot.

Could you try the suggestions I gave in #44? Specifically, clone Unicycler and run make to verify that the compilation is fine. Then try to run it without installation. If that all works, then the problem may be with the installation itself.

Ryan

flass commented 7 years ago

Hi Ryan, I tried rebuilding the C++ code from scratch by cloning and running make CXX=/share/apps/gcc-5.2.0/bin/g++, and that gave me an error at this step:

/share/apps/gcc-5.2.0/bin/g++ -std=c++14 -Iunicycler/include -fPIC -lrt -lpthread -O3 -DNDEBUG -Wall -Wextra -pedantic -march=native -shared -lz -Wl,-soname,unicycler/cpp_functions.so -o unicycler/cpp_functions.so unicycler/src/consensus_align.o unicycler/src/global_align.o unicycler/src/kmers.o unicycler/src/miniasm/asg.o unicycler/src/miniasm/asm.o unicycler/src/miniasm/dotter.o unicycler/src/miniasm/hit.o unicycler/src/miniasm/paf.o unicycler/src/miniasm/sdict.o unicycler/src/miniasm/sys.o unicycler/src/miniasm_assembly.o unicycler/src/minimap/bseq.o unicycler/src/minimap/index.o unicycler/src/minimap/kthread.o unicycler/src/minimap/map.o unicycler/src/minimap/misc.o unicycler/src/minimap/sdust.o unicycler/src/minimap/sketch.o unicycler/src/minimap_align.o unicycler/src/overlap_align.o unicycler/src/path_align.o unicycler/src/random_alignments.o unicycler/src/ref_seqs.o unicycler/src/scoredalignment.o unicycler/src/scrub.o unicycler/src/semi_global_align.o unicycler/src/semi_global_align_exhaustive.o unicycler/src/start_end_align.o unicycler/src/string_functions.o
/usr/bin/ld: unrecognized option '-plugin'
/usr/bin/ld: use the --help option for usage information
collect2: error: ld returned 1 exit status
make: *** [unicycler/cpp_functions.so] Error 1

thus indeed hinting at dynamic linking...

Interestingly, if I redo that with a later version of g++ (make CXX=/share/apps/gcc-6.2.0/bin/g++), I get no error during the make. I note that with the -lz option specified it should be searching for libz.so in my lib path, and I don't think it should be able to find it by itself.

So I also tried with indicating the path to libz.so using make CXX=/share/apps/gcc-6.2.0/bin/g++ CXXFLAGS=-L/share/apps/zlib-1.2.8/lib/libz.so

In all cases (with or without the linking error, and with or without specifying the path to libz.so), if I try to run the script after building the C++ functions, I get the same error as previously reported:

python3 unicycler-runner.py
Traceback (most recent call last):
  File "unicycler-runner.py", line 18, in <module>
    from unicycler.unicycler import main
  File "/home/ucbtass/Programs/Unicycler/unicycler/unicycler.py", line 24, in <module>
    from .assembly_graph import AssemblyGraph
  File "/home/ucbtass/Programs/Unicycler/unicycler/assembly_graph.py", line 20, in <module>
    from .assembly_graph_segment import Segment
  File "/home/ucbtass/Programs/Unicycler/unicycler/assembly_graph_segment.py", line 19, in <module>
    from .bridge_long_read import LongReadBridge
  File "/home/ucbtass/Programs/Unicycler/unicycler/bridge_long_read.py", line 27, in <module>
    from .path_finding import get_best_paths_for_seq
  File "/home/ucbtass/Programs/Unicycler/unicycler/path_finding.py", line 23, in <module>
    from .cpp_wrappers import fully_global_alignment, path_alignment
  File "/home/ucbtass/Programs/Unicycler/unicycler/cpp_wrappers.py", line 28, in <module>
    C_LIB = CDLL(SO_FILE_FULL)
  File "/share/apps/python-3.4.2/lib/python3.4/ctypes/__init__.py", line 351, in __init__
    self._handle = _dlopen(self._name, mode)
OSError: /usr/lib64/libstdc++.so.6: version `GLIBCXX_3.4.20' not found (required by /home/ucbtass/Programs/Unicycler/unicycler/cpp_functions.so)

rrwick commented 7 years ago

Hi Florent,

I've just released v0.4.1 which, among other bug fixes, changes the order of flags in the Makefile (d8884bd74ef7e5315a58c84f29ace115bb5dc905). This change fixed some similar issues discussed here: #18. Can you try this most recent version and see if that helps?

Ryan

flass commented 7 years ago

Hi Ryan, unfortunately, it did not fix the bug. I tried calling the program after doing a git pull, make distclean and make, and also after wiping out the folder and re-installing from scratch, both with and without the make argument CXXFLAGS=-L/share/apps/zlib-1.2.8/lib/libz.so. In all cases I still hit the same bug.

shyamrallapalli commented 7 years ago

Hello both, I came across a similar problem and here is what I understand so far (I am not an expert, by the way).

I have the Anaconda 4.2 Python distribution, which has gcc version 4.8.5, so I provided the CXX path for gcc 4.9.3. The C++ files failed to compile with a numeric limits issue (which I have fixed using Google; I will open a new issue on it later, #48). Now when I call unicycler, it fails with the following error:

OSError: anaconda/4.2/anaconda3/bin/../lib/libstdc++.so.6: version `CXXABI_1.3.8' not found (required by Unicycler-0.4.1/unicycler/cpp_functions.so)

If I run strings on the libstdc++.so.6 it is looking at, I get:

strings anaconda/4.2/anaconda3/bin/../lib/libstdc++.so.6 | grep CXXABI
CXXABI_1.3
CXXABI_1.3.1
CXXABI_1.3.2
CXXABI_1.3.3
CXXABI_1.3.4
CXXABI_1.3.5
CXXABI_1.3.6
CXXABI_1.3.7
CXXABI_TM_1

Anaconda's libstdc++.so.6 doesn't have CXXABI_1.3.8. That version is required because I compiled with the CXX path set to gcc 4.9.3, whose libstdc++.so.6 does have CXXABI_1.3.8:

strings gcc/4.9.3/lib/libstdc++.so.6 | grep CXXABI
CXXABI_1.3
CXXABI_1.3.1
CXXABI_1.3.2
CXXABI_1.3.3
CXXABI_1.3.4
CXXABI_1.3.5
CXXABI_1.3.6
CXXABI_1.3.7
CXXABI_1.3.8
CXXABI_TM_1

So the possible fixes would be:

  1. Replace libstdc++.so.6 in the Anaconda lib directory with the one from gcc 4.9.3 (I think this is a dirty fix, and I won't be able to do it on our HPC)
  2. Somehow make Python look in the gcc lib path as well
  3. Install Anaconda with gcc 4.9.1 and install Unicycler

@flass, it looks like you have the same issue: unicycler is looking at /usr/lib64/libstdc++.so.6 while it should also look in /share/apps/gcc-6.2.0/lib/

Let me know if I am mistaken.
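The strings and grep check above can also be sketched in Python. This is only an illustration, not part of Unicycler: the names TAG_PATTERN, version_tags and version_tags_in_file are hypothetical, and the sketch simply searches the raw bytes of a library file for version tags.

```python
import re
from pathlib import Path

# Matches numeric version tags such as b"GLIBCXX_3.4.20" or b"CXXABI_1.3.8".
# Non-numeric tags such as CXXABI_TM_1 are deliberately not matched.
TAG_PATTERN = re.compile(rb"(?:GLIBCXX|CXXABI)_\d+(?:\.\d+)*")


def version_tags(data):
    """Return the set of GLIBCXX_* and CXXABI_* tags found in raw bytes."""
    return {match.decode("ascii") for match in TAG_PATTERN.findall(data)}


def version_tags_in_file(path):
    """Read a shared library from disk and return its version tags."""
    return version_tags(Path(path).read_bytes())
```

Calling version_tags_in_file on each candidate libstdc++.so.6 and comparing the two sets should show which tags one copy has and the other lacks.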

flass commented 7 years ago

Hi Shyam,

Thanks, it seems you've nailed the core of the problem. Using strings to check which GLIBCXX versions each libstdc++ supports, it appears that /usr/lib64/libstdc++.so.6 supports up to GLIBCXX_3.4.13, while /share/apps/gcc-6.2.0/lib/libstdc++.so.6 supports up to GLIBCXX_3.4.22 (and GLIBCXX_3.4.20 seems to be required).
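When comparing tags like these, note that plain string sorting would rank GLIBCXX_3.4.9 above GLIBCXX_3.4.13. A minimal sketch of a numeric comparison follows. The helper names are hypothetical, the tag list is illustrative (only its highest entry, GLIBCXX_3.4.13, comes from this thread), and the comparison assumes the tags are cumulative, as they appear to be in the listings above.

```python
import re

TAG = re.compile(r"^(GLIBCXX|CXXABI)_(\d+(?:\.\d+)*)$")


def parse_tag(tag):
    """Return (prefix, version tuple), or None for tags like CXXABI_TM_1."""
    match = TAG.match(tag)
    if match is None:
        return None
    return match.group(1), tuple(int(p) for p in match.group(2).split("."))


def highest_version(tags, prefix):
    """Highest numeric version among tags with the given prefix, or None."""
    parsed = [parse_tag(t) for t in tags]
    versions = [p[1] for p in parsed if p is not None and p[0] == prefix]
    return max(versions) if versions else None


# Illustrative tag list; only the top entry is taken from the thread.
system_tags = ["GLIBCXX_3.4", "GLIBCXX_3.4.9", "GLIBCXX_3.4.13"]
required = parse_tag("GLIBCXX_3.4.20")[1]
print(highest_version(system_tags, "GLIBCXX") >= required)
```

Comparing integer tuples keeps 3.4.13 above 3.4.9, which is what the version numbering intends.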

Regarding your proposed fixes, I think it would be simpler for option 1) to just build Unicycler independently (i.e. without using the Python installer). The crux of the solution would be option 2): telling Python to look in the right path for the dynamic library when executing the program. Here I am just hypothesizing, as I am not a pro either, but maybe this can be set in one of the Python modules? It seems that cpp_wrappers.py specifies the C library to use (excerpt from the traceback above):

   File "/home/ucbtass/Programs/Unicycler/unicycler/cpp_wrappers.py", line 28, in <module>
    C_LIB = CDLL(SO_FILE_FULL)

rrwick commented 7 years ago

Thanks @shyamrallapalli and @flass for the help figuring this one out! You both say you're not experts, but neither am I for this C++ linking stuff. It's the blind leading the blind :smile:

I'm also not an expert in Anaconda, but to me the problem seems to be that you're building Unicycler in an environment without GCC 4.9 (but successfully building because you're manually giving a compiler path) and then trying to run it in that environment. So I think the most elegant solution is Shyam's 3rd option: get GCC 4.9 in Anaconda. If that's all in place, then you should be able to build it and run it in there.

Or alternatively, skip Anaconda altogether - that's what I do. You don't have to install Unicycler to run it (instructions here).

However, Florent, you didn't mention anything about Anaconda. Are you using it?

Here's another possible solution: could you set the LD_LIBRARY_PATH environment variable to point to the directory with the correct version of the library (e.g. /share/apps/gcc-6.2.0/lib/)?

flass commented 7 years ago

Hi Ryan, Shyam, no, I'm not using Anaconda, but I face the same problem as those using it: it leads Python to look in a defined path for linked libs, which does not cover the one the C shared object was built with. I tried to explicitly make the C shared object while specifying the right LD_LIBRARY_PATH, using: make CXX="LD_LIBRARY_PATH=/share/apps/gcc-6.2.0/lib/ /share/apps/gcc-6.2.0/bin/g++", which will call the compiler in the following way:

LD_LIBRARY_PATH=/share/apps/gcc-6.2.0/lib/ /share/apps/gcc-6.2.0/bin/g++ -std=c++14 -Iunicycler/include -fPIC -lrt -lpthread -O3 -DNDEBUG -Wall -Wextra -pedantic -mtune=native -c -o unicycler/src/consensus_align.o unicycler/src/consensus_align.cpp
[...]
LD_LIBRARY_PATH=/share/apps/gcc-6.2.0/lib/ /share/apps/gcc-6.2.0/bin/g++ -std=c++14 -Iunicycler/include -fPIC -lrt -lpthread -O3 -DNDEBUG -Wall -Wextra -pedantic -mtune=native -Wl,-soname,unicycler/cpp_functions.so -o unicycler/cpp_functions.so unicycler/src/consensus_align.o unicycler/src/global_align.o unicycler/src/kmers.o unicycler/src/miniasm/asg.o unicycler/src/miniasm/asm.o unicycler/src/miniasm/dotter.o unicycler/src/miniasm/hit.o unicycler/src/miniasm/paf.o unicycler/src/miniasm/sdict.o unicycler/src/miniasm/sys.o unicycler/src/miniasm_assembly.o unicycler/src/minimap/bseq.o unicycler/src/minimap/index.o unicycler/src/minimap/kthread.o unicycler/src/minimap/map.o unicycler/src/minimap/misc.o unicycler/src/minimap/sdust.o unicycler/src/minimap/sketch.o unicycler/src/minimap_align.o unicycler/src/overlap_align.o unicycler/src/path_align.o unicycler/src/random_alignments.o unicycler/src/ref_seqs.o unicycler/src/scoredalignment.o unicycler/src/scrub.o unicycler/src/semi_global_align.o unicycler/src/semi_global_align_exhaustive.o unicycler/src/start_end_align.o unicycler/src/string_functions.o -shared -lz

But that does not change anything, as the problem lies in the Python ctypes function CDLL() (called in cpp_wrappers.py) looking in the wrong library search path.

For Anaconda users, it seems that the issue is easily fixed by reinstalling Anaconda with updated libs, cf. https://github.com/openai/gym/issues/543. However, I can't do that as I use a shared HPC environment which I don't administer. I could build my own Python with the right dependency, but that seems to me a particularly heavy and non-generic solution to this problem.

Sadly, I've been looking around in forums and could not find a way to tell Python to use an alternate lib search path when calling CDLL() ... Any ideas?
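One approach sometimes used for this kind of question is to load the newer libstdc++ by absolute path before the extension is loaded. The sketch below is an untested assumption for this particular setup, not something confirmed in this thread: the function name is hypothetical, the candidate paths are simply the gcc-6.2.0 directories mentioned in the thread, and it can only help if no other libstdc++ has already been loaded into the Python process.

```python
import ctypes
import os


def preload_first_available(candidates):
    """Try to load each candidate library globally; return the path that worked."""
    for path in candidates:
        if not os.path.isfile(path):
            continue
        try:
            ctypes.CDLL(path, mode=ctypes.RTLD_GLOBAL)
        except OSError:
            # Wrong architecture or unmet dependencies: try the next one.
            continue
        return path
    return None


# Assumed locations, based on the directories named in this thread.
CANDIDATES = [
    "/share/apps/gcc-6.2.0/lib64/libstdc++.so.6",
    "/share/apps/gcc-6.2.0/lib/libstdc++.so.6",
]

loaded = preload_first_available(CANDIDATES)
print("preloaded:", loaded)
# If a library was preloaded, a later ctypes.CDLL() call on cpp_functions.so
# may bind to that copy instead of the one in /usr/lib64.
```

The preload would have to run before cpp_wrappers.py is imported, which in practice means a small wrapper script rather than a change to Unicycler itself.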

shyamrallapalli commented 7 years ago

Hi Florent and Ryan, Sorry anaconda was my issue.

Florent, compile as you have been doing:

make CXX=/share/apps/gcc-6.2.0/bin/g++

Then update LD_LIBRARY_PATH before calling the unicycler script:

export LD_LIBRARY_PATH=/share/apps/gcc-6.2.0/lib64:/share/apps/gcc-6.2.0/lib/:$LD_LIBRARY_PATH

./unicycler-runner.py

I think this should fix your issue.

I was actually doing this before posting here, but was only providing the gcc lib path. I just realized that I should include both the lib and lib64 paths.
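The same idea can be wrapped in a small Python launcher. This is a sketch under assumptions: the helper names are hypothetical and the directories in the usage comment are the ones from the export line above. The variable is set for a child process because the dynamic loader reads LD_LIBRARY_PATH when a process starts, so changing os.environ inside an already running interpreter would not be expected to affect a later CDLL() call.

```python
import os
import subprocess


def with_library_dirs(env, dirs):
    """Return a copy of env with dirs prepended to LD_LIBRARY_PATH."""
    updated = dict(env)
    existing = updated.get("LD_LIBRARY_PATH", "")
    parts = list(dirs) + ([existing] if existing else [])
    updated["LD_LIBRARY_PATH"] = os.pathsep.join(parts)
    return updated


def run_with_library_dirs(command, dirs):
    """Run command in a child process that sees the extended search path."""
    env = with_library_dirs(os.environ, dirs)
    return subprocess.run(command, env=env).returncode


# Example call (not executed here):
# run_with_library_dirs(
#     ["./unicycler-runner.py", "-h"],
#     ["/share/apps/gcc-6.2.0/lib64", "/share/apps/gcc-6.2.0/lib"],
# )
```

Unlike the plain export line, this avoids leaving a trailing colon when LD_LIBRARY_PATH was previously unset; an empty entry is generally treated as the current directory.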

flass commented 7 years ago

Hi Shyam, thanks very much, it fixed the issue! I had tried exporting the completed LD_LIBRARY_PATH before but did not add the lib64/ path, and indeed that was the missing bit. Thanks again to both!

rrwick commented 7 years ago

I'm glad to hear it!