CartwrightLab / dawg

Simulating Sequence Evolution
GNU General Public License v2.0

error while loading shared libraries: libdawg2.so.1 #46

Closed Morteza-M-Saber closed 6 years ago

Morteza-M-Saber commented 6 years ago

I am trying to install dawg on a supercomputer running Red Hat. Compilation succeeded and produced the dawg and libdawg2.so files, but since I don't have root privileges, I had to pass DESTDIR=/home/user/my_program/bin to the 'make install' command to install it locally. After installation, running './dawg --help' produces the following error:

 ./dawg: error while loading shared libraries: libdawg2.so.1: cannot open shared object file: No such file or directory

It seems that dawg is looking for libdawg2.so.1 in the directory '/usr/local/lib', while the file is actually in '/home/user/my_program/bin/usr/local/lib'.

How can dawg be told where to look for libdawg2.so.1?

reedacartwright commented 6 years ago

The quickest way to get this to work is to set LD_LIBRARY_PATH so that it contains the path of the dawg library:

export LD_LIBRARY_PATH=${HOME}/my_program/bin/usr/local/lib:${LD_LIBRARY_PATH}
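
A slightly more defensive variant of the export above, as a minimal sketch: when LD_LIBRARY_PATH starts out empty, a plain prepend leaves a trailing colon, and the dynamic loader treats an empty entry as the current directory. The helper name prepend_ld_library_path is hypothetical, and the directory is the one from the original report.

```shell
# Prepend a directory to LD_LIBRARY_PATH without creating an empty entry
# and without adding the same directory twice.
prepend_ld_library_path() {
    new_lib_dir=$1
    case ":${LD_LIBRARY_PATH:-}:" in
        *":$new_lib_dir:"*) ;;  # already on the path, nothing to do
        *) LD_LIBRARY_PATH="$new_lib_dir${LD_LIBRARY_PATH:+:$LD_LIBRARY_PATH}" ;;
    esac
    export LD_LIBRARY_PATH
}

# Directory taken from the report above; adjust to the actual install location.
prepend_ld_library_path "$HOME/my_program/bin/usr/local/lib"
```

Putting the function and the call in a shell startup file or a job script makes the setting persist across sessions.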

You also should be able to set CMAKE_INSTALL_PREFIX at the configuration stage and get the correct path added to the RPATH value in the binary.
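
The CMAKE_INSTALL_PREFIX route could look roughly like the sketch below. It assumes CMake 3.13 or newer (for the -S and -B options) and a Dawg source tree with a top-level CMakeLists.txt. The helper names run and install_dawg_to_prefix are hypothetical, and DRY_RUN is only a convenience added here to preview the commands.

```shell
# Run a command, or just print it when DRY_RUN=1.
run() {
    if [ "${DRY_RUN:-0}" = 1 ]; then
        echo "$*"
    else
        "$@"
    fi
}

# usage: install_dawg_to_prefix SOURCE_DIR PREFIX
install_dawg_to_prefix() {
    src_dir=$1    # Dawg source tree (assumed to contain CMakeLists.txt)
    prefix=$2     # user-writable install prefix, for example "$HOME/my_program"
    # Configure with the final prefix so install paths are known at build time.
    run cmake -S "$src_dir" -B "$src_dir/build" \
        -DCMAKE_INSTALL_PREFIX="$prefix" &&
    run cmake --build "$src_dir/build" &&
    run cmake --build "$src_dir/build" --target install
}
```

With the prefix set at configure time, DESTDIR is no longer needed. Running ldd on the installed dawg binary should then show whether libdawg2.so.1 resolves without LD_LIBRARY_PATH.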

Morteza-M-Saber commented 6 years ago

THANK YOU VERY MUCH. This fixed the problem, but now I get this strange error:

./dawg: symbol lookup error: ./dawg: undefined symbol: _ZN5boost15program_options3argB5cxx11E

I am really confused about what this error means. Do you know how to get around it?

reedacartwright commented 6 years ago

This is a trickier problem to solve, and has to do with the dual ABI present in recent versions of GCC (https://gcc.gnu.org/onlinedocs/libstdc++/manual/using_dual_abi.html).

The short solution is that you will need to make sure that you are linking against a version of boost that was compiled with the same compiler you are using for Dawg. Likely the version you are linking against was compiled with the default GCC 4.2 compiler on the RedHat machine.
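
The dual ABI shows up in the symbol name itself: the B5cxx11 fragment in the undefined symbol is the mangled form of GCC's abi:cxx11 tag, which means the dawg binary was built against the new libstdc++ ABI. A minimal sketch of that check, where has_cxx11_abi_tag is a hypothetical helper:

```shell
# Succeed if a mangled C++ symbol name carries GCC's "cxx11" ABI tag.
has_cxx11_abi_tag() {
    case $1 in
        *B5cxx11*) return 0 ;;
        *) return 1 ;;
    esac
}

# The symbol from the error message above.
symbol=_ZN5boost15program_options3argB5cxx11E
if has_cxx11_abi_tag "$symbol"; then
    echo "dawg expects a Boost built with the new (cxx11) libstdc++ ABI"
fi
```

The same pattern can be applied to the Boost library being linked, for example by listing its exported symbols with nm -D --defined-only and searching for B5cxx11 (the library path depends on the system). If nothing matches, that Boost was built with the old ABI. Besides rebuilding Boost with the same compiler, the GCC page linked above describes the _GLIBCXX_USE_CXX11_ABI macro; defining it to 0 when compiling Dawg may be another way to make the two agree, though that has not been tested here.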

Morteza-M-Saber commented 6 years ago

Thanks again. One last question about running the program: the help output ('Root.Seq - A specific root sequence.') says that a user-specified root genome can be given with the Seq parameter. I generated a FASTA file with 2 sequences and then added the following to the input file:

[Root]
Seq = "/home/.../examples/test/test.fasta"

But it seems that the program cannot read sequences from the FASTA file. Replacing the Seq parameter with Length=1000 makes DAWG work properly. What am I doing wrong here?

reedacartwright commented 6 years ago

@zmertens is currently implementing and testing the Root.Seq feature; it will be fixed by next month.

Morteza-M-Saber commented 6 years ago

Thanks for informing me about this. Looking forward to your next update @zmertens

Morteza-M-Saber commented 6 years ago

@zmertens I was just wondering whether the Root.Seq feature is going to be added anytime soon? I badly need it :-\

reedacartwright commented 6 years ago

It will be added this week.

zmertens commented 6 years ago

@Morteza-M-Saber If you want to try out the root sequence patch, you can work off the branch directly. That would give us a chance to get feedback before merging it into develop. You should be able to specify the root sequence in a trick file using Root.Seq = "ACGT"

The quickest way to get to it from your DAWG repo would be: git remote add zmertens https://github.com/zmertens/dawg, then git fetch zmertens, and then: git checkout -b specify_root_sequence zmertens/specify_root_sequence . Or I think you can check out the pull request directly using: git fetch reedacartwright pull/36/head:specify_root_sequence
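
These commands only work inside a git clone of Dawg (a directory unpacked from a source archive has no .git), and the new remote has to be fetched before its branch can be checked out. A minimal sketch that checks for the clone first; checkout_root_seq_branch is a hypothetical helper, while the URLs and branch name are the ones mentioned in this thread:

```shell
# usage: checkout_root_seq_branch CLONE_DIR
checkout_root_seq_branch() {
    clone_dir=$1
    # Refuse to continue unless CLONE_DIR is inside a git work tree.
    if ! (cd "$clone_dir" && git rev-parse --is-inside-work-tree) >/dev/null 2>&1; then
        echo "error: $clone_dir is not a git clone;" \
             "run 'git clone https://github.com/reedacartwright/dawg' first" >&2
        return 1
    fi
    (
        cd "$clone_dir" &&
        git remote add zmertens https://github.com/zmertens/dawg &&
        git fetch zmertens &&
        git checkout -b specify_root_sequence zmertens/specify_root_sequence
    )
}
```

After the checkout, Dawg has to be rebuilt and reinstalled from that clone for the patch to take effect.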

Morteza-M-Saber commented 6 years ago

@zmertens Thanks for the hints. I tried hard to make it work the way you described, but the git commands all give me the same error, as follows:

git remote add zmertens https://github.com/zmertens/dawg
fatal: Not a git repository (or any of the parent directories): .git

And adding Root.Seq='ATCG' instead of a length just returns empty output with my current installation.

Root.Seq is a very useful option. It would be good to consider releasing it early.

Morteza-M-Saber commented 6 years ago

Another strange problem when running dawg is that in some cases, like the following, it apparently uses a lot of RAM.

[Tree]
Tree = "(((((SE001:62.266694,SE015:66.474319):7165.9207,(SE007:4634.5104,SE011:6973.0207):1514.5568):3764.3068,(SE005:2333.9454,SE010:2272.8238):5132.867):7909.666,(SE003:395.203,SE008:356.70274):12108.414):4838.4497,((SE002:5922.2363,((SE006:42.249613,SE014:70.419555):3995.7079,(SE009:2236.7044,(SE012:1369.3953,SE013:5050.2437):1310.5648):549.20299):2972.2655):4238.3904,SE004:9122.1838):10449.388):0;"
[Subst]
Model = GTR
Params = 0.99,0.82,0.37,0.56,0.23,1.0
Freqs = 0.25,0.25,0.25,0.25
Rate.Model = Gamma
Rate.Params = 1.0,0.01
[Indel]
Model = GEO
Rate = 0.0075
Params = 0.01
Max = 10
[Root]
Lenght=10
[Sim]
Reps = 1
Seed = 1234
[Output]
File = dawg_IR1.dawg.fsa

Running this on a supercomputer with 30 GB of available RAM returns the following error:

ERROR: std::bad_alloc

which usually occurs when the program requires more RAM than is available. Do you have any recommendations for avoiding this error?

reedacartwright commented 6 years ago

Your branch lengths are enormous. In dawg branch lengths represent the expected number of substitutions per site, so you are simulating 10,000s of substitutions at every site and 1,000s of indels. Try setting the Tree.Scale parameter to 1e-4 or smaller to control the simulation.
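
To see what a given scale factor does before running a simulation, one can sum the branch lengths in the Newick string and multiply by the factor. A minimal sketch, assuming a shell with grep -o and awk available; scaled_tree_length is a hypothetical helper and not part of dawg:

```shell
# Sum all branch lengths in a Newick string and print the total after scaling.
# usage: scaled_tree_length NEWICK SCALE
scaled_tree_length() {
    printf '%s\n' "$1" |
        grep -oE ':[0-9.]+([eE][-+]?[0-9]+)?' |   # every ":<length>" token
        tr -d ':' |
        awk -v scale="$2" '{ total += $1 } END { printf "%.4f\n", total * scale }'
}

# Toy example: branches of length 1.5 and 2.5, scaled by 0.5.
scaled_tree_length "(A:1.5,B:2.5):0;" 0.5
```

Passing the tree from the report with a scale of 1e-4 gives the expected substitutions per site summed over the whole tree after scaling. In the parameter file, Tree.Scale presumably corresponds to a Scale line in the [Tree] section, by analogy with Root.Seq being written as Seq under [Root].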

zmertens commented 6 years ago

@Morteza-M-Saber #35 is merged into develop now so you can disregard my previous post

Morteza-M-Saber commented 6 years ago

The program works perfectly now. Thank you very much. I just wanted to mention one point that could improve DAWG: it would be great if the program could also take Root.Seq as a FASTA file and return the simulated sequences for each locus in the file, as the AlfSim tool does.

It would save a lot of time. For example, if one wants to simulate 1000 promoter sequences and put each simulated sequence back at its actual coordinates in the genome, 1000 parameter files and 1000 runs are required. If a FASTA file could be used as Root.Seq, with the simulation done for each locus individually, one parameter file and one run would be enough.

Anyway, thank you very much for a great tool.
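
Until something like that exists, the 1000 parameter files can at least be generated automatically. The sketch below is a workaround and not a dawg feature: it splits a FASTA file into one parameter file per sequence, copying a shared template and appending [Root] and [Output] sections in the format used earlier in this thread. The helper name make_locus_configs and the file layout are assumptions. The template should hold the common sections ([Tree], [Subst], and so on) without [Root] or [Output], and sequence names must be usable as file names.

```shell
# usage: make_locus_configs FASTA TEMPLATE OUTDIR
make_locus_configs() {
    fasta=$1
    template=$2
    outdir=$3
    mkdir -p "$outdir" || return 1
    awk -v template="$template" -v outdir="$outdir" '
        # Write OUTDIR/<name>.dawg for the record collected so far.
        function flush_record(    path, line) {
            if (name == "") return
            path = outdir "/" name ".dawg"
            while ((getline line < template) > 0) print line > path
            close(template)
            printf "[Root]\nSeq = \"%s\"\n", seq > path
            printf "[Output]\nFile = %s/%s.fsa\n", outdir, name > path
            close(path)
        }
        /^>/ { flush_record(); name = substr($1, 2); seq = ""; next }
        { seq = seq $0 }          # join wrapped sequence lines
        END { flush_record() }
    ' "$fasta"
}
```

Each generated file can then be passed to dawg in a loop, so only the template has to be maintained by hand.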