chanzuckerberg / shasta

[MOVED] Moved to paoloshasta/shasta. De novo assembly from Oxford Nanopore reads

ld ERROR when compiled #127

Closed Axolotl233 closed 4 years ago

Axolotl233 commented 4 years ago

Hello, when I compiled from source code, some errors occurred and the compile aborted. Here are the messages on STDERR:

[ 49%] Built target shastaStaticLibrary
[ 50%] Linking CXX executable shasta
/usr/bin/ld: cannot find -latomic
/usr/bin/ld: cannot find -lboost_system
/usr/bin/ld: cannot find -lboost_program_options
/usr/bin/ld: cannot find -lboost_chrono
/usr/bin/ld: cannot find -lpng
/usr/bin/ld: cannot find -lz
/usr/bin/ld: cannot find -lpthread
/usr/bin/ld: cannot find -lstdc++
/usr/bin/ld: cannot find -lm
/usr/bin/ld: cannot find -lc
collect2: error: ld returned 1 exit status
make[2]: [staticExecutable/CMakeFiles/shastaStaticExecutable.dir/build.make:85:staticExecutable/shasta] Error 1
make[1]: [CMakeFiles/Makefile2:146:staticExecutable/CMakeFiles/shastaStaticExecutable.dir/all] Error 2
make: *** [Makefile:130: all] Error 2

After checking, I am sure all the needed library files are installed in /lib64. I cannot change the value of LD_LIBRARY_PATH, possibly because of how the administrator set up the machine. How can I solve this problem?

The system information is: Linux version 4.18.0-80.11.2.el8_0.x86_64 (mockbuild@kbuilder.bsys.centos.org) (gcc version 8.2.1 20180905 (Red Hat 8.2.1-3) (GCC)) #1 SMP Tue Sep 24 11:32:19 UTC 2019, with CentOS Linux release 8.0.1905 (Core).

Axolotl233 commented 4 years ago

Now I can change LD_LIBRARY_PATH, but the error still exists.

paoloczi commented 4 years ago

The Shasta code currently only supports building on Ubuntu. However, the static executable built on Ubuntu runs on most current Linux platforms including CentOS. To get a static executable built on Ubuntu you have three choices:

If after this you are still unable to run on CentOS, please let me know.

Axolotl233 commented 4 years ago

I will test the static executable shasta 0.4.0 soon, thank you.

Axolotl233 commented 4 years ago

Sorry to bother you again. I have tested the static executable shasta-Linux, but I still get an error. I thought the error might be caused by user privileges, but it still occurs when I run as root. Here is the STDERR of Shasta; I only kept the lines near the error:

2020-May-15 17:30:20.157127 writeGfa1 begins
2020-May-15 17:30:21.842671 writeGfa1 ends
2020-May-15 17:30:22.036400 writeGfa1BothStrands begins
2020-May-15 17:30:25.412934 writeGfa1BothStrands ends
2020-May-15 17:30:26.095835 writeFasta begins
2020-May-15 17:30:27.795116 writeFasta ends
Assembly time statistics:
Elapsed seconds: 245.992
Elapsed minutes: 4.09986
Elapsed hours: 0.068331
Average CPU utilization: 0.612532
This run used options "--memoryBacking 4K --memoryMode anonymous".
This could have resulted in performance degradation.
For full performance, use "--memoryBacking 2M --memoryMode filesystem" (root privilege via sudo required).
Therefore the results of this run should not be used for benchmarking purposes.
Shasta Release 0.4.0
./shasta-Linux-0.4.0 --input 11779.27s user 282.12s system 4763% cpu 4:13.22 total
root@lz34# whoami
root

Could you help me again? I am confident Shasta can solve the problem that has been haunting me.

paoloczi commented 4 years ago

There is no error here. That run completed normally. You are just getting a warning that, because of the options you are using, you are getting less than top performance.

Since you have root privilege, you can run your assembly with the following options to get full performance:

--memoryMode filesystem --memoryBacking 2M

This will not make a big difference in performance for the small assembly you are testing with, but the difference could be significant in a large assembly.

Those options allocate Shasta data structures on Linux huge pages (2 MB pages) backed by the hugetlbfs filesystem. For various reasons, Shasta does not free that memory when the assembly terminates. To free it, you have to use the following command:

shasta --command cleanupBinaryData --assemblyDirectory myRun

Here, make sure to replace myRun with the name of the assembly directory you specified when running the assembly (default ShastaRun). See here for more information on Shasta memory modes.
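As a minimal sketch of how the two steps above fit together, the following Python snippet builds the argument lists for a full-performance run and for the matching cleanup. It only uses options quoted in this thread; the helper names, the executable name, and the input file name are hypothetical, and the snippet prints the commands rather than running them.

```python
import shlex


def assembly_command(executable, input_path, assembly_directory="ShastaRun"):
    """Argument list for an assembly using huge pages (needs root privilege)."""
    return [
        executable,
        "--input", input_path,
        "--assemblyDirectory", assembly_directory,
        "--memoryMode", "filesystem",
        "--memoryBacking", "2M",
    ]


def cleanup_command(executable, assembly_directory="ShastaRun"):
    """Argument list that frees the huge-page memory left behind by the run."""
    return [
        executable,
        "--command", "cleanupBinaryData",
        "--assemblyDirectory", assembly_directory,
    ]


# Hypothetical executable and file names, for illustration only.
run = assembly_command("shasta", "reads.fasta", "myRun")
cleanup = cleanup_command("shasta", "myRun")

# Print the commands; pass the lists to subprocess.run to actually execute them.
print(shlex.join(run))
print(shlex.join(cleanup))
```

Using the same assembly directory in both helpers is the point of the sketch: the cleanup has to target the directory that the assembly actually used.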

Axolotl233 commented 4 years ago

Thank you! Following your guide, I no longer get any warning. But when I test Shasta with Arabidopsis thaliana nanopore data ERR2173373, ~27x (3.4GB), the assembly result does not seem correct:

Total length of assembled sequence is 83347597
N50 for assembly segments is 282126

I get a 119 Mb assembly when using WTDBG2 with the same data. Could the low coverage of the data lead to this odd result?

paoloczi commented 4 years ago

Shasta default assembly parameters are tuned for nanopore reads at coverage around 60x, and usually work well between 40x and 80x. There is also a configuration file shasta/conf/Nanopore-Dec2019.conf which might give better results than default parameters, but is also for coverage around 60x. To pass a configuration file to Shasta, use --conf /absolutePathToConfigurationFile (a relative path is not accepted).

Your coverage is lower, so you will have to tune assembly parameters. I don't have a configuration file for low coverage, as I have never had the time to experiment with assemblies at low coverage. Fortunately nanopore reads are inexpensive enough that obtaining a good amount of coverage is usually practical, particularly for a small genome like this one.
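A quick way to check whether a dataset falls in that range is to estimate coverage as total read bases divided by genome size. The sketch below is only illustrative: the 40x to 80x bounds are the ones quoted above, the function names are made up for this example, and the read and genome sizes in the usage lines are hypothetical round numbers, not measurements from this issue.

```python
# Coverage range in which the default parameters usually work well,
# as stated in the comment above.
TUNED_MIN_COVERAGE = 40.0
TUNED_MAX_COVERAGE = 80.0


def estimate_coverage(total_read_bases, genome_size):
    """Approximate coverage: total bases in the reads divided by genome size."""
    return total_read_bases / genome_size


def within_tuned_range(coverage):
    """True when coverage lies in the range the defaults are tuned for."""
    return TUNED_MIN_COVERAGE <= coverage <= TUNED_MAX_COVERAGE


# Hypothetical numbers: 6 Gb of reads over a 100 Mb genome.
example = estimate_coverage(6_000_000_000, 100_000_000)
print(example, within_tuned_range(example))

# The coverage reported in this issue (about 27x) falls below the range.
print(within_tuned_range(27.0))
```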

If you post the following from your assembly I can help:

Also, please confirm what read technology you are using (nanopore, other?). I have heard of somebody obtaining a successful assembly of Arabidopsis thaliana using Shasta.

Axolotl233 commented 4 years ago

Ath_test.zip

Now I have tested Shasta with 77x nanopore reads (ERR2228557.1), and there is a big improvement: total length is 109 Mb and N50 is 1.99 Mb. But there is still a difference compared with wtdbg2.

The log file is uploaded in the attachment (Ath.log). Do you have any advice for improving the assembly further?

paoloczi commented 4 years ago

The number of alignments found is on the low side. It looks like you used default assembly parameters. Assuming these are nanopore reads, you would probably get better results using Shasta configuration file shasta/conf/Nanopore-Dec2019.conf. Use --conf /absolutePathToConfigurationFile to pass a configuration file to Shasta (a relative path is not accepted). For more information on Shasta configuration files, see here.
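Since --conf does not accept a relative path, one option is to resolve the path before building the command line. This is a small sketch using Python's standard pathlib; the helper name is hypothetical, and the relative path in the usage line assumes the command is launched from the directory that contains the shasta checkout.

```python
from pathlib import Path


def conf_arguments(conf_path):
    """Return the --conf option with the path converted to an absolute path."""
    absolute = Path(conf_path).expanduser().resolve()
    return ["--conf", str(absolute)]


# The relative path is resolved against the current working directory.
args = conf_arguments("shasta/conf/Nanopore-Dec2019.conf")
print(args)
```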

If your assembly is still not satisfactory, please attach the same output and we can see what needs to be tweaked (that configuration file is tuned for human assemblies).

Shasta is generally conservative, and tends to assemble less sequence than other assemblers, but at better accuracy. See our paper for some discussion on this.

Axolotl233 commented 4 years ago

Ath.test2.zip

Sorry to bother you again.

I have tested Shasta using the same Ath FASTA file with the configuration file (Nanopore-Dec2019.conf), but it did not seem to work well: only 25834848 bp of genome were assembled, and the N50 was 36144 bp.

As before, the output is uploaded in the attachment. I hope you can explain the effect of each parameter and how to adjust it under different conditions.

paoloczi commented 4 years ago

In this assembly, the LowHash algorithm found only 1011612 alignment candidates for 423183 reads. This is insufficient, as usually for a good assembly you need at least 5-10 alignment candidates per read. So something needs to be changed in the MinHash/LowHash parameters (at least). The assembly log shows that the number of alignment candidates was still increasing when the MinHash iteration terminated, so the first thing to try would be to increase the number of MinHash iterations (currently 10), for example:

--MinHash.minHashIterationCount 50
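To make the shortfall concrete, the numbers quoted above can be turned into a candidates-per-read figure. This is a plain ratio of the two counts from the log; how Shasta itself counts candidates against each read is not spelled out in this thread, so treat the comparison with the 5-10 guideline as approximate. The function name is made up for this example.

```python
def candidates_per_read(candidate_count, read_count):
    """Plain ratio of alignment candidates to reads."""
    return candidate_count / read_count


# Counts quoted from the assembly log discussed above.
ratio = candidates_per_read(1_011_612, 423_183)
print(f"{ratio:.2f} alignment candidates per read")  # prints 2.39

# Lower end of the rough guideline mentioned above.
MIN_RECOMMENDED = 5
print("sufficient" if ratio >= MIN_RECOMMENDED else "insufficient")
```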

MinHash parameters, like alignment parameters, are sensitive to data quality. Can you confirm what type of reads you are using? Are these nanopore reads? If so, what base caller and version was used to generate these reads? The configuration file I pointed you to works well for nanopore reads created by the Guppy 3.0.5 base caller. Earlier versions produce lower quality reads, for which the MinHash and Align parameters in that configuration file might not be adequate and might require tuning.

For a description of assembly parameters more extensive than you get with shasta --help, see here. Some parameters have a question mark enclosed in a rectangle which is a link to the relevant section of the page on computational method. However, this documentation is not as complete as it should be, so feel free to ask questions here.

Axolotl233 commented 4 years ago

Sorry, for personal reasons I cannot test Shasta at the moment. You can close this issue, and I will reply when I have tested Shasta again. Finally, thank you all.

paoloczi commented 4 years ago

Feel free to create a new issue when you get back to it, if you still cannot get a satisfactory assembly.