chanzuckerberg / shasta

[MOVED] Moved to paoloshasta/shasta. De novo assembly from Oxford Nanopore reads

Illegal instruction (was: cmake error - build from source on linux centOS6) #157

Closed zhenzhenyang-psu closed 4 years ago

zhenzhenyang-psu commented 4 years ago

Hi Chan, I got an error from cmake while trying to install from source. I have installed all the prerequisite packages listed on your website and followed the instructions at https://chanzuckerberg.github.io/shasta/BuildingFromSource.html

module load apps/cmake/3.7.0-rc3
module load compiler/gnu/5.5.0
module load apps/glib/2.14
conda activate graphviz

cmake -DCMAKE_C_COMPILER=/public/software/compiler/gnu/5.5.0/bin/gcc -DCMAKE_CXX_COMPILER=/public/software/compiler/gnu/5.5.0/bin/g++ ../shasta_github

cmake -DCMAKE_C_COMPILER=/public/software/compiler/gnu/5.5.0/bin/gcc -DCMAKE_CXX_COMPILER=/public/software/compiler/gnu/5.5.0/bin/g++ ../shasta
-- The C compiler identification is GNU 5.5.0
-- The CXX compiler identification is GNU 5.5.0
-- Check for working C compiler: /public/software/compiler/gnu/5.5.0/bin/gcc
-- Check for working C compiler: /public/software/compiler/gnu/5.5.0/bin/gcc -- works
-- Detecting C compiler ABI info
-- Detecting C compiler ABI info - done
-- Detecting C compile features
-- Detecting C compile features - done
-- Check for working CXX compiler: /public/software/compiler/gnu/5.5.0/bin/g++
-- Check for working CXX compiler: /public/software/compiler/gnu/5.5.0/bin/g++ -- works
-- Detecting CXX compiler ABI info
-- Detecting CXX compiler ABI info - done
-- Detecting CXX compile features
-- Detecting CXX compile features - done
-- CMAKE_SYSTEM_NAME is Linux
-- MACOS is OFF
cat: /etc/os-release: No such file or directory
-- BUILD_USE_SPOA_WITH_CPU_DISPATCH is OFF
-- BUILD_STATIC_LIBRARY is ON
-- BUILD_STATIC_EXECUTABLE is ON
-- BUILD_DYNAMIC_LIBRARY is ON
-- BUILD_DYNAMIC_EXECUTABLE is ON
-- BUILD_APPIMAGE is OFF
-- BUILD_NATIVE is OFF
-- BUILD_DEBUG is OFF
-- BUILD_ID is: Shasta development build. This is not a released version.
CMake Error at dynamicLibrary/CMakeLists.txt:74 (string):
  string sub-command STRIP requires two arguments.

-- Configuring incomplete, errors occurred! See also "/public/home/yangzhzh/tools_zz/shasta-build/CMakeFiles/CMakeOutput.log".

Do you have any idea why? I am not sure whether the message 'cat: /etc/os-release: No such file or directory' has something to do with this.

thanks a lot, zhenzhen

zhenzhenyang-psu commented 4 years ago

By the way, here is the CMakeOutput.log file (attachment: CMakeOutput.log). Thanks!

paoloczi commented 4 years ago

As you probably noted in the Shasta documentation on building from source, the only Linux system on which we support building the code is Ubuntu. The static executable built on Ubuntu 20.04 (distributed in the latest release as shasta-Linux-0.5.0) runs on most current Linux distributions. For Linux distributions that use older kernels, the static executable built on Ubuntu 16.04 (distributed in the latest release as shasta-OldLinux-0.5.0) usually works without problems. This was not the case for you, however, and we will investigate why, and hopefully provide an executable capable of running on your CentOS 6 system. For that, it would help if you post the full output of uname -a on that system.

Porting to CentOS 6 is a non-trivial project because of the Shasta dependencies. The message you posted seems to indicate an old cmake version. Shasta has several dependencies, and you will need to make sure they are all available on CentOS 6 before you can build there. In Ubuntu, we use shasta/scripts/InstallPrerequisites-Ubuntu.sh to install all necessary prerequisites. If you want to pursue porting to CentOS 6, you can use that script as a guide to the necessary requirements. However, I don't recommend doing that. Instead, give us a couple of days to investigate why shasta-OldLinux-0.5.0 did not work for you and provide a solution.

zhenzhenyang-psu commented 4 years ago

Hi Chan, here is the output of 'uname -a':
Linux HPC-login 2.6.32-504.el6.x86_64 #1 SMP Wed Oct 15 04:27:16 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux

Yes, I actually followed shasta/scripts/InstallPrerequisites-macos/ubuntu.sh and installed all the prerequisite packages. It took me a while, but I was able to install all of them. Yet the build still failed.

While writing this email, I tried './shasta-OldLinux-0.5.0 -h', and surprisingly it seems to work: it prints the usage message without reporting any error. I will tentatively assume it is working. I will let you know if I run into additional package-related issues while running it, which I hope won't happen.

thanks a lot for your answer and patience.

best, zhenzhen

zhenzhenyang-psu commented 4 years ago

However, during the run, it still reported an error:
/opt/gridview//pbs/dispatcher/mom_priv/jobs/1764342.node1.SC: line 17: 54293 Illegal instruction (core dumped)

So shasta-OldLinux-0.5.0 might still have problems.
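
An "Illegal instruction" message means the process was killed by the SIGILL signal, which typically happens when an executable contains a CPU instruction that the machine's processor does not implement. The following is a minimal Python sketch of how a wrapper script could recognize that case from an exit code. The helper names `describe_exit` and `run_and_report` are hypothetical and are not part of Shasta.

```python
import signal
import subprocess
from typing import Sequence


def describe_exit(returncode: int) -> str:
    """Translate a process return code into a readable description."""
    if returncode < 0:
        # On POSIX, subprocess reports death-by-signal as the negated signal number.
        try:
            return f"killed by {signal.Signals(-returncode).name}"
        except ValueError:
            return f"killed by unknown signal {-returncode}"
    if returncode > 128:
        # Shells conventionally report death-by-signal as 128 + signal number.
        try:
            name = signal.Signals(returncode - 128).name
            return f"exit status {returncode} (possibly {name})"
        except ValueError:
            pass
    return f"exit status {returncode}"


def run_and_report(command: Sequence[str]) -> str:
    """Run a command (hypothetical helper) and describe how it ended."""
    completed = subprocess.run(list(command), check=False)
    return describe_exit(completed.returncode)


# Example call, using the binary name mentioned in this thread:
# print(run_and_report(["./shasta-OldLinux-0.5.0", "-h"]))
```

A result mentioning SIGILL points at a mismatch between the instruction set the binary was built for and the one the CPU provides, rather than at a problem with the input data.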

paoloczi commented 4 years ago

Please post the complete assembly log (stdout) up to the point of failure, or at the very least the last 20-40 lines of output.

zhenzhenyang-psu commented 4 years ago

Here is the output file (PBS_shasta_oldLinux2.o1764341.txt) and a screenshot of the files under the directory 'ShastaRun' (Screen Shot 2020-06-22 at 12 31 00 PM). Thanks, zhenzhen

paoloczi commented 4 years ago

Thank you. I suspect a problem in the Spoa library and filed an issue there. We will need an additional piece of information from you: the flags field for the processors in the failing system. You could use the following command to print that information:

grep flags /proc/cpuinfo | head -1

Depending on how long it takes to diagnose and fix the Spoa issue, we may have to downgrade Shasta back to an older version of Spoa, or perhaps just create for you a temporary Shasta build done with an older version of Spoa.
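
As a rough illustration of what the flags line is used for, the Python sketch below parses /proc/cpuinfo text and reports which instruction-set extensions from a given list are absent. The thread does not say which instruction caused the failure, so the flag names in the usage comment are only examples, and the function names `cpu_flags` and `missing_flags` are hypothetical.

```python
from typing import Iterable, List, Set


def cpu_flags(cpuinfo_text: str) -> Set[str]:
    """Return the flags of the first processor listed in /proc/cpuinfo text."""
    for line in cpuinfo_text.splitlines():
        key, sep, value = line.partition(":")
        # On x86 Linux the relevant line starts with the key "flags".
        if sep and key.strip() == "flags":
            return set(value.split())
    return set()


def missing_flags(cpuinfo_text: str, wanted: Iterable[str]) -> List[str]:
    """Return the wanted flags that the CPU does not advertise, sorted."""
    return sorted(set(wanted) - cpu_flags(cpuinfo_text))


# Example use on a Linux machine (the wanted flags are illustrative only):
# with open("/proc/cpuinfo") as f:
#     print(missing_flags(f.read(), ["sse4_1", "avx", "avx2"]))
```

An empty result only says that the listed extensions are advertised by the CPU; it does not prove that a particular binary will run.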

paoloczi commented 4 years ago

It turns out that the problem is on our side, not in the Spoa library. We will soon provide a fixed executable for you to test.

bagashe commented 4 years ago

@zhenzhenyang-psu : Could you please download the Shasta binary from https://github.com/bagashe/shasta/suites/827089314/artifacts/9192038 and test it out? Clicking on that link will download shasta-OldLinux.zip. You will need to unzip it to extract the shasta binary.

paoloczi commented 4 years ago

@zhenzhenyang-psu, once we know that this fixed executable works for you we will create a new Shasta release with the fix.

zhenzhenyang-psu commented 4 years ago

sounds great! let me try the new link first. thanks, zhenzhen

zhenzhenyang-psu commented 4 years ago

Hello Chan, this version seems to work well. See the output:

Total length of assembled sequence is 2124866786
N50 for assembly segments is 70142
2020-Jun-23 07:46:04.125433 writeGfa1 begins
2020-Jun-23 07:47:35.955894 writeGfa1 ends
2020-Jun-23 07:47:36.183595 writeGfa1BothStrands begins
2020-Jun-23 07:50:37.875018 writeGfa1BothStrands ends
2020-Jun-23 07:50:38.160028 writeFasta begins
2020-Jun-23 07:52:10.835241 writeFasta ends
2020-Jun-23 07:53:02.594301 Assembly time statistics:
    Elapsed seconds: 2801.99
    Elapsed minutes: 46.6998
    Elapsed hours: 0.778331
Average CPU utilization: 0.138033
This run used options "--memoryBacking 4K --memoryMode anonymous".
This could have resulted in performance degradation.
For full performance, use "--memoryBacking 2M --memoryMode filesystem" (root privilege via sudo required).
Therefore the results of this run should not be used for benchmarking purposes.
Shasta unreleased test build newer than release 0.5.0 at commit a319e3908c1a09fe82ddaa0dadbdfab63ccdf742

I don't understand this message: This run used options "--memoryBacking 4K --memoryMode anonymous". This could have resulted in performance degradation. For full performance, use "--memoryBacking 2M --memoryMode filesystem". I won't be able to get root privileges with sudo. So can the results be trusted? The assembled genome size looks good, but the N50 is about 5 times smaller than the one from wtdbg2, which is 379k.

thanks, zhenzhen

paoloczi commented 4 years ago

Great, thank you for letting us know that the Illegal instruction problem is fixed. We will create a new release with the fix.

The memory options only affect assembly performance (that is, speed), not assembly quality. If you don't have sudo access you can continue to work with the default memory options --memoryBacking 4K --memoryMode anonymous and assembly results will not be affected - just be aware that in that mode Shasta is running slower than what is possible on your system.

The assembly is fragmented because you used Shasta default parameters, which are optimized for coverage 60x. Based on our previous discussion in issue #156, you are operating at coverage around 10x. There are some suggestions in issues #156 and #7 on how to optimize assembly parameters for low coverage, but we have not had the time to pursue that.

bagashe commented 4 years ago

@zhenzhenyang-psu - We just released version 0.5.1 to include a fix for this issue. Thank you for bringing it to our attention and helping test the fix.

zhenzhenyang-psu commented 4 years ago

Sure, no problem. Thanks for resolving this issue for me. By the way, this is the brief assembly result with the default parameters at 10x coverage:
Total length of assembled sequence is 50,667,228
N50 for assembly segments is 23,775

With the optimized parameters for 10x coverage, the result is improved:
Total length of assembled sequence is 2124866786
N50 for assembly segments is 70142

And the commands are:
/public/home/yangzhzh/tools_zz/shasta --input /public/home/yangzhzh/projects/0_aiden_lab/5_Olga/7_broad-toothed_rat/1_raw_data/1_Promethion/fastq_pass/*.fastq \
    --threads 32 --MinHash.maxBucketSize 2 --MarkerGraph.minCoverage 2 --MarkerGraph.maxCoverage 20 --MarkerGraph.highCoverageThreshold 43 --MarkerGraph.edgeMarkerSkipThreshold 20 \
    --MinHash.minHashIterationCount 20 --MinHash.minFrequency 1 --Align.minAlignedMarkerCount 50

Thanks, zhenzhen
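
For readers comparing the N50 values quoted in this thread: N50 is commonly defined as the largest length L such that segments of length at least L together contain at least half of the total assembled sequence. The Python sketch below implements that standard definition. Whether Shasta computes its reported N50 in exactly this way is an assumption, and the function name `n50` is hypothetical.

```python
from typing import Iterable


def n50(lengths: Iterable[int]) -> int:
    """Return the N50 of a collection of segment lengths."""
    ordered = sorted(lengths, reverse=True)
    if not ordered:
        raise ValueError("n50 requires at least one segment length")
    total = sum(ordered)
    running = 0
    for length in ordered:
        running += length
        # Stop at the first segment where the running sum reaches half the total.
        if 2 * running >= total:
            return length
    raise AssertionError("unreachable: the running sum always reaches the total")


# Example with made-up lengths: the total is 20, and 6 + 5 = 11 covers half of it.
# n50([2, 3, 4, 5, 6]) returns 5
```

Because N50 depends on the total assembled length, two assemblies with very different totals are not directly comparable by N50 alone.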

paoloczi commented 4 years ago

Thanks for the info @zhenzhenyang-psu. This is progress, but certainly not a useful assembly. Can I ask why you insist on assembling at such low coverage? No matter what, the quality of the assembly will certainly be inferior to what you can achieve at a more standard 60x. The cost of obtaining additional coverage with nanopore data is not prohibitive, so increasing coverage seems the most logical option.

If assembling at low coverage is an important application, at some point we can look at making Shasta better for this mode of operation. So far, all of our efforts have been on optimizing assembly quality at coverage around 60x.

zhenzhenyang-psu commented 4 years ago

Hi Chan, the reason is that our collaborator generated only this much nanopore data. We also have another species with a genome size of 7 Gb, and sequencing it at 60x coverage would be much more expensive. It would be good to know the minimum coverage required for moderate assembly quality. Thanks, zhenzhen
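
To make the cost concern concrete, coverage is the total number of sequenced bases divided by the genome size, so a 7 Gb genome at 60x requires about 420 Gb of read data. A small Python sketch of that arithmetic follows. The function names are hypothetical, and the 20 Gb and 2 Gb figures in the usage comment are illustrative rather than taken from the thread.

```python
def coverage(total_read_bases: float, genome_size: float) -> float:
    """Average coverage: sequenced bases divided by genome size."""
    if genome_size <= 0:
        raise ValueError("genome_size must be positive")
    return total_read_bases / genome_size


def bases_needed(genome_size: float, target_coverage: float) -> float:
    """Sequenced bases required to reach a target average coverage."""
    return genome_size * target_coverage


# 7 Gb genome at 60x, as discussed above:
# bases_needed(7_000_000_000, 60) gives 420_000_000_000 bases (420 Gb)
# Illustrative only: 20 Gb of reads on a 2 Gb genome is 10x coverage.
# coverage(20_000_000_000, 2_000_000_000) gives 10.0
```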

paoloczi commented 4 years ago

Hopefully at some point we will find some time to investigate and improve lower coverage assemblies.

zhenzhenyang-psu commented 4 years ago

thanks!