pangenome / pggb

the pangenome graph builder
https://doi.org/10.1038/s41592-024-02430-3
MIT License

odgi::gfa_to_handle Command terminated by signal 11 #141

Closed egluckthaler closed 2 years ago

egluckthaler commented 2 years ago

Hi there, thanks for making this great software available. My error looks similar to, but distinct from, issue #139, so I figured I'd post it. I've been running pggb v0.1.3 (cloned from GitHub) on a collection of 361 50 Mb genomes, chromosome by chromosome. The pggb pipeline completed successfully for 18 of 21 chromosomes, but for the 3 remaining chromosomes I get the following error at an odgi::gfa_to_handle step, just after the second call of smoothxg::main:

[smoothxg::main] unchopping smoothed graph
[odgi::unchop] unchopped 9049 nodes into 3573 new nodes.
[smoothxg::main] smoothed graph length 119666901bp in 1811962 nodes
[smoothxg::main] writing smoothed graph to IPO323.chr_5.fasta.10111f1.8b7f2f6.f37f590.smooth.gfa
smoothxg -t 24 -T 24 -g IPO323.chr_5.fasta.10111f1.8b7f2f6.f37f590.smooth.1.gfa -w 4772059 -K -X 100 -d 2000 -I 0 -R 0 -j 100 -e 0 -l 13219 -p 1,19,39,3,81,1 -O 0.001 -Q Consensus_ -V -o IPO323.chr_5.fasta.10111f1.8b7f2f6.f37f590.smooth.gfa
312470.88s user 28337.46s system 1768% cpu 19266.12s total 145814260Kb max memory
[odgi::gfa_to_handle] building nodes: 0.00% @ 1.78e+02/s elapsed: 00:00:00:00 remain: 00:02:49:41
[odgi::gfa_to_handle] building nodes: 100.00% @ 8.34e+05/s elapsed: 00:00:00:02 remain: 00:00:00:00
[odgi::gfa_to_handle] building edges: 0.00% @ 9.89e+00/s elapsed: 00:00:00:00 remain: 02:22:31:44
[odgi::gfa_to_handle] building edges: 100.00% @ 6.84e+05/s elapsed: 00:00:00:03 remain: 00:00:00:00
[odgi::gfa_to_handle] building paths: 0.01% @ 3.90e+00/s elapsed: 00:00:00:00 remain: 00:00:44:16
[odgi::gfa_to_handle] building paths: 1.38% @ 2.82e+02/s elapsed: 00:00:00:00 remain: 00:00:00:36
[odgi::gfa_to_handle] building paths: 2.82% @ 3.86e+02/s elapsed: 00:00:00:00 remain: 00:00:00:26
[odgi::gfa_to_handle] building paths: 5.34% @ 5.47e+02/s elapsed: 00:00:00:01 remain: 00:00:00:17
Command terminated by signal 11
odgi build -t 24 -P -g IPO323.chr_5.fasta.10111f1.8b7f2f6.f37f590.smooth.fix.gfa -o - -O
19.24s user 1.13s system 228% cpu 8.89s total 2141432Kb max memory
warning [libhandlegraph]: Serialized object does not appear to match deserialzation type.
warning [libhandlegraph]: It is either an old version or in the wrong format.
warning [libhandlegraph]: Attempting to load it anyway. Future releases will reject it!
terminate called after throwing an instance of 'std::runtime_error'
what(): Error rewinding to load non-magic-prefixed SerializableHandleGraph
Command terminated by signal 6
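For readers decoding the log above: the two "Command terminated by signal N" lines report POSIX signal numbers. A minimal Python sketch, assuming a Linux host (the numbering is platform-dependent, and the helper name signal_name is only illustrative), that translates them with the standard library:

```python
import signal


def signal_name(number: int) -> str:
    """Return the symbolic name of a signal number on the current platform."""
    return signal.Signals(number).name


# The two signal numbers reported in the log (Linux numbering assumed).
for number in (11, 6):
    print(number, signal_name(number))
```

On Linux, 11 is SIGSEGV (a segmentation fault, here in odgi build) and 6 is SIGABRT, which fits the "terminate called after throwing an instance of 'std::runtime_error'" line that precedes it.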

Here is my command:

/home/emile/software/pggb/pggb -i IPO323.chr_5.fasta -s 50000 -k 700 -p 95 -n 361 -t 24 --exclude-delim '.' --normalize --vcf-spec IPO323:chr_5.ids

Any insight would be greatly appreciated. I've tried increasing the number of threads, but still get the same error. I've attached the log file edited to remove progress printouts to reduce its size: IPO323.chr_5.fasta.10111f1.8b7f2f6.f37f590.10-27-2021_08:24:36.log
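The post does not say how the progress printouts were removed from the log. One way to do it is sketched below, assuming progress lines follow the "[tool::step] label: NN.NN% @ rate/s elapsed: ... remain: ..." shape seen in the output above; the pattern and the function name strip_progress are illustrative, not part of pggb or odgi.

```python
import re

# Matches progress lines such as:
# [odgi::gfa_to_handle] building nodes: 100.00% @ 8.34e+05/s elapsed: 00:00:00:02 remain: 00:00:00:00
PROGRESS = re.compile(
    r"^\[[\w:]+\] .+: +\d+(\.\d+)?% @ \S+/s elapsed: \S+ remain: \S+\s*$"
)


def strip_progress(lines):
    """Return the input lines with progress-meter lines dropped."""
    return [line for line in lines if not PROGRESS.match(line)]


sample = [
    "[smoothxg::main] writing smoothed graph to out.smooth.gfa",
    "[odgi::gfa_to_handle] building nodes: 100.00% @ 8.34e+05/s elapsed: 00:00:00:02 remain: 00:00:00:00",
    "Command terminated by signal 11",
]
for line in strip_progress(sample):
    print(line)
```

Only lines that match the whole progress shape are dropped, so status messages and error lines from the same tools are kept.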

Thanks!

AndreaGuarracino commented 2 years ago

Hi @egluckthaler, it seems that an invalid GFA file is being produced for those chromosomes.

Any chance of sharing the smallest GFA among those that give problems? In the example you shared, it would be the IPO323.chr_5.fasta.10111f1.8b7f2f6.f37f590.smooth.fix.gfa file.
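Before sharing a file, a quick structural check can help narrow down whether a GFA is invalid in an obvious way. The sketch below is a rough Python check, assuming GFA v1 with tab-separated S, L and P lines; it only verifies that links and path steps refer to defined segments. It is not what odgi does internally, it may not catch the problem in this thread, and check_gfa is a hypothetical helper name.

```python
import re
from typing import Iterable, List

# A path step is a segment name followed by an orientation, e.g. "12+" or "7-".
STEP = re.compile(r"^(.+)([+-])$")


def check_gfa(lines: Iterable[str]) -> List[str]:
    """Return a list of problems found in GFA v1 text (empty if none)."""
    segments = set()
    pending = []   # (line number, record kind, referenced segment name)
    problems = []
    for line_no, raw in enumerate(lines, start=1):
        line = raw.rstrip("\n")
        if not line:
            continue
        fields = line.split("\t")
        kind = fields[0]
        if kind == "S":
            if len(fields) < 3:
                problems.append(f"line {line_no}: truncated S line")
                continue
            segments.add(fields[1])
        elif kind == "L":
            if len(fields) < 6:
                problems.append(f"line {line_no}: truncated L line")
                continue
            pending.append((line_no, "link", fields[1]))
            pending.append((line_no, "link", fields[3]))
        elif kind == "P":
            if len(fields) < 3:
                problems.append(f"line {line_no}: truncated P line")
                continue
            for step in fields[2].split(","):
                match = STEP.match(step)
                if match is None:
                    problems.append(f"line {line_no}: malformed path step {step!r}")
                else:
                    pending.append((line_no, "path", match.group(1)))
    # Resolve references last, so segments may be defined after their first use.
    for line_no, kind, name in pending:
        if name not in segments:
            problems.append(
                f"line {line_no}: {kind} references undefined segment {name!r}"
            )
    return problems


valid = "H\tVN:Z:1.0\nS\t1\tACGT\nS\t2\tGG\nL\t1\t+\t2\t+\t0M\nP\tx\t1+,2+\t*\n"
broken = valid + "P\ty\t1+,3-\t*\n"
print(check_gfa(valid.splitlines()))
print(check_gfa(broken.splitlines()))
```

For a real file, pass an open file handle instead of the in-memory sample; the function reads it line by line and keeps only segment names and references in memory.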

egluckthaler commented 2 years ago

Sure thing! Here is a link to the smallest file, hosted on Google Drive.

AndreaGuarracino commented 2 years ago

Thank you! That's strange because odgi build worked for me.

odgi build -g IPO323.chr_14.fasta.7c6d9e9.8b7f2f6.9fcb6c1.smooth.gfa -o IPO323.chr_14.fasta.7c6d9e9.8b7f2f6.9fcb6c1.smooth.og -t 16 -P
[odgi::gfa_to_handle] building nodes: 100.00% @ 7.64e+05/s elapsed: 00:00:00:00 remain: 00:00:00:00
[odgi::gfa_to_handle] building edges: 100.00% @ 1.02e+06/s elapsed: 00:00:00:00 remain: 00:00:00:00
[odgi::gfa_to_handle] building paths: 100.00% @ 2.43e+03/s elapsed: 00:00:00:02 remain: 00:00:00:00

The resulting graph looks a little crazy, though (see the image at the bottom of this message).

The offending graph is probably the one with the fix suffix. We had a similar problem in the past, which has been fixed in v0.2 of pggb, to be released soon.

Could you please try again after updating pggb to v0.2 (https://github.com/pangenome/pggb/pull/138)? Be sure to update all tools to the commit versions specified in https://github.com/pangenome/pggb/blob/v0.2-pre-release/Dockerfile.

Bonus image: odgi viz -i IPO323.chr_14.fasta.7c6d9e9.8b7f2f6.9fcb6c1.smooth.og -o IPO323.chr_14.fasta.7c6d9e9.8b7f2f6.9fcb6c1.smooth.png -a 1

[Image: odgi viz rendering of IPO323.chr_14.fasta.7c6d9e9.8b7f2f6.9fcb6c1.smooth]

egluckthaler commented 2 years ago

Great, I'll give that a try. Thanks so much for your help, and for the cool viz!

AndreaGuarracino commented 2 years ago

Hi @egluckthaler, any luck?