vgteam / vg

tools for working with genome variation graphs
https://biostars.org/tag/vg/

vg convert fails while converting gbz to xg with signal 6 error #4372

Open leleory opened 1 month ago

leleory commented 1 month ago

1. What were you trying to do? I want to convert a gbz pangenome graph (created using cactus-pangenome) into xg format.

vg convert -x canidpg.gbz > canidpg.xg

2. What did you want to happen? The expectation was to get an xg graph file.

3. What actually happened? vg crashed with:

ERROR: Signal 6 occurred. VG has crashed.

4. If you got a line like Stack trace path: /somewhere/on/your/computer/stacktrace.txt, please copy-paste the contents of that file here:

vg: /public/home/anovak/build/vg/include/mmmultimap.hpp:242: void mmmulti::map<Key, Value>::sort(int) [with Key = long unsigned int; Value = std::tuple<long unsigned int, long unsigned int, long unsigned int>]: Assertion `false' failed.
━━━━━━━━━━━━━━━━━━━━
Crash report for vg v1.58.0 "Cartari"
Stack trace (most recent call last):
#13   Object "/exports/cmvm/src/vg/v1.58.0/vg", at 0x61f764, in _start
#12   Object "/exports/cmvm/src/vg/v1.58.0/vg", at 0x20f7be6, in __libc_start_main
#11   Object "/exports/cmvm/src/vg/v1.58.0/vg", at 0x20f6349, in __libc_start_call_main
#10   Object "/exports/cmvm/src/vg/v1.58.0/vg", at 0xe14beb, in vg::subcommand::Subcommand::operator()(int, char**) const
#9    Object "/exports/cmvm/src/vg/v1.58.0/vg", at 0xcc13bb, in main_convert(int, char**)
#8    Object "/exports/cmvm/src/vg/v1.58.0/vg", at 0xcbe9d0, in graph_to_xg_adjusting_paths(handlegraph::PathHandleGraph const*, xg::XG*, std::unordered_set<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, std::hash<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > >, std::equal_to<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > >, std::allocator<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > > > const&, bool)
#7    Object "/exports/cmvm/src/vg/v1.58.0/vg", at 0x18c8fea, in xg::XG::from_enumerators(std::function<void (std::function<void (std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, long long const&)> const&)> const&, std::function<void (std::function<void (long long const&, bool const&, long long const&, bool const&)> const&)> const&, std::function<void (std::function<void (std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, long long const&, bool const&, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, bool const&, bool const&)> const&)> const&, bool, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >)
#6    Object "/exports/cmvm/src/vg/v1.58.0/vg", at 0x18c246e, in xg::XG::index_node_to_path(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&)
#5    Object "/exports/cmvm/src/vg/v1.58.0/vg", at 0x18fd126, in mmmulti::map<unsigned long, std::tuple<unsigned long, unsigned long, unsigned long> >::sort(int)
#4    Object "/exports/cmvm/src/vg/v1.58.0/vg", at 0x21087b5, in __assert_fail
#3    Object "/exports/cmvm/src/vg/v1.58.0/vg", at 0x5ee093, in __assert_fail_base.cold
#2    Object "/exports/cmvm/src/vg/v1.58.0/vg", at 0x5ee16b, in abort
#1    Object "/exports/cmvm/src/vg/v1.58.0/vg", at 0x210edc5, in raise
#0    Object "/exports/cmvm/src/vg/v1.58.0/vg", at 0x213b91c, in __pthread_kill
ERROR: Signal 6 occurred. VG has crashed. Visit https://github.com/vgteam/vg/issues/new/choose to report a bug.
Please include this entire error log in your bug report!
━━━━━━━━━━━━━━━━━━━━

5. What data and command can the vg dev team use to make the problem happen? I ran vg convert on both the canidpg.gbz and canidpg.d2.gbz output files from cactus-pangenome, and in both cases vg crashed with the same error message.

6. What does running vg version say? I got the same error using two different vg releases (v1.56.0 and v1.58.0)

vg version v1.56.0 "Collalto"
Compiled with g++ (Ubuntu 11.4.0-1ubuntu1~22.04) 11.4.0 on Linux
Linked against libstd++ 20230528
Built by anovak@courtyard.gi.ucsc.edu
vg version v1.58.0 "Cartari"
Compiled with g++ (Ubuntu 11.4.0-1ubuntu1~22.04) 11.4.0 on Linux
Linked against libstd++ 20230528
Built by anovak@courtyard.gi.ucsc.edu

Thank you, Lel

jltsiren commented 1 month ago

XG construction uses temporary files. The error message you got indicates that memory-mapping one of the files failed. The most likely reason is that you ran out of space in the temporary directory. You could try using another temporary directory to see if that solves the issue:

export TMPDIR=/somewhere/else
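
Before switching directories, it can help to see how much space the current temporary directory actually has. The following is a minimal sketch, not part of vg: `free_gb` is a hypothetical helper name, and the fallback to `/tmp` when `TMPDIR` is unset is an assumption about the default location.

```shell
# Hypothetical helper: print the free space, in whole GB, on the filesystem
# that holds the given directory.
free_gb() {
    # df -Pk prints one POSIX-format line per filesystem in 1024-byte blocks;
    # the fourth column is the available space.
    df -Pk "$1" | awk 'NR == 2 { printf "%d\n", $4 / 1024 / 1024 }'
}

# Assumption: /tmp is used when TMPDIR is not set.
tmp_dir="${TMPDIR:-/tmp}"
echo "Temporary directory: $tmp_dir ($(free_gb "$tmp_dir") GB free)"

# /somewhere/else is the placeholder path from the suggestion above;
# replace it with a real directory on a larger filesystem before uncommenting.
# export TMPDIR=/somewhere/else
```

Running the same check against the candidate directory (for example `free_gb /somewhere/else`) shows whether the move would actually provide more room.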

leleory commented 4 weeks ago

Thanks for the suggestion, @jltsiren.

Unfortunately, the problem is not a lack of temporary storage. There is 6 TB of storage available to vg, and the most it uses before the crash is below 100 GB.

It is not a memory problem either. At the time of the crash, vg is using only 120 GB of the 300 GB of RAM.

The gbz also seems to be valid; at least vg validate says it is.

I am not sure how to proceed or what else to check to find the cause of the problem.

jltsiren commented 4 weeks ago

Can you share the graph?

leleory commented 4 weeks ago

What is the best way to share it? The gbz file is 9.3GB.

jltsiren commented 4 weeks ago

What options do you have available?

leleory commented 3 weeks ago

If you can give me an e-mail address I can send you a link, if that is OK.

jltsiren commented 3 weeks ago

You can send it to jlsiren@ucsc.edu.

jltsiren commented 3 weeks ago

I managed to build a full XG for the graph you sent. The *.node_path.mm file (which was the cause of the crash you reported) took 309 GB. The construction took several hours, and peak memory usage was ~300 GB, plus whatever was available for caching the memory-mapped file. The size of the final XG was 139 GB.

If you only need an XG with reference paths but no haplotype paths, you can build that by adding option --drop-haplotypes to the vg convert command. That should finish quickly, and the size of the XG should be around 16 GB.
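
As a rough sketch, the reference-only conversion could look like the following. The input name comes from the original report and `--drop-haplotypes` is the option named above; the output file name `canidpg.ref.xg` and the wrapper function `build_ref_xg` are hypothetical choices made for illustration.

```shell
# Input file from the original report; the output name is a hypothetical choice.
GBZ=canidpg.gbz
XG=canidpg.ref.xg

# Hypothetical wrapper: convert a GBZ to an XG that keeps reference paths
# but drops haplotype paths.
build_ref_xg() {
    vg convert -x --drop-haplotypes "$1" > "$2"
}

if command -v vg >/dev/null 2>&1 && [ -f "$GBZ" ]; then
    build_ref_xg "$GBZ" "$XG"
else
    # Dry run: only print the command when vg or the input file is missing.
    echo "would run: vg convert -x --drop-haplotypes $GBZ > $XG"
fi
```

The guard prints the command instead of running it when vg or the input is unavailable, which makes it easy to check the invocation before committing to a long job.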