vgteam / vg

tools for working with genome variation graphs

vg giraffe - bad_alloc() when forming GBZ index #3957

Closed asherrar closed 1 year ago

asherrar commented 1 year ago

1. What were you trying to do? Trying to replicate the Giraffe pipeline from "A draft human pangenome reference" using the CHM13 pangenome reference.

2. What did you want to happen? For vg giraffe to run successfully.

3. What actually happened?

Preparing Indexes
[IndexRegistry]: Combining Giraffe GBWT and GBWTGraph into GBZ.
terminate called after throwing an instance of 'std::bad_alloc'
  what():  std::bad_alloc
ERROR: Signal 6 occurred. VG has crashed. Visit https://github.com/vgteam/vg/issues/new/choose to report a bug.
Stack trace path: /tmp/vg_crash_FHUQg2/stacktrace.txt
Please include the stack trace file in your bug report!

4. If you got a line like Stack trace path: /somewhere/on/your/computer/stacktrace.txt, please copy-paste the contents of that file here:

#28   Object "/vg/bin/vg", at 0x5efbad, in _start
#27   Object "/vg/bin/vg", at 0x1ee792f, in __libc_start_main
#26   Object "/vg/bin/vg", at 0x5bfa7e, in main
#25   Object "/vg/bin/vg", at 0xd5a85b, in vg::subcommand::Subcommand::operator()(int, char**) const
#24   Object "/vg/bin/vg", at 0xcb9fd5, in main_giraffe(int, char**)
#23   Object "/vg/bin/vg", at 0x126a00e, in vg::IndexRegistry::make_indexes(std::vector<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, std::allocator<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > > > const&)
#22   Object "/vg/bin/vg", at 0x1254ec8, in vg::IndexRegistry::execute_recipe(std::pair<std::set<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, std::less<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > >, std::allocator<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > > >, unsigned long> const&, vg::IndexingPlan const*, vg::AliasGraph&)
#21   Object "/vg/bin/vg", at 0x1246f0d, in std::_Function_handler<std::vector<std::vector<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, std::allocator<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > > >, std::allocator<std::vector<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, std::allocator<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > > > > > (std::vector<vg::IndexFile const*, std::allocator<vg::IndexFile const*> > const&, vg::IndexingPlan const*, vg::AliasGraph&, std::set<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, std::less<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > >, std::allocator<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > > > const&), vg::VGIndexes::get_vg_index_registry()::{lambda(std::vector<vg::IndexFile const*, std::allocator<vg::IndexFile const*> > const&, vg::IndexingPlan const*, vg::AliasGraph&, std::set<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, std::less<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > >, std::allocator<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > > > const&)#55}>::_M_invoke(std::_Any_data const&, std::vector<vg::IndexFile const*, std::allocator<vg::IndexFile const*> > const&, vg::IndexingPlan const*&&, vg::AliasGraph&, std::set<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, std::less<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > >, std::allocator<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > > > const&)
#20   Object "/vg/bin/vg", at 0x1246cf0, in vg::VGIndexes::get_vg_index_registry()::{lambda(std::vector<vg::IndexFile const*, std::allocator<vg::IndexFile const*> > const&, vg::IndexingPlan const*, vg::AliasGraph&, std::set<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, std::less<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > >, std::allocator<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > > > const&)#55}::operator()(std::vector<vg::IndexFile const*, std::allocator<vg::IndexFile const*> > const&, vg::IndexingPlan const*, vg::AliasGraph&, std::set<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, std::less<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > >, std::allocator<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > > > const&) const [clone .isra.0]
#19   Object "/vg/bin/vg", at 0x12371ea, in vg::load_gbz(gbwtgraph::GBZ&, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, bool)
#18   Object "/vg/bin/vg", at 0x1237032, in vg::load_gbwtgraph(gbwtgraph::GBWTGraph&, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, bool)
#17   Object "/vg/bin/vg", at 0x1238618, in std::unique_ptr<gbwtgraph::GBWTGraph, std::default_delete<gbwtgraph::GBWTGraph> > vg::io::VPKG::load_one<gbwtgraph::GBWTGraph>(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&)
#16   Object "/vg/bin/vg", at 0xd312be, in std::_Function_handler<void (std::istream&), vg::io::VPKG::try_load_one<gbwtgraph::GBWTGraph>(std::istream&, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&)::{lambda(std::istream&)#1}>::_M_invoke(std::_Any_data const&, std::istream&)
#15   Object "/vg/bin/vg", at 0x880b7a, in vg::io::VPKG::with_putback(std::istream&, std::function<void (std::istream&)> const&)
#14   Object "/vg/bin/vg", at 0xd32706, in std::_Function_handler<void (std::istream&), vg::io::VPKG::try_load_bare<gbwtgraph::GBWTGraph>(std::istream&, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&)::{lambda(std::istream&)#1}>::_M_invoke(std::_Any_data const&, std::istream&)
#13   Object "/vg/bin/vg", at 0x141ddcf, in std::_Function_handler<void* (std::istream&), vg::io::register_loader_saver_gbwtgraph()::{lambda(std::istream&)#1}>::_M_invoke(std::_Any_data const&, std::istream&)
#12   Object "/vg/bin/vg", at 0x1835fee, in handlegraph::Serializable::deserialize(std::istream&)
#11   Object "/vg/bin/vg", at 0x1548b0b, in gbwtgraph::GBWTGraph::deserialize_members(std::istream&)
#10   Object "/vg/bin/vg", at 0x15e4bc9, in gbwt::StringArray::load(std::istream&)
#9    Object "/vg/bin/vg", at 0x15f4e0d, in void gbwt::loadVector<char>(std::vector<char, std::allocator<char> >&, std::istream&)
#8    Object "/vg/bin/vg", at 0x15f4b32, in std::vector<char, std::allocator<char> >::_M_default_append(unsigned long)
#7    Object "/vg/bin/vg", at 0x1b83f0d, in handleOOM(unsigned long, bool)
#6    Object "/vg/bin/vg", at 0x5be2d5, in std::__throw_bad_alloc()
#5    Object "/vg/bin/vg", at 0x1e239a8, in __cxa_throw
#4    Object "/vg/bin/vg", at 0x1e23846, in std::terminate()
#3    Object "/vg/bin/vg", at 0x1e237db, in __cxxabiv1::__terminate(void (*)())
#2    Object "/vg/bin/vg", at 0x5bc60a, in __gnu_cxx::__verbose_terminate_handler() [clone .cold]
#1    Object "/vg/bin/vg", at 0x5befa7, in abort
#0    Object "/vg/bin/vg", at 0x14a4c5b, in raise

5. What data and command can the vg dev team use to make the problem happen?

singularity exec -B /home -B /scratch /home/asherrar/tools/vg_1.47.0.sif vg giraffe -p -x /scratch/asherrar/pangenome_mc/hprc-v1.0-mc-chm13.xg -g /scratch/asherrar/pangenome_mc/hprc-v1.0-mc-chm13.gg -H /scratch/asherrar/pangenome_mc/hprc-v1.0-mc-chm13.gbwt -m /scratch/asherrar/pangenome_mc/hprc-v1.0-mc-chm13.min -d /scratch/asherrar/pangenome_mc/hprc-v1.0-mc-chm13.dist -t 16 -f /scratch/asherrar/hg002_shortreads/HG002_R1.fq.gz -f /scratch/asherrar/hg002_shortreads/HG002_R2.fq.gz

Attempted to run via SLURM and Singularity, providing 16 CPU cores and several memory values up to 400 GB. The command itself mirrors what the paper did (though they used Giraffe 1.39.0).

All files except the reads were pulled from the HPRC pangenome data freeze. Reads are Illumina 2x150 paired-end ~45x coverage short reads pulled from GIAB.

6. What does running vg version say?

vg version v1.47.0 "Ostuni"
Compiled with g++ (Ubuntu 9.4.0-1ubuntu1~20.04.1) 9.4.0 on Linux
Linked against libstd++ 20210601
Built by root@buildkitsandbox

Testing with 1.48.0 next just to see if it makes a difference - just happened to have 1.47.0 installed already.

asherrar commented 1 year ago

1.48.0 provided basically the same error:

Preparing Indexes
[IndexRegistry]: Combining Giraffe GBWT and GBWTGraph into GBZ.
terminate called after throwing an instance of 'std::bad_alloc'
  what():  std::bad_alloc
━━━━━━━━━━━━━━━━━━━━
Crash report for vg v1.48.0 "Gallipoli"
Stack trace (most recent call last):
#28   Object "/vg/bin/vg", at 0x5f0e1d, in _start
#27   Object "/vg/bin/vg", at 0x1ed4bcf, in __libc_start_main
#26   Object "/vg/bin/vg", at 0x5c0abe, in main
#25   Object "/vg/bin/vg", at 0xd2f24b, in vg::subcommand::Subcommand::operator()(int, char**) const
#24   Object "/vg/bin/vg", at 0xd70055, in main_giraffe(int, char**)
#23   Object "/vg/bin/vg", at 0x12d29de, in vg::IndexRegistry::make_indexes(std::vector<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, std::allocator<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > > > const&)
#22   Object "/vg/bin/vg", at 0x12bd698, in vg::IndexRegistry::execute_recipe(std::pair<std::set<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, std::less<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > >, std::allocator<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > > >, unsigned long> const&, vg::IndexingPlan const*, vg::AliasGraph&)
#21   Object "/vg/bin/vg", at 0x12af6dd, in std::_Function_handler<std::vector<std::vector<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, std::allocator<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > > >, std::allocator<std::vector<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, std::allocator<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > > > > > (std::vector<vg::IndexFile const*, std::allocator<vg::IndexFile const*> > const&, vg::IndexingPlan const*, vg::AliasGraph&, std::set<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, std::less<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > >, std::allocator<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > > > const&), vg::VGIndexes::get_vg_index_registry()::{lambda(std::vector<vg::IndexFile const*, std::allocator<vg::IndexFile const*> > const&, vg::IndexingPlan const*, vg::AliasGraph&, std::set<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, std::less<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > >, std::allocator<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > > > const&)#56}>::_M_invoke(std::_Any_data const&, std::vector<vg::IndexFile const*, std::allocator<vg::IndexFile const*> > const&, vg::IndexingPlan const*&&, vg::AliasGraph&, std::set<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, std::less<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > >, std::allocator<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > > > const&)
#20   Object "/vg/bin/vg", at 0x12af4c0, in vg::VGIndexes::get_vg_index_registry()::{lambda(std::vector<vg::IndexFile const*, std::allocator<vg::IndexFile const*> > const&, vg::IndexingPlan const*, vg::AliasGraph&, std::set<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, std::less<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > >, std::allocator<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > > > const&)#56}::operator()(std::vector<vg::IndexFile const*, std::allocator<vg::IndexFile const*> > const&, vg::IndexingPlan const*, vg::AliasGraph&, std::set<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, std::less<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > >, std::allocator<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > > > const&) const [clone .isra.0]
#19   Object "/vg/bin/vg", at 0x129e95a, in vg::load_gbz(gbwtgraph::GBZ&, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, bool)
#18   Object "/vg/bin/vg", at 0x129e7a2, in vg::load_gbwtgraph(gbwtgraph::GBWTGraph&, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, bool)
#17   Object "/vg/bin/vg", at 0x129fd88, in std::unique_ptr<gbwtgraph::GBWTGraph, std::default_delete<gbwtgraph::GBWTGraph> > vg::io::VPKG::load_one<gbwtgraph::GBWTGraph>(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&)
#16   Object "/vg/bin/vg", at 0xdc013e, in std::_Function_handler<void (std::istream&), vg::io::VPKG::try_load_one<gbwtgraph::GBWTGraph>(std::istream&, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&)::{lambda(std::istream&)#1}>::_M_invoke(std::_Any_data const&, std::istream&)
#15   Object "/vg/bin/vg", at 0x80883a, in vg::io::VPKG::with_putback(std::istream&, std::function<void (std::istream&)> const&)
#14   Object "/vg/bin/vg", at 0xc8a0a6, in std::_Function_handler<void (std::istream&), vg::io::VPKG::try_load_bare<gbwtgraph::GBWTGraph>(std::istream&, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&)::{lambda(std::istream&)#1}>::_M_invoke(std::_Any_data const&, std::istream&)
#13   Object "/vg/bin/vg", at 0x143975f, in std::_Function_handler<void* (std::istream&), vg::io::register_loader_saver_gbwtgraph()::{lambda(std::istream&)#1}>::_M_invoke(std::_Any_data const&, std::istream&)
#12   Object "/vg/bin/vg", at 0x18243ee, in handlegraph::Serializable::deserialize(std::istream&)
#11   Object "/vg/bin/vg", at 0x1538ecb, in gbwtgraph::GBWTGraph::deserialize_members(std::istream&)
#10   Object "/vg/bin/vg", at 0x15d4f89, in gbwt::StringArray::load(std::istream&)
#9    Object "/vg/bin/vg", at 0x15e51cd, in void gbwt::loadVector<char>(std::vector<char, std::allocator<char> >&, std::istream&)
#8    Object "/vg/bin/vg", at 0x15e4ef2, in std::vector<char, std::allocator<char> >::_M_default_append(unsigned long)
#7    Object "/vg/bin/vg", at 0x1b71d9d, in handleOOM(unsigned long, bool)
#6    Object "/vg/bin/vg", at 0x5bf315, in std::__throw_bad_alloc()
#5    Object "/vg/bin/vg", at 0x1e10c48, in __cxa_throw
#4    Object "/vg/bin/vg", at 0x1e10ae6, in std::terminate()
#3    Object "/vg/bin/vg", at 0x1e10a7b, in __cxxabiv1::__terminate(void (*)())
#2    Object "/vg/bin/vg", at 0x5bd64a, in __gnu_cxx::__verbose_terminate_handler() [clone .cold]
#1    Object "/vg/bin/vg", at 0x5bffe7, in abort
#0    Object "/vg/bin/vg", at 0x149514b, in raise
ERROR: Signal 6 occurred. VG has crashed. Visit https://github.com/vgteam/vg/issues/new/choose to report a bug.
Please include this entire error log in your bug report!
━━━━━━━━━━━━━━━━━━━━

Edit: Grabbed 1.39.0, the version they mention in the paper, and even that doesn't seem to work with 240 GB memory. All I can think of is something about the CHM13 index files is wrong, but I don't know where to start to figure that out.

jltsiren commented 1 year ago

It could be that your .gg file is corrupted. It should be the following:

% md5sum hprc-v1.0-mc-chm13.gg
84b7f67f8e1c653c9f0209d34721f9ef  hprc-v1.0-mc-chm13.gg

The crash happens due to an out-of-memory error while loading a byte vector from disk, so the most likely explanation is that the length of the vector read from the file is incorrect.
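To illustrate the failure mode described here, the following is a minimal Python sketch of reading a length-prefixed byte vector. It is not the GBWT on-disk format: the 8-byte little-endian length prefix, the function name `load_byte_vector`, and the `remaining_bytes` parameter are all assumptions made for illustration. The point is only that a length field that does not match the file leads to a huge allocation request unless it is checked first.

```python
import io
import struct


def load_byte_vector(stream, remaining_bytes):
    """Read a vector stored as <8-byte length><payload> (hypothetical layout).

    remaining_bytes is how many bytes are left in the file, including the
    header. A length larger than that cannot be valid, so we refuse it
    instead of trying to allocate it.
    """
    header = stream.read(8)
    if len(header) != 8:
        raise ValueError("truncated length header")
    (length,) = struct.unpack("<Q", header)
    if length > remaining_bytes - 8:
        raise ValueError(
            f"vector claims {length} bytes but only "
            f"{remaining_bytes - 8} bytes remain in the file"
        )
    data = stream.read(length)
    if len(data) != length:
        raise ValueError("truncated payload")
    return data


# A well-formed record and one whose length field is garbage.
good = struct.pack("<Q", 4) + b"ACGT"
bad = struct.pack("<Q", 2**60) + b"ACGT"
```

Without the length check, the `bad` record would ask for 2**60 bytes, which is the kind of request that surfaces as `std::bad_alloc` in a C++ loader.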

asherrar commented 1 year ago

Got the same hash for the same file.

Edit: Here's the md5 for all the relevant files from the HPRC freeze:

84b7f67f8e1c653c9f0209d34721f9ef  pangenome_mc/hprc-v1.0-mc-chm13.gg
ca6f3028efecfc8aed3f807004746723  pangenome_mc/hprc-v1.0-mc-chm13.gbwt
ba65c40605494308773e2b97407cc74c  pangenome_mc/hprc-v1.0-mc-chm13.xg
907fc1fda27a0c5aabcb1f22010cc71c  pangenome_mc/hprc-v1.0-mc-chm13.min
b90a2fefe83993948a9065a37c9dd9b1  pangenome_mc/hprc-v1.0-mc-chm13.dist

jltsiren commented 1 year ago

All the index files appear to be correct. Do you get the same hashes when you run md5sum the same way you are running vg giraffe?
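One way to run that check from inside the same job environment is a small Python script, assuming a Python interpreter is available wherever the job runs (that is an assumption, not something stated in this thread). The helper names `md5_of_file` and `find_mismatches` are hypothetical; the hashes in `EXPECTED` are the ones posted above.

```python
import hashlib


def md5_of_file(path, chunk_size=1 << 20):
    """Compute the MD5 of a file in 1 MiB chunks so large indexes fit in memory."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def find_mismatches(expected):
    """Return {path: actual_md5} for every file whose hash differs from expected."""
    mismatches = {}
    for path, want in expected.items():
        actual = md5_of_file(path)
        if actual != want:
            mismatches[path] = actual
    return mismatches


# Hashes copied from the list posted earlier in this thread.
EXPECTED = {
    "pangenome_mc/hprc-v1.0-mc-chm13.gg": "84b7f67f8e1c653c9f0209d34721f9ef",
    "pangenome_mc/hprc-v1.0-mc-chm13.gbwt": "ca6f3028efecfc8aed3f807004746723",
    "pangenome_mc/hprc-v1.0-mc-chm13.xg": "ba65c40605494308773e2b97407cc74c",
    "pangenome_mc/hprc-v1.0-mc-chm13.min": "907fc1fda27a0c5aabcb1f22010cc71c",
    "pangenome_mc/hprc-v1.0-mc-chm13.dist": "b90a2fefe83993948a9065a37c9dd9b1",
}
```

Calling `find_mismatches(EXPECTED)` from the directory that contains `pangenome_mc/` returns an empty dict when every file matches.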

The part where you get the crash should take ~14 GB memory, and the entire Giraffe run with those indexes should take ~65 GB. Are you sure you have enough memory on the system you are running the command?
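A quick way to see what memory limits a process actually inherits is to print its resource limits from inside the job. The sketch below uses Python's standard `resource` module (Unix only); the helper names are hypothetical. It only reports per-process limits, so it will not show scheduler or cgroup limits, and nothing in this thread confirms that such a limit is the cause here.

```python
import resource


def format_limit(value):
    """Render a limit in GiB, or 'unlimited' when no limit is set."""
    if value == resource.RLIM_INFINITY:
        return "unlimited"
    return f"{value / 2**30:.1f} GiB"


def memory_limits():
    """Return {limit_name: (soft, hard)} for the memory-related rlimits."""
    limits = {}
    for name in ("RLIMIT_AS", "RLIMIT_DATA", "RLIMIT_RSS"):
        if not hasattr(resource, name):
            continue  # not every platform defines every limit
        soft, hard = resource.getrlimit(getattr(resource, name))
        limits[name] = (format_limit(soft), format_limit(hard))
    return limits


if __name__ == "__main__":
    for name, (soft, hard) in memory_limits().items():
        print(f"{name}: soft={soft} hard={hard}")
```

Running the same script in each environment where vg is launched makes it easy to compare them; an address-space limit well below the ~14 GB needed at the crash point would explain a `bad_alloc` regardless of how much memory the scheduler granted.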

asherrar commented 1 year ago

Yes, I'm getting the same hashes. And as far as I can tell, it should have the memory properly allocated - I was running it as a SLURM job interactively via salloc, and I just tested it again with 16 cores and 120 GB with the same error popping up. I'm trying again with a batch job rather than an interactive one just in case there's something weird with the allocation, as well as having it check the hashes again that way.

Side thought, is it possible that the way I'm using vg via Singularity could be part of the issue?

asherrar commented 1 year ago

...okay, it looks like it's just some quirk of the architecture on the computing cluster I'm using: as soon as I ran it via sbatch instead of salloc, it worked. This makes zero sense. Sorry for the hassle.