Closed: asherrar closed this issue 1 year ago
vg 1.48.0 produced essentially the same error:
Preparing Indexes
[IndexRegistry]: Combining Giraffe GBWT and GBWTGraph into GBZ.
terminate called after throwing an instance of 'std::bad_alloc'
what(): std::bad_alloc
━━━━━━━━━━━━━━━━━━━━
Crash report for vg v1.48.0 "Gallipoli"
Stack trace (most recent call last):
#28 Object "/vg/bin/vg", at 0x5f0e1d, in _start
#27 Object "/vg/bin/vg", at 0x1ed4bcf, in __libc_start_main
#26 Object "/vg/bin/vg", at 0x5c0abe, in main
#25 Object "/vg/bin/vg", at 0xd2f24b, in vg::subcommand::Subcommand::operator()(int, char**) const
#24 Object "/vg/bin/vg", at 0xd70055, in main_giraffe(int, char**)
#23 Object "/vg/bin/vg", at 0x12d29de, in vg::IndexRegistry::make_indexes(std::vector<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, std::allocator<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > > > const&)
#22 Object "/vg/bin/vg", at 0x12bd698, in vg::IndexRegistry::execute_recipe(std::pair<std::set<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, std::less<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > >, std::allocator<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > > >, unsigned long> const&, vg::IndexingPlan const*, vg::AliasGraph&)
#21 Object "/vg/bin/vg", at 0x12af6dd, in std::_Function_handler<std::vector<std::vector<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, std::allocator<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > > >, std::allocator<std::vector<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, std::allocator<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > > > > > (std::vector<vg::IndexFile const*, std::allocator<vg::IndexFile const*> > const&, vg::IndexingPlan const*, vg::AliasGraph&, std::set<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, std::less<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > >, std::allocator<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > > > const&), vg::VGIndexes::get_vg_index_registry()::{lambda(std::vector<vg::IndexFile const*, std::allocator<vg::IndexFile const*> > const&, vg::IndexingPlan const*, vg::AliasGraph&, std::set<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, std::less<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > >, std::allocator<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > > > const&)#56}>::_M_invoke(std::_Any_data const&, std::vector<vg::IndexFile const*, std::allocator<vg::IndexFile const*> > const&, vg::IndexingPlan const*&&, vg::AliasGraph&, std::set<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, std::less<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > >, std::allocator<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > > > const&)
#20 Object "/vg/bin/vg", at 0x12af4c0, in vg::VGIndexes::get_vg_index_registry()::{lambda(std::vector<vg::IndexFile const*, std::allocator<vg::IndexFile const*> > const&, vg::IndexingPlan const*, vg::AliasGraph&, std::set<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, std::less<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > >, std::allocator<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > > > const&)#56}::operator()(std::vector<vg::IndexFile const*, std::allocator<vg::IndexFile const*> > const&, vg::IndexingPlan const*, vg::AliasGraph&, std::set<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, std::less<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > >, std::allocator<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > > > const&) const [clone .isra.0]
#19 Object "/vg/bin/vg", at 0x129e95a, in vg::load_gbz(gbwtgraph::GBZ&, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, bool)
#18 Object "/vg/bin/vg", at 0x129e7a2, in vg::load_gbwtgraph(gbwtgraph::GBWTGraph&, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, bool)
#17 Object "/vg/bin/vg", at 0x129fd88, in std::unique_ptr<gbwtgraph::GBWTGraph, std::default_delete<gbwtgraph::GBWTGraph> > vg::io::VPKG::load_one<gbwtgraph::GBWTGraph>(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&)
#16 Object "/vg/bin/vg", at 0xdc013e, in std::_Function_handler<void (std::istream&), vg::io::VPKG::try_load_one<gbwtgraph::GBWTGraph>(std::istream&, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&)::{lambda(std::istream&)#1}>::_M_invoke(std::_Any_data const&, std::istream&)
#15 Object "/vg/bin/vg", at 0x80883a, in vg::io::VPKG::with_putback(std::istream&, std::function<void (std::istream&)> const&)
#14 Object "/vg/bin/vg", at 0xc8a0a6, in std::_Function_handler<void (std::istream&), vg::io::VPKG::try_load_bare<gbwtgraph::GBWTGraph>(std::istream&, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&)::{lambda(std::istream&)#1}>::_M_invoke(std::_Any_data const&, std::istream&)
#13 Object "/vg/bin/vg", at 0x143975f, in std::_Function_handler<void* (std::istream&), vg::io::register_loader_saver_gbwtgraph()::{lambda(std::istream&)#1}>::_M_invoke(std::_Any_data const&, std::istream&)
#12 Object "/vg/bin/vg", at 0x18243ee, in handlegraph::Serializable::deserialize(std::istream&)
#11 Object "/vg/bin/vg", at 0x1538ecb, in gbwtgraph::GBWTGraph::deserialize_members(std::istream&)
#10 Object "/vg/bin/vg", at 0x15d4f89, in gbwt::StringArray::load(std::istream&)
#9 Object "/vg/bin/vg", at 0x15e51cd, in void gbwt::loadVector<char>(std::vector<char, std::allocator<char> >&, std::istream&)
#8 Object "/vg/bin/vg", at 0x15e4ef2, in std::vector<char, std::allocator<char> >::_M_default_append(unsigned long)
#7 Object "/vg/bin/vg", at 0x1b71d9d, in handleOOM(unsigned long, bool)
#6 Object "/vg/bin/vg", at 0x5bf315, in std::__throw_bad_alloc()
#5 Object "/vg/bin/vg", at 0x1e10c48, in __cxa_throw
#4 Object "/vg/bin/vg", at 0x1e10ae6, in std::terminate()
#3 Object "/vg/bin/vg", at 0x1e10a7b, in __cxxabiv1::__terminate(void (*)())
#2 Object "/vg/bin/vg", at 0x5bd64a, in __gnu_cxx::__verbose_terminate_handler() [clone .cold]
#1 Object "/vg/bin/vg", at 0x5bffe7, in abort
#0 Object "/vg/bin/vg", at 0x149514b, in raise
ERROR: Signal 6 occurred. VG has crashed. Visit https://github.com/vgteam/vg/issues/new/choose to report a bug.
Please include this entire error log in your bug report!
━━━━━━━━━━━━━━━━━━━━
Edit: Grabbed 1.39.0, the version they mention in the paper, and even that doesn't seem to work with 240 GB of memory. All I can think of is that something about the CHM13 index files is wrong, but I don't know where to start to figure that out.
It could be that your .gg file is corrupted. Its MD5 checksum should be the following:
% md5sum hprc-v1.0-mc-chm13.gg
84b7f67f8e1c653c9f0209d34721f9ef hprc-v1.0-mc-chm13.gg
The crash is an out-of-memory error that occurs while loading a byte vector from disk, so the most likely explanation is that the vector length read from the file is incorrect.
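To illustrate the failure mode described above, here is a minimal sketch of a loader for a length-prefixed byte vector. The on-disk layout (an 8-byte little-endian length followed by the payload) and the function name `load_byte_vector` are assumptions made for illustration; this is not the actual GBWT serialization format or code. The stack trace only shows that `gbwt::loadVector<char>` resizes a `std::vector<char>` before the allocation fails.

```python
import io
import struct


def load_byte_vector(stream, file_size):
    """Read a byte vector stored as <8-byte length><payload>.

    Hypothetical layout for illustration only, not the GBWT format.
    """
    header = stream.read(8)
    if len(header) != 8:
        raise ValueError("truncated length header")
    (length,) = struct.unpack("<Q", header)

    # A loader that trusts `length` blindly would try to allocate that many
    # bytes. If the header is garbage, the request can be far larger than
    # any machine's memory, and the allocation fails regardless of how much
    # RAM the job was given.
    remaining = file_size - stream.tell()
    if length > remaining:
        raise ValueError(
            f"declared length {length} exceeds the {remaining} bytes left in the file"
        )

    payload = stream.read(length)
    if len(payload) != length:
        raise ValueError("truncated payload")
    return payload
```

With a sanity check like this, a corrupted length shows up as a clear format error rather than as `std::bad_alloc`. In this thread the checksums later matched, so the files themselves turned out not to be the problem.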
Got the same hash for the same file.
Edit: Here's the md5 for all the relevant files from the HPRC freeze:
84b7f67f8e1c653c9f0209d34721f9ef pangenome_mc/hprc-v1.0-mc-chm13.gg
ca6f3028efecfc8aed3f807004746723 pangenome_mc/hprc-v1.0-mc-chm13.gbwt
ba65c40605494308773e2b97407cc74c pangenome_mc/hprc-v1.0-mc-chm13.xg
907fc1fda27a0c5aabcb1f22010cc71c pangenome_mc/hprc-v1.0-mc-chm13.min
b90a2fefe83993948a9065a37c9dd9b1 pangenome_mc/hprc-v1.0-mc-chm13.dist
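The same check can be scripted so that it runs inside the exact environment (container, job allocation) used for vg giraffe. The following is a minimal sketch: the expected digests are the ones posted in this thread, while the function names and the idea of passing a directory are illustrative assumptions.

```python
import hashlib
from pathlib import Path

# MD5 digests as posted in this thread for the HPRC CHM13 index files.
EXPECTED = {
    "hprc-v1.0-mc-chm13.gg": "84b7f67f8e1c653c9f0209d34721f9ef",
    "hprc-v1.0-mc-chm13.gbwt": "ca6f3028efecfc8aed3f807004746723",
    "hprc-v1.0-mc-chm13.xg": "ba65c40605494308773e2b97407cc74c",
    "hprc-v1.0-mc-chm13.min": "907fc1fda27a0c5aabcb1f22010cc71c",
    "hprc-v1.0-mc-chm13.dist": "b90a2fefe83993948a9065a37c9dd9b1",
}


def md5sum(path, chunk_size=1 << 20):
    """Hash a file in chunks so multi-gigabyte indexes never sit in memory."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify(directory, expected=EXPECTED):
    """Return a {filename: "ok" | "MISMATCH" | "missing"} report."""
    report = {}
    for name, want in expected.items():
        path = Path(directory) / name
        if not path.is_file():
            report[name] = "missing"
        else:
            report[name] = "ok" if md5sum(path) == want else "MISMATCH"
    return report
```

Calling `verify("pangenome_mc")` from within the same Singularity container and SLURM allocation would show whether the files look identical from where vg actually reads them.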
All the index files appear to be correct. Do you get the same hashes when you run md5sum the same way you are running vg giraffe?
The part where you get the crash should take ~14 GB memory, and the entire Giraffe run with those indexes should take ~65 GB. Are you sure you have enough memory on the system you are running the command?
Yes, I'm getting the same hashes. And as far as I can tell, it should have the memory properly allocated - I was running it as a SLURM job interactively via salloc, and I just tested it again with 16 cores and 120 GB with the same error popping up. I'm trying again with a batch job rather than an interactive one just in case there's something weird with the allocation, as well as having it check the hashes again that way.
Side thought: is it possible that the way I'm using vg via Singularity could be part of the issue?
...okay, it looks like it's just some quirk of the architecture on the computing cluster I'm using: as soon as I ran it via sbatch instead of salloc, it worked. This makes zero sense. Sorry for the hassle.
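The thread never pins down why salloc and sbatch behaved differently. One way to investigate a discrepancy like this is to print the limits a process actually sees and compare the output from both launch methods. The sketch below uses Python's standard `resource` module (Unix only). The helper names are illustrative, and the assumption that a limit difference is the cause is only a guess, not something confirmed here.

```python
import os
import resource


def format_limit(value):
    """Render an rlimit value in GiB, or 'unlimited'."""
    if value == resource.RLIM_INFINITY:
        return "unlimited"
    return f"{value / 2**30:.1f} GiB"


def collect_limits():
    """Gather per-process memory limits and SLURM's memory settings."""
    report = {}
    # RLIMIT_AS caps virtual address space and RLIMIT_DATA caps the data
    # segment; a low value for either can make large allocations fail even
    # when the node has plenty of free RAM.
    for name in ("RLIMIT_AS", "RLIMIT_DATA"):
        soft, hard = resource.getrlimit(getattr(resource, name))
        report[name] = (format_limit(soft), format_limit(hard))
    # Set by SLURM when the matching option is requested (values are
    # typically in megabytes); absent outside a job.
    for var in ("SLURM_JOB_ID", "SLURM_MEM_PER_NODE", "SLURM_MEM_PER_CPU"):
        report[var] = os.environ.get(var, "(not set)")
    return report
```

Printing `collect_limits()` once under salloc and once under sbatch, ideally from inside the Singularity container, would show whether the interactive session inherits a tighter limit than the batch job.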
1. What were you trying to do? Trying to replicate the Giraffe pipeline from A draft human pangenome reference using the CHM13 pangenome reference.
2. What did you want to happen? For vg giraffe to run successfully.
3. What actually happened?
4. If you got a line like Stack trace path: /somewhere/on/your/computer/stacktrace.txt, please copy-paste the contents of that file here:
5. What data and command can the vg dev team use to make the problem happen?
Attempted to run via SLURM and Singularity, providing 16 CPU cores and several memory values up to 400 GB. The command itself mirrors what the paper did (though they used Giraffe 1.39.0).
All files except the reads were pulled from the HPRC pangenome data freeze. Reads are Illumina 2x150 paired-end ~45x coverage short reads pulled from GIAB.
6. What does running vg version say?
Testing with 1.48.0 next just to see if it makes a difference - just happened to have 1.47.0 installed already.