vgteam / vg

tools for working with genome variation graphs
https://biostars.org/tag/vg/

`vg autoindex` fails on GFA file containing 20k SARS-CoV-2 sequences. #4226

Closed TheHarshShow closed 4 months ago

TheHarshShow commented 4 months ago

1. What were you trying to do?

Create indexes for mapping reads onto a GFA graph containing 20k SARS-CoV-2 sequences.

2. What actually happened?

It crashed while constructing the distance index for Giraffe, possibly because of an integer overflow. Here is the terminal output:

[vg autoindex] Executing command: vg autoindex --workflow giraffe -g sars_20000.gfa -p sars_20000_pggb_2024
[IndexRegistry]: Checking for haplotype lines in GFA.
[IndexRegistry]: Constructing VG graph from GFA input.
[IndexRegistry]: Constructing XG graph from VG graph.
[IndexRegistry]: Constructing a greedy path cover GBWT
[IndexRegistry] forked child 806503
[IndexRegistry]: Constructing GBZ using NamedNodeBackTranslation.
[IndexRegistry]: Constructing distance index for Giraffe.
terminate called after throwing an instance of 'std::invalid_argument'
  what():  Need 28 bits to represent value 134217728 but only have 27
ERROR: Signal 6 occurred. VG has crashed. Visit https://github.com/vgteam/vg/issues/new/choose to report a bug.
Stack trace path: /tmp/vg_crash_zeTXt2/stacktrace.txt
Please include the stack trace file in your bug report!
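
As a rough check on the `what()` message above, the arithmetic can be reproduced in a few lines of Python. This is only a sketch: `bits_needed`, `max_value`, and `pack` are hypothetical helpers written for illustration, and `pack` merely imitates the wording of the error. It is not the bdsg `CompatIntVector::pack` implementation.

```python
def bits_needed(value: int) -> int:
    """Smallest number of bits that can hold a non-negative integer."""
    return max(1, value.bit_length())


def max_value(width: int) -> int:
    """Largest unsigned value that fits in a field of `width` bits."""
    return (1 << width) - 1


def pack(value: int, width: int) -> int:
    """Hypothetical width check that imitates the error text from the log."""
    needed = bits_needed(value)
    if needed > width:
        raise ValueError(
            f"Need {needed} bits to represent value {value} but only have {width}"
        )
    return value


value = 134217728            # the value reported in the crash
print(value == 2 ** 27)      # True
print(bits_needed(value))    # 28
print(max_value(27))         # 134217727

try:
    pack(value, 27)
except ValueError as err:
    print(err)
```

The reported value is exactly 2^27, one more than the largest number a 27-bit field can store, which is consistent with a fixed-width field being exceeded rather than with corrupted input.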

3. If you got a line like Stack trace path: /somewhere/on/your/computer/stacktrace.txt, please copy-paste the contents of that file here:

Crash report for vg v1.45.0-60-g5012893ab "Alpicella"
Stack trace (most recent call last):
#20   Object "", at 0xffffffffffffffff, in 
#19   Object "/home/AD.UCSD.EDU/hmotwani/vg/bin/vg", at 0x55a463c509ad, in _start
#18   Object "/usr/lib/x86_64-linux-gnu/libc-2.31.so", at 0x7f914249e082, in __libc_start_main
      Source "../csu/libc-start.c", line 308, in __libc_start_main [0x7f914249e082]
#17   Object "/home/AD.UCSD.EDU/hmotwani/vg/bin/vg", at 0x55a463c22159, in main
      Source "src/main.cpp", line 124, in main [0x55a463c22159]
        121:         if (subcommand->get_category() == vg::subcommand::CommandCategory::DEPRECATED) {
        122:             cerr << endl << "WARNING:[vg] Subcommand '" << argv[1] << "' is deprecated and is no longer being actively maintained. Future releases may eliminate it entirely." << endl << endl;
        123:         }
      > 124:         return (*subcommand)(argc, argv);
        125:     } else {
        126:         // No subcommand found
        127:         string command = argv[1];
#16   Object "/home/AD.UCSD.EDU/hmotwani/vg/bin/vg", at 0x55a4643ddb9b, in vg::subcommand::Subcommand::operator()(int, char**) const
    | Source "src/subcommand/subcommand.cpp", line 75, in operator()
    |    74: const int Subcommand::operator()(int argc, char** argv) const {
    | >  75:     return main_function(argc, argv);
    |    76: }
      Source "/usr/include/c++/10/bits/std_function.h", line 622, in operator() [0x55a4643ddb9b]
        619:     {
        620:       if (_M_empty())
        621:    __throw_bad_function_call();
      > 622:       return _M_invoker(_M_functor, std::forward<_ArgTypes>(__args)...);
        623:     }
        624: 
        625: #if __cpp_rtti
#15   Object "/home/AD.UCSD.EDU/hmotwani/vg/bin/vg", at 0x55a46433958e, in main_autoindex(int, char**)
      Source "src/subcommand/autoindex_main.cpp", line 355, in main_autoindex [0x55a46433958e]
        352:     targets.resize(unique(targets.begin(), targets.end()) - targets.begin());
        353:     
        354:     try {
      > 355:         registry.make_indexes(targets);
        356:     }
        357:     catch (InsufficientInputException ex) {
        358:         cerr << "error:[vg autoindex] Input is not sufficient to create indexes" << endl;
#14   Object "/home/AD.UCSD.EDU/hmotwani/vg/bin/vg", at 0x55a464975d1b, in vg::IndexRegistry::make_indexes(std::vector<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, std::allocator<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > > > const&)
      Source "src/index_registry.cpp", line 4006, in make_indexes [0x55a464975d1b]
       4004:         // do the recipe
       4005:         try {
      >4006:             auto recipe_results = execute_recipe(step, &plan, alias_graph);
       4007:             
       4008:             // the recipe executed successfully
       4009:             assert(recipe_results.size() == step.first.size());
#13   Object "/home/AD.UCSD.EDU/hmotwani/vg/bin/vg", at 0x55a464962ad8, in vg::IndexRegistry::execute_recipe(std::pair<std::set<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, std::less<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > >, std::allocator<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > > >, unsigned long> const&, vg::IndexingPlan const*, vg::AliasGraph&)
    | Source "src/index_registry.cpp", line 4956, in execute
    |  4954:     cerr << "executing recipe " << recipe_name.second << " for " << to_string(recipe_name.first) << endl;
    |  4955: #endif
    | >4956:     return index_recipe.execute(plan, alias_graph, recipe_name.first);;
    |  4957: }
    | Source "src/index_registry.cpp", line 5121, in operator()
    |  5119: vector<vector<string>> IndexRecipe::execute(const IndexingPlan* plan, AliasGraph& alias_graph,
    |  5120:                                             const IndexGroup& constructing) const {
    | >5121:     return exec(inputs, plan, alias_graph, constructing);
    |  5122: }
      Source "/usr/include/c++/10/bits/std_function.h", line 622, in execute_recipe [0x55a464962ad8]
        619:     {
        620:       if (_M_empty())
        621:    __throw_bad_function_call();
      > 622:       return _M_invoker(_M_functor, std::forward<_ArgTypes>(__args)...);
        623:     }
        624: 
        625: #if __cpp_rtti
#12   Object "/home/AD.UCSD.EDU/hmotwani/vg/bin/vg", at 0x55a46497abbd, in std::_Function_handler<std::vector<std::vector<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, std::allocator<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > > >, std::allocator<std::vector<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, std::allocator<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > > > > > (std::vector<vg::IndexFile const*, std::allocator<vg::IndexFile const*> > const&, vg::IndexingPlan const*, vg::AliasGraph&, std::set<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, std::less<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > >, std::allocator<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > > > const&), vg::VGIndexes::get_vg_index_registry()::{lambda(std::vector<vg::IndexFile const*, std::allocator<vg::IndexFile const*> > const&, vg::IndexingPlan const*, vg::AliasGraph&, std::set<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, std::less<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > >, std::allocator<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > > > const&)#52}>::_M_invoke(std::_Any_data const&, std::vector<vg::IndexFile const*, std::allocator<vg::IndexFile const*> > const&, vg::IndexingPlan const*&&, vg::AliasGraph&, std::set<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, std::less<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > >, std::allocator<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > > > const&)
    | Source "/usr/include/c++/10/bits/std_function.h", line 292, in __invoke_r<std::vector<std::vector<std::__cxx11::basic_string<char> > >, vg::VGIndexes::get_vg_index_registry()::<lambda(const std::vector<const vg::IndexFile*>&, const vg::IndexingPlan*, vg::AliasGraph&, const IndexGroup&)>&, const std::vector<const vg::IndexFile*, std::allocator<const vg::IndexFile*> >&, const vg::IndexingPlan*, vg::AliasGraph&, const std::set<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, std::less<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > >, std::allocator<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > > >&>
    |   290:       {
    |   291:    return std::__invoke_r<_Res>(*_Base::_M_get_pointer(__functor),
    | > 292:                     std::forward<_ArgTypes>(__args)...);
    |   293:       }
    |   294:     };
    | Source "/usr/include/c++/10/bits/invoke.h", line 142, in __invoke_impl<std::vector<std::vector<std::__cxx11::basic_string<char> > >, vg::VGIndexes::get_vg_index_registry()::<lambda(const std::vector<const vg::IndexFile*>&, const vg::IndexingPlan*, vg::AliasGraph&, const IndexGroup&)>&, const std::vector<const vg::IndexFile*, std::allocator<const vg::IndexFile*> >&, const vg::IndexingPlan*, vg::AliasGraph&, const std::set<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, std::less<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > >, std::allocator<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > > >&>
    |   140:       using __tag = typename __result::__invoke_type;
    |   141:       return std::__invoke_impl<__type>(__tag{}, std::forward<_Callable>(__fn),
    | > 142:                    std::forward<_Args>(__args)...);
    |   143:     }
      Source "/usr/include/c++/10/bits/invoke.h", line 60, in _M_invoke [0x55a46497abbd]
         57:   template<typename _Res, typename _Fn, typename... _Args>
         58:     constexpr _Res
         59:     __invoke_impl(__invoke_other, _Fn&& __f, _Args&&... __args)
      >  60:     { return std::forward<_Fn>(__f)(std::forward<_Args>(__args)...); }
         61: 
         62:   template<typename _Res, typename _MemFun, typename _Tp, typename... _Args>
         63:     constexpr _Res
#11   Object "/home/AD.UCSD.EDU/hmotwani/vg/bin/vg", at 0x55a46497aa84, in vg::VGIndexes::get_vg_index_registry()::{lambda(std::vector<vg::IndexFile const*, std::allocator<vg::IndexFile const*> > const&, vg::IndexingPlan const*, vg::AliasGraph&, std::set<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, std::less<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > >, std::allocator<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > > > const&)#52}::operator()(std::vector<vg::IndexFile const*, std::allocator<vg::IndexFile const*> > const&, vg::IndexingPlan const*, vg::AliasGraph&, std::set<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, std::less<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > >, std::allocator<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > > > const&) const [clone .constprop.0] [clone .isra.0]
      Source "src/index_registry.cpp", line 3571, in operator() [0x55a46497aa84]
       3568:         init_in(infile_gbz, gbz_filename);
       3569:         unique_ptr<gbwtgraph::GBZ> gbz = vg::io::VPKG::load_one<gbwtgraph::GBZ>(infile_gbz);
       3570:         
      >3571:         return make_distance_index(gbz->graph, plan, constructing);
       3572:     });
       3573:     
       3574:     registry.register_recipe({"Spliced Distance Index"}, {"Spliced XG"},
#10   Object "/home/AD.UCSD.EDU/hmotwani/vg/bin/vg", at 0x55a46497a879, in vg::VGIndexes::get_vg_index_registry()::{lambda(handlegraph::HandleGraph const&, vg::IndexingPlan const*, std::set<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, std::less<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > >, std::allocator<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > > > const&)#51}::operator()(handlegraph::HandleGraph const&, vg::IndexingPlan const*, std::set<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, std::less<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > >, std::allocator<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > > > const&) const [clone .constprop.0]
      Source "src/index_registry.cpp", line 3546, in operator() [0x55a46497a879]
       3544:         SnarlDistanceIndex distance_index;
       3545:         IntegratedSnarlFinder snarl_finder(graph);
      >3546:         fill_in_distance_index(&distance_index, &graph, &snarl_finder);
       3547:         distance_index.serialize(output_name);
       3548:         
       3549:         output_names.push_back(output_name);
#9    Object "/home/AD.UCSD.EDU/hmotwani/vg/bin/vg", at 0x55a46452525f, in vg::fill_in_distance_index(bdsg::SnarlDistanceIndex*, handlegraph::HandleGraph const*, vg::HandleGraphSnarlFinder const*, unsigned long)
      Source "src/snarl_distance_index.cpp", line 37, in fill_in_distance_index [0x55a46452525f]
         34:     //And fill in the permanent distance index
         35:     vector<const SnarlDistanceIndex::TemporaryDistanceIndex*> indexes;
         36:     indexes.emplace_back(&temp_index);
      >  37:     distance_index->get_snarl_tree_records(indexes, graph);
         38: }
         39: SnarlDistanceIndex::TemporaryDistanceIndex make_temporary_distance_index(
         40:     const HandleGraph* graph, const HandleGraphSnarlFinder* snarl_finder, size_t size_limit)  {
#8    Object "/home/AD.UCSD.EDU/hmotwani/vg/bin/vg", at 0x55a464e5495c, in bdsg::SnarlDistanceIndex::get_snarl_tree_records(std::vector<bdsg::SnarlDistanceIndex::TemporaryDistanceIndex const*, std::allocator<bdsg::SnarlDistanceIndex::TemporaryDistanceIndex const*> > const&, handlegraph::HandleGraph const*)
      Source "bdsg/src/snarl_distance_index.cpp", line 5942, in get_snarl_tree_records [0x55a464e5495c]
#7    Object "/home/AD.UCSD.EDU/hmotwani/vg/bin/vg", at 0x55a464e51c10, in bdsg::SnarlDistanceIndex::ChainRecordWriter::add_simple_snarl(unsigned long, bdsg::SnarlDistanceIndex::record_t, unsigned long)
    | Source "bdsg/src/snarl_distance_index.cpp", line 5117, in SimpleSnarlRecordWriter
    | Source "bdsg/src/snarl_distance_index.cpp", line 4092, in set_node_count
    | Source "bdsg/src/snarl_distance_index.cpp", line 4118, in operator=
      Source "bdsg/include/bdsg/internal/mapped_structs.hpp", line 1817, in add_simple_snarl [0x55a464e51c10]
#6    Object "/home/AD.UCSD.EDU/hmotwani/vg/bin/vg", at 0x55a464dec35f, in bdsg::CompatIntVector<bdsg::yomo::Allocator<unsigned long> >::pack(unsigned long, unsigned long, unsigned long)
      Source "bdsg/include/bdsg/internal/mapped_structs.hpp", line 1782, in pack [0x55a464dec35f]
#5    Object "/usr/lib/x86_64-linux-gnu/libstdc++.so.6.0.32", at 0x7f914279b257, in __cxa_throw
#4    Object "/usr/lib/x86_64-linux-gnu/libstdc++.so.6.0.32", at 0x7f914279aff6, in std::terminate()
#3    Object "/usr/lib/x86_64-linux-gnu/libstdc++.so.6.0.32", at 0x7f914279af8b, in 
#2    Object "/usr/lib/x86_64-linux-gnu/libstdc++.so.6.0.32", at 0x7f9142788ee5, in 
#1    Object "/usr/lib/x86_64-linux-gnu/libc-2.31.so", at 0x7f914249c858, in abort
      Source "/build/glibc-wuryBv/glibc-2.31/stdlib/abort.c", line 79, in abort [0x7f914249c858]
#0    Object "/usr/lib/x86_64-linux-gnu/libc-2.31.so", at 0x7f91424bd00b, in raise
      Source "../sysdeps/unix/sysv/linux/raise.c", line 51, in raise [0x7f91424bd00b]

4. What does running vg version say?

vg version v1.45.0-60-g5012893ab "Alpicella"
Compiled with g++ (Ubuntu 10.5.0-1ubuntu1~20.04) 10.5.0 on Linux
Linked against libstd++ 20230707

jltsiren commented 4 months ago

You should try the latest version of vg first to see if it fixes the issue. The current version is 1.54.0, and you were using version 1.45.0, which is over a year old.
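
For anyone scripting this check, here is a minimal sketch that compares the version string from the report with the release mentioned above. It assumes the `vg version` output contains a `vMAJOR.MINOR.PATCH` token, as in the line quoted in the report; `parse_vg_version` is a hypothetical helper, not part of vg.

```python
import re
from typing import Tuple


def parse_vg_version(text: str) -> Tuple[int, ...]:
    """Extract (major, minor, patch) from text such as the `vg version` line."""
    match = re.search(r"v(\d+)\.(\d+)\.(\d+)", text)
    if match is None:
        raise ValueError(f"no version number found in: {text!r}")
    return tuple(int(part) for part in match.groups())


# Line copied from the report above.
reported = 'vg version v1.45.0-60-g5012893ab "Alpicella"'
installed = parse_vg_version(reported)

# Release named in the reply above.
latest = (1, 54, 0)

print(installed)           # (1, 45, 0)
print(installed < latest)  # True
```

Comparing tuples of integers avoids the pitfall of comparing version strings lexically.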