vgteam / sequenceTubeMap

displays multiple genomic sequences in the form of a tube map
MIT License

issue when visualizing paths other than reference #462

Closed Overcraft90 closed 4 weeks ago

Overcraft90 commented 1 month ago

I was working with a small plant pangenome in GBZ format. It's loaded as a mounted file in TubeMap and I can navigate reference paths with no issues; however, when I attempt to switch to a non-reference path I get a vg chunk error as follows:

vg chunk failed: terminate called after throwing an instance of 'std::out_of_range'
  what(): unordered_map::at
━━━━━━━━━━━━━━━━━━━━
Crash report for vg v1.60.0-9-g9eaa4a454 "Annicco"
Stack trace (most recent call last):
#16 Object "", at 0xffffffffffffffff, in
#15 Object "/usr/local/bin/vg", at 0x5c5e058f9884, in _start
#14 Object "/usr/lib/x86_64-linux-gnu/libc.so.6", at 0x77e24fa2a28a, in libc_start_main@@GLIBC_2.34
    Source "../csu/libc-start.c", line 360, in libc_start_main_impl [0x77e24fa2a28a]
#13 Object "/usr/lib/x86_64-linux-gnu/libc.so.6", at 0x77e24fa2a1c9, in libc_start_call_main
    Source "../sysdeps/nptl/libc_start_call_main.h", line 58, in libc_start_call_main [0x77e24fa2a1c9]
#12 Object "/usr/local/bin/vg", at 0x5c5e060cf28b, in vg::subcommand::Subcommand::operator()(int, char**) const
    | Source "src/subcommand/subcommand.cpp", line 75, in operator()
    Source "/usr/include/c++/13/bits/std_function.h", line 591, in operator() [0x5c5e060cf28b]
        588: {
        589: if (_M_empty())
        590: throw_bad_function_call();
      > 591: return _M_invoker(_M_functor, std::forward<_ArgTypes>(args)...);
        592: }
        593:
        594: #if cpp_rtti
#11 Object "/usr/local/bin/vg", at 0x5c5e05f586f2, in main_chunk(int, char**)
    | Source "src/subcommand/chunk_main.cpp", line 606, in operator()
    Source "/usr/include/c++/13/bits/std_function.h", line 591, in main_chunk [0x5c5e05f586f2]
        588: {
        589: if (_M_empty())
        590: throw_bad_function_call();
      > 591: return _M_invoker(_M_functor, std::forward<_ArgTypes>(args)...);
        592: }
        593:
        594: #if cpp_rtti
#10 Object "/usr/local/bin/vg", at 0x5c5e05f4e132, in std::_Function_handler<unsigned long (std::cxx11::basic_string<char, std::char_traits, std::allocator > const&), main_chunk(int, char**)::{lambda(std::cxx11::basic_string<char, std::char_traits, std::allocator > const&)#1}>::_M_invoke(std::_Any_data const&, std::cxx11::basic_string<char, std::char_traits, std::allocator > const&)
    | Source "/usr/include/c++/13/bits/std_function.h", line 290, in invoke_r<long unsigned int, main_chunk(int, char*)::<lambda(const std::string&)>&, const std::cxx11::basic_string<char, std::char_traits, std::allocator >&>
    |   288: _M_invoke(const _Any_data& functor, _ArgTypes&&... args)
    |   289: {
    | > 290: return std::invoke_r<_Res>(_Base::_M_get_pointer(functor),
    |   291: std::forward<_ArgTypes>(args)...);
    |   292: }
    | Source "/usr/include/c++/13/bits/invoke.h", line 138, in invoke_impl<long unsigned int, main_chunk(int, char**)::<lambda(const std::string&)>&, const std::cxx11::basic_string<char, std::char_traits, std::allocator >&>
    |   136: #endif
    |   137: using tag = typename result::invoke_type;
    | > 138: return std::invoke_impl<__type>(tag{}, std::forward<_Callable>(__fn),
    |   139: std::forward<_Args>(args)...);
    |   140: }
    | Source "/usr/include/c++/13/bits/invoke.h", line 61, in operator()
    |   59: constexpr _Res
    |   60: invoke_impl(invoke_other, _Fn&& f, _Args&&... args)
    | > 61: { return std::forward<_Fn>(f)(std::forward<_Args>(__args)...); }
    |   62:
    |   63: template<typename _Res, typename _MemFun, typename _Tp, typename... _Args>
    Source "src/subcommand/chunk_main.cpp", line 591, in _M_invoke [0x5c5e05f4e132]
#9 Object "/home/mat/vg/lib/libhandlegraph.so", at 0x77e25051ec7d, in handlegraph::PathForEachSocket::end() const
    Source "/home/mat/Documents/vg/deps/libhandlegraph/src/path_handle_graph.cpp", line 54, in end [0x77e25051ec7d]
#8 Object "/usr/local/bin/vg", at 0x5c5e06b3d67b, in bdsg::ReferencePathOverlay::path_end(handlegraph::path_handle_t const&) const
    | Source "bdsg/src/reference_path_overlay.cpp", line 394, in get_step_count
    | Source "bdsg/src/reference_path_overlay.cpp", line 372, in at
    | Source "/usr/include/c++/13/bits/unordered_map.h", line 1008, in at
    |   1006: const mapped_type&
    |   1007: at(const key_type& k) const
    | >1008: { return _M_h.at(k); }
    |   1009: ///@}
    Source "/usr/include/c++/13/bits/hashtable_policy.h", line 798, in path_end [0x5c5e06b3d67b]
        795: {
        796: auto ite = static_cast<const hashtable*>(this)->find(k);
        797: if (!ite._M_cur)
      > 798: throw_out_of_range(N("unordered_map::at"));
        799: return __ite->second;
        800: }
        801: };
#7 Object "/usr/lib/x86_64-linux-gnu/libstdc++.so.6.0.33", at 0x77e24fea932c, in std::throw_out_of_range(char const*)
#6 Object "/usr/lib/x86_64-linux-gnu/libstdc++.so.6.0.33", at 0x77e24febb390, in cxa_throw
#5 Object "/usr/lib/x86_64-linux-gnu/libstdc++.so.6.0.33", at 0x77e24fea5a54, in std::terminate()
#4 Object "/usr/lib/x86_64-linux-gnu/libstdc++.so.6.0.33", at 0x77e24febb0d9, in
#3 Object "/usr/lib/x86_64-linux-gnu/libstdc++.so.6.0.33", at 0x77e24fea5ff4, in
#2 Object "/usr/lib/x86_64-linux-gnu/libc.so.6", at 0x77e24fa288fe, in abort
    Source "./stdlib/abort.c", line 79, in abort [0x77e24fa288fe]
#1 Object "/usr/lib/x86_64-linux-gnu/libc.so.6", at 0x77e24fa4526d, in raise
    Source "../sysdeps/posix/raise.c", line 26, in raise [0x77e24fa4526d]
#0 Object "/usr/lib/x86_64-linux-gnu/libc.so.6", at 0x77e24fa9eb1c, in pthread_kill@@GLIBC_2.34
    | Source "./nptl/pthread_kill.c", line 89, in pthread_kill_internal
    | Source "./nptl/pthread_kill.c", line 78, in __pthread_kill_implementation
    Source "./nptl/pthread_kill.c", line 44, in __pthread_kill [0x77e24fa9eb1c]
ERROR: Signal 6 occurred. VG has crashed. Visit https://github.com/vgteam/vg/issues/new/choose to report a bug.
Please include this entire error log in your bug report!
━━━━━━━━━━━━━━━━━━━━

I'm not quite sure what this might be related to. For reference, other haplotypes do exist, as shown in the drop-down menu of the Region field, but the Legend shows only AMIGA as the reference genome (see screenshot below). Any help is much appreciated, thanks in advance!

[Screenshot from 2024-10-23 20-08-32]

adamnovak commented 1 month ago

It looks like the ReferencePathOverlay is being asked about a path that it can't actually get a length for, and it explodes. We need some kind of check in vg chunk to detect when the overlay doesn't actually know about the path you are asking about. The problem might be that this path exists but isn't indexed in the ReferencePathOverlay, so the lookup that vg chunk is trying to do errors out.
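For anyone following along, here is a minimal sketch of that failure mode and of the kind of check being proposed. It is not vg or libbdsg code: ReferenceIndex, step_count_unchecked, step_count, and describe_path are hypothetical names invented for illustration, and it assumes C++17 for std::optional. The only library fact it relies on is that std::unordered_map::at throws std::out_of_range for a missing key, which is what the stack trace shows.

```cpp
#include <cassert>
#include <cstddef>
#include <optional>
#include <stdexcept>
#include <string>
#include <unordered_map>

// Hypothetical stand-in for an index that only covers reference paths.
struct ReferenceIndex {
    std::unordered_map<std::string, std::size_t> step_counts;

    // Unchecked lookup: at() throws std::out_of_range for a missing key.
    // If nothing catches the exception, the program terminates.
    std::size_t step_count_unchecked(const std::string& path) const {
        return step_counts.at(path);
    }

    // Checked lookup: reports "not indexed" as an empty optional.
    std::optional<std::size_t> step_count(const std::string& path) const {
        auto it = step_counts.find(path);
        if (it == step_counts.end()) {
            return std::nullopt;
        }
        return it->second;
    }
};

// Turns a missing index entry into a readable message instead of a crash.
std::string describe_path(const ReferenceIndex& index, const std::string& path) {
    if (auto count = index.step_count(path)) {
        return path + " has " + std::to_string(*count) + " steps";
    }
    return "error: path " + path + " is not indexed as a reference path";
}
```

With a check like this, a caller can decide whether to print an error and exit cleanly or to fall back to some other way of resolving the path.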

Are you trying to seek to coordinates along a particular haplotype from the GBWT data and draw a visualization for it?

I think we could dynamically promote the path to be reference-indexed when vg chunk sees you are asking about it, since GBWT/GBZ/the ReferencePathOverlay just scan all the reference paths at load time to provide efficient position lookups.
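A rough sketch of what promoting a path on demand could look like, under heavy assumptions: ToyGraph, LazyPathIndex, is_indexed, and step_at are hypothetical names, the graph is reduced to a map from path name to step lengths, and none of this reflects how vg or the ReferencePathOverlay is actually implemented. The idea is only that the same scan done for reference paths at load time can be run the first time a non-indexed path is requested, and its result cached for later position lookups.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <optional>
#include <string>
#include <unordered_map>
#include <utility>
#include <vector>

// Hypothetical toy graph: each path name maps to the lengths of its steps.
using ToyGraph = std::unordered_map<std::string, std::vector<std::size_t>>;

class LazyPathIndex {
public:
    explicit LazyPathIndex(const ToyGraph& graph) : graph_(graph) {}

    bool is_indexed(const std::string& name) const {
        return ends_.count(name) != 0;
    }

    // Returns the 0-based step containing a 0-based base offset on the path,
    // indexing the path on first use. Empty if the path is not in the graph
    // or the offset is past the end of the path.
    std::optional<std::size_t> step_at(const std::string& name, std::size_t offset) {
        const std::vector<std::size_t>* ends = ensure_indexed(name);
        if (ends == nullptr) {
            return std::nullopt;
        }
        // The first cumulative end greater than the offset marks the step.
        auto it = std::upper_bound(ends->begin(), ends->end(), offset);
        if (it == ends->end()) {
            return std::nullopt;
        }
        return static_cast<std::size_t>(it - ends->begin());
    }

private:
    // Scans the path once and caches the cumulative end offset of each step.
    const std::vector<std::size_t>* ensure_indexed(const std::string& name) {
        auto found = ends_.find(name);
        if (found != ends_.end()) {
            return &found->second;
        }
        auto path = graph_.find(name);
        if (path == graph_.end()) {
            return nullptr;
        }
        std::vector<std::size_t> ends;
        ends.reserve(path->second.size());
        std::size_t total = 0;
        for (std::size_t length : path->second) {
            total += length;
            ends.push_back(total);
        }
        return &ends_.emplace(name, std::move(ends)).first->second;
    }

    const ToyGraph& graph_;
    std::unordered_map<std::string, std::vector<std::size_t>> ends_;
};
```

The trade-off in this sketch is that the first query on a non-reference path pays for a full scan of that path, while later queries on the same path are a binary search.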

adamnovak commented 1 month ago

OK, I can reproduce this with just vg.

If I make a GBZ:

vg gbwt --gbz-format --graph-name test/graph.gbz --gfa-input test/graphs/gfa_with_reference.gfa

And then ask for a path that doesn't exist:

vg chunk -x test/graph.gbz -p GRCh38#0#chr2:1-2 -c 1 >/dev/null

I get a legitimate error:

error[vg chunk]: input path GRCh38#0#chr2 not found in xg index

But if I ask for a path that does exist but isn't a reference path:

vg chunk -x test/graph.gbz -p sample1#1#chr1#0:1-2 -c 1 >/dev/null

Then it crashes:

libc++abi: terminating due to uncaught exception of type std::out_of_range: unordered_map::at: key not found
...