vgteam / vg

tools for working with genome variation graphs
https://biostars.org/tag/vg/

Error for "vg convert -x" #4099

Closed linsindian closed 11 months ago

linsindian commented 1 year ago

1. What were you trying to do?

I want to use vg surject to convert a GAM file to BAM format, so I need to use vg convert to generate the xg graph.

2. What did you want to happen?

I mapped HG002 long reads to the hprc-mc-grch38 graph with GraphAligner, which produced a GAM file, and I want to convert that GAM file to BAM format with vg surject. To do that, I first need to use vg convert to generate an xg file for the hprc-mc-grch38 graph. Running vg convert -p -t 16 hprc-v1.1-mc-grch38.gfa > hprc-v1.1-mc-grch38.vg produced a .vg graph file. However, running vg convert -x -t 16 hprc-v1.1-mc-grch38vg > hprc-v1.1-mc-grch38.xg crashed.

3. What actually happened?

vg: /public/home/anovak/build/vg/include/sdsl/int_vector.hpp:1436: sdsl::int_vector< >::const_reference sdsl::int_vector< >::operator[](const size_type&) const [with unsigned char t_width = 0; sdsl::int_vector< >::const_reference = long unsigned int; sdsl::int_vector< >::size_type = long unsigned int]: Assertion `idx < this->size()' failed.

━━━━━━━━━━━━━━━━━━━━

Crash report for vg v1.51.0 "Quellenhof"

Stack trace (most recent call last):

13 Object "/bip7_disk/sindian111/pangenome/vg", at 0x6169f4, in _start

12 Object "/bip7_disk/sindian111/pangenome/vg", at 0x2068296, in __libc_start_main

11 Object "/bip7_disk/sindian111/pangenome/vg", at 0x2066a39, in __libc_start_call_main

10 Object "/bip7_disk/sindian111/pangenome/vg", at 0xdf980b, in vg::subcommand::Subcommand::operator()(int, char**) const

9 Object "/bip7_disk/sindian111/pangenome/vg", at 0xcb33cb, in main_convert(int, char**)

8 Object "/bip7_disk/sindian111/pangenome/vg", at 0xcb09b4, in graph_to_xg_adjusting_paths(handlegraph::PathHandleGraph const, xg::XG, std::unordered_set<std::__cxx11::basic_string<char, std::char_traits, std::allocator >, std::hash<std::__cxx11::basic_string<char, std::char_traits, std::allocator > >, std::equal_to<std::__cxx11::basic_string<char, std::char_traits, std::allocator > >, std::allocator<std::__cxx11::basic_string<char, std::char_traits, std::allocator > > > const&, bool)

7 Object "/bip7_disk/sindian111/pangenome/vg", at 0x18950da, in xg::XG::from_enumerators(std::function<void (std::function<void (std::__cxx11::basic_string<char, std::char_traits, std::allocator > const&, long long const&)> const&)> const&, std::function<void (std::function<void (long long const&, bool const&, long long const&, bool const&)> const&)> const&, std::function<void (std::function<void (std::__cxx11::basic_string<char, std::char_traits, std::allocator > const&, long long const&, bool const&, std::__cxx11::basic_string<char, std::char_traits, std::allocator > const&, bool const&, bool const&)> const&)> const&, bool, std::__cxx11::basic_string<char, std::char_traits, std::allocator >)

6 Object "/bip7_disk/sindian111/pangenome/vg", at 0x188e047, in xg::XG::index_node_to_path(std::__cxx11::basic_string<char, std::char_traits, std::allocator > const&)

5 Object "/bip7_disk/sindian111/pangenome/vg", at 0x18964fb, in sdsl::int_vector<(unsigned char)0>::operator[](unsigned long const&) const

4 Object "/bip7_disk/sindian111/pangenome/vg", at 0x2078e65, in __assert_fail

3 Object "/bip7_disk/sindian111/pangenome/vg", at 0x5e5673, in __assert_fail_base.cold

2 Object "/bip7_disk/sindian111/pangenome/vg", at 0x5e574b, in abort

1 Object "/bip7_disk/sindian111/pangenome/vg", at 0x207f475, in raise

0 Object "/bip7_disk/sindian111/pangenome/vg", at 0x20abe7c, in __pthread_kill

ERROR: Signal 6 occurred. VG has crashed. Visit https://github.com/vgteam/vg/issues/new/choose to report a bug. Please include this entire error log in your bug report!

5. What data and command can the vg dev team use to make the problem happen?

The pangenome graph I'm using comes from the HPRC.

The two steps used to generate the xg file:

vg convert -p -t 16 hprc-v1.1-mc-grch38.gfa > hprc-v1.1-mc-grch38.vg

vg convert -x -t 16 hprc-v1.1-mc-grch38vg > hprc-v1.1-mc-grch38.xg

6. What does running vg version say?

vg version v1.51.0 "Quellenhof"
Compiled with g++ (Ubuntu 11.4.0-1ubuntu1~22.04) 11.4.0 on Linux
Linked against libstd++ 20230528
Built by anovak@courtyard.gi.ucsc.edu

Thank you for reading. Looking forward to your response.

jeizenga commented 1 year ago

I notice that hprc-v1.1-mc-grch38vg is missing the . before the file extension. Is that also true in the original command, or is that an error from copying the commands to this report?

linsindian commented 1 year ago

That is just a mistake from copying the command here. In the actual command, the file "hprc-v1.1-mc-grch38.vg" was used correctly.
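For anyone scripting these two steps, a filename slip like the one discussed above can be caught before a multi-hour run starts. The following is only a minimal sketch: `require_file` and `build_xg` are hypothetical helper names, not part of vg, and the only vg invocations used are the two `vg convert` commands quoted in this report.

```shell
#!/usr/bin/env bash
# Sketch: fail fast on a missing or empty input before a long vg run.
# require_file and build_xg are hypothetical helpers, not vg features.

require_file() {
    local path="$1"
    # -s is true only if the file exists and has a size greater than zero.
    if [ ! -s "$path" ]; then
        echo "error: input '$path' is missing or empty" >&2
        return 1
    fi
}

build_xg() {
    local prefix="$1" threads="${2:-16}"
    require_file "${prefix}.gfa" || return 1
    vg convert -p -t "$threads" "${prefix}.gfa" > "${prefix}.vg" || return 1
    # Re-check the intermediate file so the second step never starts on an
    # empty or absent .vg file.
    require_file "${prefix}.vg" || return 1
    vg convert -x -t "$threads" "${prefix}.vg" > "${prefix}.xg"
}
```

It could be called as `build_xg hprc-v1.1-mc-grch38 16`. Deriving every filename from one prefix means the `.vg` extension cannot be mistyped between the two steps.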

glennhickey commented 1 year ago

I was able to run these commands without issue with vg 1.51.0. According to /usr/bin/time -v, they took:

convert -p : 6h22m and 123G RAM
convert -x : 2h10m and 368G RAM
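To reproduce this kind of measurement, the peak memory can be read from a saved `/usr/bin/time -v` log. This is a rough sketch, assuming GNU time (which labels the value in kbytes) and assuming the "G" above means GiB; `peak_rss_gib` and the log filename are hypothetical names introduced here.

```shell
#!/usr/bin/env bash
# Sketch: extract peak memory from a saved GNU `/usr/bin/time -v` log.
# peak_rss_gib is a hypothetical helper, not part of vg or GNU time.
#
# A log could be produced with GNU time's -o option, for example:
#   /usr/bin/time -v -o convert-x.time.log \
#       vg convert -x -t 16 hprc-v1.1-mc-grch38.vg > hprc-v1.1-mc-grch38.xg

peak_rss_gib() {
    # GNU time prints "Maximum resident set size (kbytes): N".
    # Dividing kbytes by 1024*1024 gives GiB.
    LC_ALL=C awk -F': ' \
        '/Maximum resident set size/ { printf "%.1f\n", $2 / 1048576 }' "$1"
}
```

Running `peak_rss_gib convert-x.time.log` then prints a single number that can be compared with the RAM available on the machine.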

The resulting files have these sizes:

-rw-r--r-- 1 hickey cgl 48118373527 Sep 27 11:30 hprc-v1.1-mc-grch38.gfa
-rw-r--r-- 1 hickey cgl 42775622374 Sep 27 17:54 hprc-v1.1-mc-grch38.vg
-rw-r--r-- 1 hickey cgl 71819679419 Sep 27 20:05 hprc-v1.1-mc-grch38.xg

How much RAM do you have? How much free disk?
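On Linux, both numbers can be gathered with a short script like the sketch below. The helper names are hypothetical. For comparison, the run above peaked at 368G of RAM during convert -x, and the .vg and .xg outputs take about 43 GB and 72 GB on disk.

```shell
#!/usr/bin/env bash
# Sketch: report total RAM and free disk space in GiB (Linux only).
# total_ram_gib and free_disk_gib are hypothetical helpers, not part of vg.

total_ram_gib() {
    # MemTotal in /proc/meminfo is reported in kB.
    awk '/^MemTotal:/ { printf "%d\n", $2 / 1048576 }' /proc/meminfo
}

free_disk_gib() {
    # df -Pk prints POSIX-format output in 1024-byte blocks;
    # column 4 of the data row is the available space.
    df -Pk "${1:-.}" | awk 'NR == 2 { printf "%d\n", $4 / 1048576 }'
}

echo "RAM: $(total_ram_gib) GiB, free disk here: $(free_disk_gib .) GiB"
```

Run it from the directory where the .vg and .xg files will be written, since free space is reported for the filesystem holding that path.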

jeizenga commented 11 months ago

I'm going to close this issue since we haven't heard any more news on our end. Let me know if this is not correct.