vgteam / vg

tools for working with genome variation graphs
https://biostars.org/tag/vg/
Other
1.1k stars 194 forks source link

questions on mapping and calling #3206

Open cistarsa opened 3 years ago

cistarsa commented 3 years ago

Hello and thanks for the quick and helpful responses so far. I'm attempting to map paired-end reads to a variant graph while accounting for coding domains from a gene set (also, mapped to the graph).

I'm having issues when I try to generate a gaf using vg map. Although, I don't get these errors when I use these same files to generate a vcf...

## vg map --> gaf (errors):
./vg map -t 16 -x CPB_5_220.xg -g CPB_5_220.gcsa -F Lep_CDS.fa -% > OGS_cpb_5xg.gaf

## familiar error:
what (): Attempted to get handle for node 0 not present in graph

##vg map -> gam no errors
./vg map -t 16 -x CPB_5_220.xg -g CPB_5_220.gcsa -i -f CPBWGS_11_R1.fastq -f CPBWGS_11_R2.fasta > CPB_11_5x220.gam

I was able to generate a gaf file using different versions of xg and gcsa, but I can't decipher column 6:

./vg map -t 16 -x CPB_5_220.xg -g CPB_Feb5_1.30.gcsa -F Lep_CDS.fa -% > OGS_cpb_5xg.gaf

Is there a way to convert column 6 paths to scaffolds? eg: LDEC0001 528 0 528 + <2837272<592041 to LDEC0001 528 0 528 + <Lep_1:200-300<Lep_1:320-350

Finally, when I call variants and generate a vcf for a sample are the positions only with respect to reference or are there mappings to the snarls too? I'm only seeing with respect to reference and what does column 3, ID represent?

./vg call -a aug_CPB.xg -k aug_CPB.pack > aug_genos_CP11.vcf

# tail aug_genos_CP11.vcf
F_KS_scaff_11      2489      1494375_23523918         CG          CA 
glennhickey commented 3 years ago

The node 0 not present in graph error usually signals something wrong with the input. I'm not sure what you mean by fixing it by "using different versions of xg and gcsa". If you suspect it's a bug, and can share the original graph along with the indexing commands you used, we can take a look.

For GAF, we unfortunately do not yet support stable coordinates. It should be on our radar to add soon. I added it to our todo list

Column 3 out of vg call is the pair of node ids (separated by _) of nodes that define the bubble (snarl).

cistarsa commented 3 years ago

Thank you!

That's hugely helpful. So, I'm comparing the augmented.vcf to the bubble.bed generated in minigraph...when I look at the node below I can see that the starting position (column 2) and the last part of the node pair are the same, but how can I reconcile the starting node?

Regardless, this is more than enough to combine these results.

# aug.vcf:
**F_KS_P_RNA_scaffold_1   25397**   3635669_**46459**   CGACATGGTGGAAAAATTCACTATCTCAGA
# bubble.bed
**F_KS_P_RNA_scaffold_1 25397** 26442   9   5   0   6   1045    -1  -1  -1  
s46453,s684293,s46454,s622309,s46455,s46456,s46457,s46458,**s46459**

And yeah, I can reference this issue from Friday with the crash file I'm now getting when I try to generate a .gaf (after taking Jordan's advice)

Crash report for vg v1.30.0 "Carentino"
Stack trace (most recent call last) in thread 4590:
#16   Object "", at 0xffffffffffffffff, in 
#15   Object "/media/Summit/vg/vg", at 0x1d6e8c2, in __clone
#14   Object "/media/Summit/vg/vg", at 0x11f7288, in start_thread
#13   Object "/media/Summit/vg/vg", at 0x1cb73bd, in gomp_thread_start
#12   Object "/media/Summit/vg/vg", at 0xb020da, in main_map(int, char**) [clone ._omp_fn.1]
#11   Object "/media/Summit/vg/vg", at 0x115e7de, in vg::io::AlignmentEmitter::emit_mapped_single(std::vector<vg::Alignment, std::allocator<vg::Alignment> >&&)
#10   Object "/media/Summit/vg/vg", at 0x115f504, in vg::io::GafAlignmentEmitter::emit_mapped_singles(std::vector<std::vector<vg::Alignment, std::allocator<vg::Alignment> >, std::allocator<std::vector<vg::Alignment, std::allocator<vg::Alignment> > > >&&)
#9    Object "/media/Summit//vg/vg", at 0x116bb6a, in vg::io::alignment_to_gaf(handlegraph::HandleGraph const&, vg::Alignment const&, bool, bool, bool)
#8    Object "/media/Summit/vg/vg", at 0x1169ff2, in vg::io::alignment_to_gaf(std::function<unsigned long (long long)>, std::function<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > (long long, bool)>, vg::Alignment const&, bool, bool, bool)
#7    Object "/media/Summit/vg/vg", at 0x11635c9, in std::_Function_handler<unsigned long (long long), vg::io::alignment_to_gaf(handlegraph::HandleGraph const&, vg::Alignment const&, bool, bool, bool)::{lambda(long long)#1}>::_M_invoke(std::_Any_data const&, long long&&)
#6    Object "/media/Summit/vg/vg", at 0x53b55e, in xg::XG::get_handle(long long const&, bool) const [clone .cold]
#5    Object "/media/Summit/vg/vg", at 0x1c06118, in __cxa_throw
#4    Object "/media/Summit/vg/vg", at 0x1c05fb6, in std::terminate()
#3    Object "/media/Summit/vg/vg", at 0x1c05f4b, in __cxxabiv1::__terminate(void (*)())
#2    Object "/media/Summit/vg/vg", at 0x558ca2, in __gnu_cxx::__verbose_terminate_handler() [clone .cold]
#1    Object "/media/Summit/vg/vg", at 0x55b733, in abort
#0    Object "/media/Summit/vg/vg", at 0x11fa66b, in raise