lh3 / minigraph

Sequence-to-graph mapper and graph generator
https://lh3.github.io/minigraph
MIT License

read mapping to sam? #36

Open cistarsa opened 3 years ago

cistarsa commented 3 years ago

Hello

Thank you for developing these fantastic tools!

I'm attempting to map short paired-end reads to a structural graph and call variants, similar to VG, but I prefer how your program lets me determine the sequence paths for bubbles etc. Have you developed a way to produce a SAM file for these mapped reads?

I also wanted to compare the number of reads that mapped to the structural graph with the number that mapped to the linear reference using minimap2. I'm seeing many more read pairs mapping to the linear reference than to the graph, as well as the same reads mapping to different regions of the same reference. I'm curious why this may be.

minigraph -xsr -t16 all5.gfa $R1 $R2 > r12_all5.gaf
# with 13,527,026 reads mapped

# minimap2:
minimap2 -xsr Ref.fasta $R1 $R2 > r12_single.paf
# with 27,968,534 reads mapped
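
The two totals above may not be counting the same thing, since the number of output lines and the number of distinct read names can differ. A minimal sketch for checking both on each file, assuming the PAF and GAF files are tab-separated with the read name in column 1; the function names `count_query_names` and `count_records` are hypothetical helpers, not part of minigraph or minimap2:

```shell
# Hypothetical helper: count distinct query names (column 1) in a PAF or GAF file.
# Assumes tab-separated columns, with the read name in the first column.
count_query_names() {
    cut -f1 "$1" | sort -u | wc -l | tr -d ' '
}

# Hypothetical helper: count alignment records (lines) in the same file.
count_records() {
    wc -l < "$1" | tr -d ' '
}

# Example usage with the files from this post:
#   count_query_names r12_all5.gaf;   count_records r12_all5.gaf
#   count_query_names r12_single.paf; count_records r12_single.paf
```

If both mates of a pair share one name, the distinct-name count is a count of pairs rather than of individual reads, so it is worth comparing like with like before concluding that fewer reads map to the graph.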

Same reads mapping to different scaffolds:


$ grep "HWI-D00256:413:C7N5GANXX:3:1101:2734:2086" CPB*20*paf

HWI-D00256:413:C7N5GANXX:3:1101:2734:2086   126 5   62  +   F_KS_tig00081383_RagTag_RagTag  19065987    1623340 1623397 42  57  12  tp:A:P  cm:i:2  s1:i:42 s2:i:0  rl:i:0
HWI-D00256:413:C7N5GANXX:3:1101:2734:2086   126 9   96  +   F_KS_tig00027608_RagTag_RagTag  3676404 2027258 2027345 67  87  0   tp:A:P  cm:i:5  s1:i:67 s2:i:67 rl:i:0

$ grep "HWI-D00256:413:C7N5GANXX:3:1101:2734:2086" CPB*20*gaf

HWI-D00256:413:C7N5GANXX:3:1101:2734:2086   252 5   96  +   >LI_2018_scaffold1025_size140985:6799-7281  482 89  180 84  91  60  tp:A:P  cm:i:7  s1:i:84 s2:i:0  dv:f:0.0295 ql:B:i,126,126
HWI-D00256:413:C7N5GANXX:3:1101:2734:2086   252 156 243 -   F_KS_tig00027608_RagTag_RagTag  3676404 1894488 1894575 67  87  0   tp:A:P  cm:i:6  s1:i:67 s2:i:67 dv:f:0.0368 ql:B:i,126,126
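
To make rows like these easier to compare, here is a minimal sketch that prints only the query length (column 2), the target or path (column 6), and the mapping quality (column 12) for one read name. It assumes tab-separated PAF/GAF columns; `show_hits` is a hypothetical helper name. In the output above, the GAF rows report a query length of 252 with `ql:B:i,126,126`, while the PAF rows report 126. This suggests, as an inference from the output rather than something confirmed in this thread, that the GAF treats the two mates as one 252 bp query.

```shell
# Hypothetical helper: for one read name, print query length (col 2),
# target or graph path (col 6), and mapping quality (col 12).
# Assumes tab-separated PAF/GAF records.
show_hits() {
    awk -F'\t' -v name="$1" '$1 == name { print $2 "\t" $6 "\t" $12 }' "$2"
}

# Example usage with the files from this post:
#   show_hits "HWI-D00256:413:C7N5GANXX:3:1101:2734:2086" r12_single.paf
#   show_hits "HWI-D00256:413:C7N5GANXX:3:1101:2734:2086" r12_all5.gaf
```

Rows with a mapping quality of 0 and `s1` equal to `s2` (for example `s1:i:67 s2:i:67`) have a second-best chain scoring as well as the best one, so their placement is ambiguous.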

niemasd commented 2 years ago

+1, SAM output would be very much appreciated!