vgteam / vg

tools for working with genome variation graphs
https://biostars.org/tag/vg/

Banded alignment doesn't produce qualities, sequence names or alignment scores #191

Closed edawson closed 7 years ago

edawson commented 8 years ago

When aligning long reads (~8 kb) with banded alignment, I found that the output didn't contain quality scores or sequence names:
vg map -B 200 -k 8 -f t.fq -g hpv.gcsa -x hpv.xg -J

Erik figured out that this happens because the qualities and names are not grabbed from the input alignments during banded mapping. Mapping with a band size greater than the read length does yield quals:
vg map -B 20000 -k 8 -f t.fq -g hpv.gcsa -x hpv.xg

I've since modified the mapper code to steal the qualities and sequence name from the input and tack them onto the output (in the banded mapping function of mapper.hpp). I'll put the modified code in a new pull request soon.
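
For anyone following along, the idea of the fix can be sketched roughly as below. This is only an illustration in Python, not the actual change to mapper.hpp: the `Alignment` class and the `restore_read_metadata` function are hypothetical stand-ins, and the real code works on vg's own alignment objects.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class Alignment:
    """Hypothetical stand-in for an alignment record (not vg's real type)."""
    sequence: str
    name: str = ""
    quality: str = ""
    path: tuple = ()


def restore_read_metadata(read: Alignment, merged: Alignment) -> Alignment:
    """Copy the name and qualities of the input read onto the merged banded result.

    The merged alignment is assumed to cover the whole read, so the read's
    quality string can be attached unchanged.
    """
    if merged.sequence != read.sequence:
        raise ValueError("merged alignment does not cover the input read")
    # Keep the path from the banded alignment, take the metadata from the input.
    return replace(merged, name=read.name, quality=read.quality)


# Input read as it would come from the FASTQ file.
read = Alignment(sequence="ACGTACGT", name="read1", quality="IIIIHHHH")
# Result of merging the per-band alignments: the path is set, the metadata is lost.
merged = Alignment(sequence="ACGTACGT", path=(1, 2, 3))

fixed = restore_read_metadata(read, merged)
```

The point is just that the per-band alignments never need to carry the metadata themselves; it can be reattached once, after the bands are merged.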

However, there's still no good concept of a score for banded alignments. Erik suggested we score based on the path and the parameters used for the dynamic programming step.
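
A rough sketch of what scoring from the path might look like, again in Python and purely illustrative. The edit representation, the function name `score_from_edits`, the parameter values, and the affine gap convention (the open penalty covers the first gap base, the extension penalty applies to each further base) are all assumptions made for this example, not what vg does.

```python
def score_from_edits(edits, match=1, mismatch=4, gap_open=6, gap_extend=1):
    """Recompute an alignment score from a list of (kind, length) edits.

    kind is one of "match", "mismatch", "insertion" or "deletion".
    The scoring parameters are illustrative defaults and should be set to
    whatever was used for the dynamic programming.
    """
    score = 0
    for kind, length in edits:
        if length <= 0:
            raise ValueError("edit length must be positive")
        if kind == "match":
            score += match * length
        elif kind == "mismatch":
            score -= mismatch * length
        elif kind in ("insertion", "deletion"):
            # Affine gap: pay the open penalty once, then extend.
            score -= gap_open + gap_extend * (length - 1)
        else:
            raise ValueError(f"unknown edit kind: {kind}")
    return score


# Edits along a merged banded path: 10 matches, 1 mismatch, 5 matches,
# a 3-base deletion, then 4 matches.
edits = [("match", 10), ("mismatch", 1), ("match", 5), ("deletion", 3), ("match", 4)]
total = score_from_edits(edits)  # 10 - 4 + 5 - (6 + 2) + 4 = 7
```

Scoring the merged path this way would make the result independent of where the band boundaries fell, as long as the same parameters are used as in the per-band alignment.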

adamnovak commented 7 years ago

Banded alignment has been redone; I'm going to assume it fixed this issue.