vgteam / vg

tools for working with genome variation graphs
https://biostars.org/tag/vg/

Feature Request: vg map --> output 'AS' tag option #2817

Open achon opened 4 years ago

achon commented 4 years ago

As the title says, this is a feature request for vg map to have a flag that outputs the alignment score as an 'AS' tag. The map -v option already reports alignment scores, so hopefully this isn't too difficult to add.

Example use case command: vg map -t 16 -M 10 -d wg -f test_1k_1.fq -f test_1k_2.fq --surject-to bam > test_1k.bam

I would like to use the 'AS' tag downstream to filter reads on more than just MAPQ.
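To make the intended downstream use concrete, here is a minimal Python sketch of filtering SAM text records on both MAPQ and an 'AS' tag. It assumes the records already carry an `AS:i:` optional field, which is exactly what this request asks vg to emit; the function names and thresholds are hypothetical and not part of vg.

```python
def alignment_score(sam_line):
    """Return the integer AS tag of a SAM record, or None if it has none."""
    fields = sam_line.rstrip("\n").split("\t")
    # Optional TAG:TYPE:VALUE fields start after the 11 mandatory columns.
    for tag in fields[11:]:
        if tag.startswith("AS:i:"):
            return int(tag[5:])
    return None


def filter_by_score(lines, min_as, min_mapq=0):
    """Yield header lines unchanged and records passing both thresholds."""
    for line in lines:
        if line.startswith("@"):  # SAM header line
            yield line
            continue
        fields = line.split("\t")
        if len(fields) < 11:  # not a complete SAM record, skip it
            continue
        mapq = int(fields[4])
        score = alignment_score(line)
        # Records without an AS tag are dropped (an assumption of this sketch).
        if score is not None and score >= min_as and mapq >= min_mapq:
            yield line
```

Something like `filter_by_score(sys.stdin, min_as=100, min_mapq=30)` could then sit behind a tool that streams a BAM file as SAM text.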

ekg commented 4 years ago

The alignment score is calculated and included in GAM, JSON, and (I think) GAF output. It'd be trivial to add, maybe a few lines of code. Do you write C++ enough to do that?
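As a rough illustration of reading that score from JSON output, the Python sketch below processes one JSON alignment per line, so the whole file never has to be held in memory. It assumes the alignment objects expose `name` and `score` keys and that a missing `score` means 0; both are assumptions to check against real output, and the function name is hypothetical.

```python
import json


def scores_from_json_lines(lines):
    """Yield (read name, alignment score) from line-delimited JSON alignments."""
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        record = json.loads(line)
        # Assumption: keys are "name" and "score"; an absent score is read as 0.
        yield record.get("name"), int(record.get("score", 0))
```

Because it is a generator over lines, it can be fed directly from a pipe such as `sys.stdin`.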


achon commented 4 years ago

Unfortunately I don't know C++ that well. I've taken a look at src/subcommand/map_main.cpp and src/subcommand/surject_main.cpp, but the changes aren't obvious to me.

ekg commented 4 years ago

It wouldn't be there, but in the alignment emitter class. Look for where the SAM/BAM records are made.


achon commented 4 years ago

I see it. I also looked in alignment.cpp. Honestly, I wouldn't know where to start, and I wasn't planning to spend time learning C++, htslib, and VG to do it. I'd really like to use VG for my downstream applications, but I need the AS tag. Additionally, my dataset is too large to use JSON, and I can't find a GAM-to-BAM converter that retains the AS tag.

jeizenga commented 4 years ago

@ekg For SAM output this would be pretty straightforward to add, but I think it might be a tall order for a less-experienced C++ programmer on the BAM output. You would have to hack it into HTSLib's bam1_t struct, which uses some rather optimized C code and also isn't documented particularly well anywhere (AFAIK).
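To show why the SAM text case is the easy one, here is a small Python sketch that appends an `AS:i:` optional field to a tab-separated SAM record. It illustrates the record format only and is not vg's C++ emitter code; the function name is hypothetical, and the score is assumed to come from elsewhere (for example the GAM or JSON output).

```python
def with_alignment_score(sam_line, score):
    """Return the SAM record with an AS:i:<score> optional field appended."""
    fields = sam_line.rstrip("\n").split("\t")
    if len(fields) < 11:
        raise ValueError("a SAM record needs 11 mandatory fields")
    # Drop any existing AS tag so the tag stays unique within the record.
    optional = [f for f in fields[11:] if not f.startswith("AS:")]
    optional.append(f"AS:i:{int(score)}")
    return "\t".join(fields[:11] + optional)
```

BAM stores the same optional fields in a packed binary layout, which is why the equivalent change there goes through HTSLib rather than string handling.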