broadinstitute / gatk

Official code repository for GATK versions 4 and up
https://software.broadinstitute.org/gatk

Feature Request - extract Funcotator VCF INFO 'sub-fields' #7556

Open GATKSupportTeam opened 2 years ago

GATKSupportTeam commented 2 years ago

A user on the GATK Forum requested a way to make the Funcotator INFO field easier to manipulate by exporting it as a table. At the GATK Office Hours meeting on 11/8, we discussed the user's two suggestions and favored the first: a new tool, similar to VariantsToTable, that would unpack the INFO field.

This request was created from a contribution made by Shahryar Alavi on October 30, 2020 19:54 UTC.

Link: https://gatk.broadinstitute.org/hc/en-us/community/posts/360073983291-VariantsToTable-not-extracting-INFO-sub-fields-#community_comment_360013343072

--

But MAF output is somewhat different from VCF, and I think the VCF output format is better for germline variant annotation.

With Funcotator we get an integrated (and more detailed) "variant calling - annotation" workflow. The problem is that the "vertical bar" separated sub-fields inside the FUNCOTATION INFO field are not easy to handle in downstream text processing.

I have two suggestions for the GATK Team:

You may want to develop a new tool (like VariantsToTable) that separates each "sub-info" in the FUNCOTATION INFO field and puts each one into its own column, with a corresponding header, when creating the tab-delimited table.

Or add a feature to Funcotator to create multiple INFOs with FUNCOTATION prefix in their IDs; e.g.

INFO=

INFO=

instead of

INFO=
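Until such a tool exists, the first suggestion can be roughly approximated with a short script. The sketch below is not part of GATK and makes several assumptions. It assumes the FUNCOTATION header description lists the sub-field names after the text "Funcotation fields are:", separated by "|". It assumes each annotation in a record is wrapped in square brackets. It also assumes no sub-field value contains a literal "|" or "]". The function names and the toy header and value are hypothetical, chosen only for illustration.

```python
import re

def funcotation_field_names(header_line):
    """Read the sub-field names from the FUNCOTATION ##INFO header line.

    Assumes the Description ends with 'Funcotation fields are: a|b|c'.
    """
    match = re.search(r'Funcotation fields are:\s*([^">]+)', header_line)
    if match is None:
        raise ValueError("no 'Funcotation fields are:' list in header line")
    return [name.strip() for name in match.group(1).split("|")]

def split_funcotation(value, field_names):
    """Turn one FUNCOTATION value into a list of {sub-field: value} dicts.

    Assumes every annotation is a bracketed group such as [x|y|z]; whatever
    separates the groups (alleles or transcripts) is ignored here.
    """
    records = []
    for group in re.findall(r"\[([^\]]*)\]", value):
        parts = group.split("|")
        if len(parts) != len(field_names):
            raise ValueError(
                f"expected {len(field_names)} sub-fields, got {len(parts)}"
            )
        records.append(dict(zip(field_names, parts)))
    return records

def to_tsv_lines(records, field_names):
    """Render the records as tab-delimited lines, header first."""
    lines = ["\t".join(field_names)]
    for record in records:
        lines.append("\t".join(record[name] for name in field_names))
    return lines

# Toy data (hypothetical field names and values, not real Funcotator output).
TOY_HEADER = (
    '##INFO=<ID=FUNCOTATION,Number=A,Type=String,Description="Functional '
    "annotation from the Funcotator tool.  Funcotation fields are: "
    'hugoSymbol|variantClassification|proteinChange">'
)
TOY_VALUE = "[GENE1|MISSENSE|p.A1B],[GENE1|SILENT|]"

toy_names = funcotation_field_names(TOY_HEADER)
toy_records = split_funcotation(TOY_VALUE, toy_names)
toy_table = to_tsv_lines(toy_records, toy_names)
```

Each bracketed group becomes one table row, so a site with several annotations yields several rows. Joining these rows back to CHROM, POS and the other VCF columns is left out of the sketch.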

Thanks

(created from Zendesk ticket #45403)

keenhl commented 1 month ago

Did this feature ever get added to GATK? Thanks.