cio-abcd / variantinterpretation

Collaborative Interpretation-Pipeline workflow based on nf-core pipeline structure
MIT License

VCF conversion to easily processable format as TSV #7

Closed sci-kai closed 1 year ago

sci-kai commented 1 year ago

Description of feature

The annotated VEP output should be converted to a format that is easier to process in spreadsheet programs and in programming languages such as Python and R. A common standard is the TSV file format. This should be implemented as a separate module. BCFtools already provides a plugin, "+split-vep", for splitting VEP-annotated output: https://samtools.github.io/bcftools/howtos/plugin.split-vep.html. The TSV should contain all columns by default.
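To make the intended conversion concrete, here is a rough Python sketch of what "VEP-annotated VCF to TSV" could look like. It is not the pipeline's module and not how +split-vep works internally. It assumes the annotations sit in an INFO tag named CSQ whose sub-fields are listed after "Format:" in the header (the VEP default), writes one row per annotation, and keeps variants that have no CSQ entry as a row with empty annotation columns. The function names and the sample records are made up for illustration.

```python
import csv
import io
import re


def csq_fields(header_lines):
    """Return the CSQ sub-field names declared in the VCF header, or [] if absent."""
    for line in header_lines:
        if line.startswith("##INFO=<ID=CSQ,"):
            match = re.search(r'Format: ([^">]+)', line)
            if match:
                return match.group(1).strip().split("|")
    return []


def vcf_to_tsv(vcf_text):
    """Convert VEP-annotated VCF text to TSV text, one row per CSQ annotation."""
    lines = vcf_text.splitlines()
    fields = csq_fields([l for l in lines if l.startswith("##")])

    out = io.StringIO()
    writer = csv.writer(out, delimiter="\t", lineterminator="\n")
    writer.writerow(["CHROM", "POS", "ID", "REF", "ALT"] + fields)

    for line in lines:
        if not line or line.startswith("#"):
            continue
        cols = line.split("\t")
        chrom, pos, vid, ref, alt = cols[:5]
        info = cols[7] if len(cols) > 7 else ""

        # Pull the raw CSQ string out of the INFO column, if present.
        csq = ""
        for entry in info.split(";"):
            if entry.startswith("CSQ="):
                csq = entry[len("CSQ="):]

        # Variants without CSQ still produce one row with empty annotation columns.
        annotations = csq.split(",") if csq else [""]
        for ann in annotations:
            values = ann.split("|") if ann else []
            values += [""] * (len(fields) - len(values))
            writer.writerow([chrom, pos, vid, ref, alt] + values[: len(fields)])

    return out.getvalue()


if __name__ == "__main__":
    # Made-up example records; gene names are placeholders.
    example = "\n".join([
        "##fileformat=VCFv4.2",
        '##INFO=<ID=CSQ,Number=.,Type=String,Description="Consequence annotations '
        'from Ensembl VEP. Format: Allele|Consequence|SYMBOL">',
        "#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO",
        "chr1\t100\t.\tA\tT\t.\tPASS\tDP=10;CSQ=T|missense_variant|GENE1,T|intron_variant|GENE2",
        "chr1\t200\t.\tG\tC\t.\tPASS\tDP=5",
    ])
    print(vcf_to_tsv(example), end="")
```

Writing one row per annotation keeps the table flat for spreadsheets, at the cost of repeating the variant columns. Other INFO and FORMAT fields are ignored here for brevity; a real module would need them to satisfy the "all columns by default" requirement.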

biolancer commented 1 year ago

At the status report meeting in April, an alternative approach was discussed: generating the TSV output file with "vembrane" (https://github.com/vembrane/vembrane).

Initial local tests in Aachen showed that vembrane does not silently drop mutations that lack consequence information, a problem observed with split-vep. To integrate vembrane, a few adaptations to the workflow would be needed: