cio-abcd / variantinterpretation

Collaborative Interpretation-Pipeline workflow based on nf-core pipeline structure
MIT License

Change TSV conversion to one-variant-per-row and display one feature on front #38

Open sci-kai opened 6 months ago

sci-kai commented 6 months ago

Description of feature

Currently, the TSV conversion writes each feature annotation of a variant (e.g., a transcript) as a separate row. As a result, a single variant can appear multiple times in the TSV file, e.g., if multiple transcripts are reported or if the variant affects both a transcript and a regulatory region. With the current transcript filtering, e.g., by the PICK flag, a large number of variants are reported with multiple feature annotations.

Hence, I suggest changing the TSV conversion to produce only one row per variant. Further, the transcript filtering process should be revised to prioritize one "major" feature annotation to be displayed up front, with alternative feature annotations shown in separate columns.
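
To illustrate the proposed layout, here is a minimal sketch of how per-feature rows could be collapsed into one row per variant. It is not the pipeline's implementation. The column names (`CHROM`, `POS`, `REF`, `ALT`, `Feature`, `Consequence`, `PICK`, `Alt_Features`, `Alt_Consequences`), the function name, and the rule "take the feature flagged `PICK == "1"`, otherwise the first one" are all assumptions made for the example; the actual prioritization of the "major" feature would need to be defined as part of this change.

```python
# Assumed columns that together identify a variant (hypothetical names).
VARIANT_KEY = ("CHROM", "POS", "REF", "ALT")


def collapse_to_one_row_per_variant(rows):
    """Collapse per-feature rows (list of dicts) into one dict per variant."""
    # Group feature annotations by variant, preserving input order.
    groups = {}
    for row in rows:
        key = tuple(row[k] for k in VARIANT_KEY)
        groups.setdefault(key, []).append(row)

    collapsed = []
    for key, features in groups.items():
        # Assumed prioritization: prefer the feature flagged PICK == "1",
        # otherwise fall back to the first feature listed.
        major = next((f for f in features if f.get("PICK") == "1"), features[0])
        others = [f for f in features if f is not major]

        out = dict(zip(VARIANT_KEY, key))
        out["Feature"] = major["Feature"]
        out["Consequence"] = major["Consequence"]
        # Alternative annotations go into separate columns, ";"-joined.
        out["Alt_Features"] = ";".join(f["Feature"] for f in others)
        out["Alt_Consequences"] = ";".join(f["Consequence"] for f in others)
        collapsed.append(out)
    return collapsed


# Made-up example input: one variant with two transcripts, one with a
# single regulatory feature. Identifiers are placeholders, not real IDs.
example_rows = [
    {"CHROM": "chr1", "POS": "100", "REF": "A", "ALT": "T",
     "Feature": "TX_A", "Consequence": "intron_variant", "PICK": ""},
    {"CHROM": "chr1", "POS": "100", "REF": "A", "ALT": "T",
     "Feature": "TX_B", "Consequence": "missense_variant", "PICK": "1"},
    {"CHROM": "chr2", "POS": "200", "REF": "G", "ALT": "C",
     "Feature": "REG_1", "Consequence": "regulatory_region_variant", "PICK": ""},
]

result = collapse_to_one_row_per_variant(example_rows)
for row in result:
    print("\t".join(row.values()))
```

In this sketch the first variant ends up as a single row with `TX_B` up front and `TX_A` listed in `Alt_Features`, so no annotation is dropped even though only one feature is shown in the main columns.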

This makes the report smaller, especially if a large list of transcripts is kept, and also clearer, as it summarizes all consequences of a variant that could otherwise get lost due to specific filter settings.

This requires substantial changes and should probably be tied to a new major release.