BrooksLabUCSC / flair

Full-Length Alternative Isoform analysis of RNA

Feature request: add genes to flair diffSplice output #231

Open Jeltje opened 1 year ago

Jeltje commented 1 year ago

Currently, the gene_id field in drimseq_<>_A_v_B.tsv contains coordinates rather than a gene identifier:

feature_id      gene_id A1_A_b0 A2_A_b0 A3_A_b0 B1_B_b0 B2_B_b0 B3_B_b0 lr      adj_pvalue
exclusion_chr20:35556271        chr20:35556271-35556954_chr20:35556271-35556972 0.677   0.677   0.677   0.021   0.021   0.021   1601.9  0
inclusion_chr20:35556271        chr20:35556271-35556954_chr20:35556271-35556972 0.323   0.323   0.323   0.979   0.979   0.979   1601.9  0

This coordinate information is also available in the diffsplice.<>.events.quant.tsv files, which are already linked via their feature_id:

feature_id      coordinate      A1_A_b0 A2_A_b0 A3_A_b0 B1_B_b0 B2_B_b0 B3_B_b0 isoform_ids
inclusion_chr20:35556271        chr20:35556271-35556954_chr20:35556271-35556972 43.0    59.0    47.0    5900.0  5800.0  4700.0  ENST00000451605.1_ENSG00000125991.19
exclusion_chr20:35556271        chr20:35556271-35556954_chr20:35556271-35556972 104.0   102.0   110.0   131.0   111.0   110.0   ENST00000348547.6_ENSG00000125991.19,HISEQ:1287:HKCG7BCX3:1:1106:1276:65058_ENSG00000125991.19

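It would be more useful if the gene ID (the part after the final underscore in each isoform_ids entry, e.g. ENSG00000125991.19) were extracted from the .quant.tsv files and written to the gene_id column of the drimseq output.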
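
To make the request concrete, here is a rough post-processing sketch in Python. It is not FLAIR code: the function names are hypothetical, the example tables are trimmed to a few of the columns shown above, and it assumes every isoform_ids entry has the form `<isoform>_<gene>`, so that the gene ID is whatever follows the last underscore. If an event's isoforms map to more than one gene, the sketch joins the gene IDs with commas, and it keeps the original coordinates when no match is found.

```python
import csv
import io


def gene_ids_from_isoforms(isoform_ids):
    """Return the sorted unique gene IDs in a comma-separated isoform_ids value.

    Assumption: each entry is <isoform>_<gene>, so the gene ID is the text
    after the last underscore.
    """
    genes = {
        entry.rsplit("_", 1)[1]
        for entry in isoform_ids.split(",")
        if "_" in entry
    }
    return sorted(genes)


def feature_to_gene(quant_handle):
    """Map feature_id -> gene ID(s) using a diffsplice events.quant.tsv table."""
    reader = csv.DictReader(quant_handle, delimiter="\t")
    return {
        row["feature_id"]: ",".join(gene_ids_from_isoforms(row["isoform_ids"]))
        for row in reader
    }


def annotate_drimseq(drimseq_handle, mapping, out_handle):
    """Copy a drimseq table, replacing gene_id with the mapped gene ID."""
    reader = csv.DictReader(drimseq_handle, delimiter="\t")
    writer = csv.DictWriter(
        out_handle,
        fieldnames=reader.fieldnames,
        delimiter="\t",
        lineterminator="\n",
    )
    writer.writeheader()
    for row in reader:
        # Fall back to the existing coordinate string if no gene was found.
        row["gene_id"] = mapping.get(row["feature_id"]) or row["gene_id"]
        writer.writerow(row)


# Example tables, trimmed to a subset of the columns from the issue.
coord = "chr20:35556271-35556954_chr20:35556271-35556972"
quant = io.StringIO(
    "feature_id\tcoordinate\tA1_A_b0\tisoform_ids\n"
    f"inclusion_chr20:35556271\t{coord}\t43.0\t"
    "ENST00000451605.1_ENSG00000125991.19\n"
    f"exclusion_chr20:35556271\t{coord}\t104.0\t"
    "ENST00000348547.6_ENSG00000125991.19,"
    "HISEQ:1287:HKCG7BCX3:1:1106:1276:65058_ENSG00000125991.19\n"
)
drimseq = io.StringIO(
    "feature_id\tgene_id\tA1_A_b0\tlr\tadj_pvalue\n"
    f"exclusion_chr20:35556271\t{coord}\t0.677\t1601.9\t0\n"
    f"inclusion_chr20:35556271\t{coord}\t0.323\t1601.9\t0\n"
)

mapping = feature_to_gene(quant)
annotated = io.StringIO()
annotate_drimseq(drimseq, mapping, annotated)
print(annotated.getvalue())
```

With real files, the `io.StringIO` objects would be replaced by handles from `open(...)`. Doing the same lookup inside diffSplice itself, before the drimseq table is written, would avoid the extra pass.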