hartwigmedical / hmftools

Various algorithms for analysing genomics data
GNU General Public License v3.0
189 stars 58 forks source link

DEV-3317 add non-canonical transcripts data to orange #434

Closed KasperWolsink99 closed 1 year ago

KasperWolsink99 commented 1 year ago
KasperWolsink99 commented 1 year ago

@kduyvesteyn

Thank you for the review, to address your points:

  1. I think you are right and this is a very good idea, I have made some changes to reflect this.

  2. The reason for creating a separate PurpleVariantTranscriptImpact class was that I was under the assumption that a transcripts codingEffect is conceptionally different from its effects despite having similar names. From what I understood each non-canonical transcript has one effect listed in the purple VCF output but this isn't necessarily the codingEffect. Hence a new class had to be created because codingEffect is not nullable on PurpleTranscriptImpact. For now I deleted PurpleVariantTranscriptImpact and use PurpleTranscriptImpact instead with the assumption that the effect listed is also the coding effect. I might have to verify that this also makes sense in a bio-molecular context. As for why I did not use the otherImpacts field: I had a conversation with Lieke and it turns out this field has a special meaning for the CDKN2A gene. Changing this fields meaning might lead to problems down stream although I do agree that the naming is very confusing right now and the extra field seems unnecessary at first glance.

  3. This is true and I have noticed that in some classes the hmftools style is used and in other classes the pipeline5 style is used. I can reformat everything in accordance to the hmftools style but this might lead to a big diff. But that might be worth it such that all styles are consistent :).

kduyvesteyn commented 1 year ago

1) You now create PurpleVariant but don't populate it exactly. I would still create a separate datamodel.

2) The otherImpacts are to capture all impacts other than the canonical, so your new field is unnecessary.

3) Every transcript impact has N effects which are interpreted as 1 coding effect (see VariantImpactBuilder.determineCodingEffect).