Closed mike-w-wilson closed 5 months ago
@ch-kr, thank you! I've moved the transcript amplification to high. I did not move the other two into high based on your comment. For things that are now in modifier, our code base uses CSQ_NON_CODING to describe this section but these new members seem to contradict that. Thoughts on keeping these out for now? Seems like we need to decide if we want to imitate VEP rankings completely or give ourselves some room for adjustment?
yeah, I agree we need to decide whether to imitate VEP or just loosely follow their mapping with some adjustments. I was initially thinking we'd stay consistent with VEP, but I don't think that actually serves us in our potential downstream applications using these groups, so I vote we do the latter (use all of VEP's listed consequence terms and adjust the associated impacts where needed).
maybe we should move start_lost and transcript_amplification back to medium, keep feature_elongation/feature_truncation/coding_sequence_variant where they are currently (non-coding for the first two and low for the last one), add coding_transcript_variant to CSQ_CODING_LOW_IMPACT, and add sequence_variant to CSQ_NON_CODING?
^I know this is more complex than the initial PR, so I'd be happy to merge this PR and start the discussion via slack to finalize which terms should go where
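For reference, a minimal sketch of the grouping proposed in this comment, listing only the terms named in the thread. `CSQ_CODING_LOW_IMPACT` and `CSQ_NON_CODING` are the names used above; `CSQ_CODING_MEDIUM_IMPACT`, `PROPOSED_GROUPS`, `group_for`, and `duplicated_terms` are hypothetical names used for illustration and may not match the code base.

```python
# Sketch of the proposed grouping; NOT the full lists from the code base.
# Only consequence terms mentioned in this thread are included.
# CSQ_CODING_MEDIUM_IMPACT is an assumed name for the "medium" group.
PROPOSED_GROUPS = {
    "CSQ_CODING_MEDIUM_IMPACT": ["start_lost", "transcript_amplification"],
    "CSQ_CODING_LOW_IMPACT": ["coding_sequence_variant", "coding_transcript_variant"],
    "CSQ_NON_CODING": ["feature_elongation", "feature_truncation", "sequence_variant"],
}


def group_for(term, groups=PROPOSED_GROUPS):
    """Return the name of the group containing `term`, or None if it is unassigned."""
    for group, terms in groups.items():
        if term in terms:
            return group
    return None


def duplicated_terms(groups=PROPOSED_GROUPS):
    """Return the set of terms that appear in more than one group."""
    seen, dups = set(), set()
    for terms in groups.values():
        for term in terms:
            if term in seen:
                dups.add(term)
            seen.add(term)
    return dups
```

A check like `duplicated_terms()` could help confirm that no term ends up in two groups once the final placement is decided.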
@ch-kr That sounds good to me. I've made updates to the PR. Would you still like to start the slack discussion?
let's merge -- I have a meeting with Kaitlin next week and will plan to ask her about these then (and move into a larger public channel if needed)
Adds: splice_donor_5th_base_variant, splice_donor_region_variant, and splice_polypyrimidine_tract_variant