Closed by MattWellie 1 year ago
How to implement:
I quite like the idea of running both using different thresholds, e.g. anything above 0.5 would be category_5 and anything between 0.2-0.5 could be support, though we could tweak those thresholds if we need to.
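A minimal sketch of the two-tier idea, assuming the 0.5 and 0.2 cut-offs suggested in this comment. The function name `classify_delta` and the handling of scores exactly on a boundary are assumptions for illustration, and the labels are the ones used in this thread rather than confirmed pipeline identifiers.

```python
from typing import Optional

# Assumed cut-offs taken from the comment above; both are meant to be tweakable.
FULL_THRESHOLD = 0.5
SUPPORT_THRESHOLD = 0.2


def classify_delta(
    delta: float,
    full: float = FULL_THRESHOLD,
    support: float = SUPPORT_THRESHOLD,
) -> Optional[str]:
    """Map a SpliceAI delta score to a label, or None if below both cut-offs."""
    if delta > full:
        # "anything above 0.5 would be category_5"
        return "category_5"
    if delta >= support:
        # "anything between 0.2-0.5 could be support" (boundaries assumed inclusive)
        return "support"
    return None
```

Keeping both cut-offs as parameters makes it easy to count how many variants each pair of values returns on an example dataset.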
Considerations: I'm not sure that I know enough context to fully understand your second point. Splicing will be a possible MOP for any gene, because a splicing variant may result in a frameshift, the insertion of a PTC, and/or in-frame indels with outcomes similar to missense changes. You're right, though, that you cannot be sure which of these impacts will result from a splicing variant without performing functional studies first. I have a few examples that I can show you where functional studies have revealed splicing variants that result in multiple different outcomes, with the impact being incredibly difficult to interpret. So for that reason, I don't think the AIP should try to assign an impact, but I think the result should still be reported so it can be followed up. E.g. my previous group, run by Sandra Cooper, invites clinicians and scientists to submit candidate splicing variants to aid in their interpretation, and they are getting better at predicting the impact. Also, with enough other lines of evidence you can sometimes get these splicing variants over the line without functional studies, so I think reporting regardless is useful.
Thresholds: From experience I want to say 0.2, or even 0.1, would be a low threshold, with 0.5-0.8 as a high threshold, though I feel like you'd miss a lot of variants with a threshold of 0.8. From memory I think 0.2, 0.5, and 0.8 are the thresholds recommended in the original paper, which is why everyone generally refers to one of those three? Happy to help go through example datasets and see how many variants they return.
Re-Working: Not entirely sure I understand all the steps currently involved here. What do you mean by "run annotation of splice category assignment for each variant"?
For an initial implementation I suggest we add a distinct test category and start with a conservative, configurable threshold of, say, 0.5. I also like the idea of adding a more permissive sAI threshold to the Support category, but that should probably be extracted into a separate conversation/evaluation. For the moment we should focus on identifying very high signal-to-noise predictions that are likely to result in LOF.
Depending on what we see when we run this at scale, we may need to investigate some additional transcript-based filtering (i.e. do we need to limit this category to a more strictly defined set of transcripts to control noise?) and/or complement it with predicted consequence filters, but let's deal with that if and when we see the problem.
@SamBryen Not very well explained on my part; I was dumping thoughts without fully explaining!
WRT #2, PanelApp results have a slot where a specific MOI can be documented. I was thinking there may be a situation where the MOI for a specific disease could be "Loss-of-function doesn't cause this", but splice variants could equally cause activation or LoF so that's probably not a fruitful discussion...
WRT #4, this is slightly more technical, so happy to discuss in person. The current algorithm does a lot of progressive filtering, removing all variants without high impact consequences, removing all annotations not on 'Green' genes, etc., and only then positively assigns categories in a smaller search space. That isn't compatible with splice variants, as we're not limited to exonic, or even genic (depending on how good spliceAI is at finding promoter/enhancer sites?). The whole flow will need alteration to retain variants with these new types of consequence. It's all well and good missing out pathogenic variants because they don't pass the tests, but it looks pretty rubbish to have missed them entirely because they were blindly filtered out 🙃
One remedy would be to run splice consequence testing earlier, before those genic/consequence filters, and then change the filtering for all downstream tests from "keep everything with high VEP consequences" to "keep high VEP consequences or high SpliceAI". This point was thrown in to flag that this won't actually be a simple modification 😬.
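The proposed filter change can be sketched roughly as follows. Everything here is a hypothetical stand-in: the `Variant` class, the `HIGH_IMPACT` set (a few illustrative VEP consequence terms), and the inclusive 0.5 threshold are assumptions, since the real pipeline's data model and consequence list are not shown in this thread.

```python
from dataclasses import dataclass, field
from typing import List, Set

# Illustrative subset of VEP consequence terms; the pipeline's real list is not shown here.
HIGH_IMPACT: Set[str] = {
    "frameshift_variant",
    "stop_gained",
    "splice_donor_variant",
    "splice_acceptor_variant",
}


@dataclass
class Variant:
    """Hypothetical, simplified variant record."""
    name: str
    vep_consequences: Set[str] = field(default_factory=set)
    spliceai_delta: float = 0.0


def keep_variant(variant: Variant, threshold: float = 0.5) -> bool:
    """Keep a variant with a high VEP consequence OR a high SpliceAI delta."""
    has_high_vep = bool(variant.vep_consequences & HIGH_IMPACT)
    has_high_splice = variant.spliceai_delta >= threshold  # inclusive bound is an assumption
    return has_high_vep or has_high_splice


def filter_variants(variants: List[Variant], threshold: float = 0.5) -> List[Variant]:
    """Apply the combined filter, preserving input order."""
    return [v for v in variants if keep_variant(v, threshold)]
```

Under this rule a synonymous variant with a high delta survives the filter, where a consequence-only filter would have dropped it before any category was assigned.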
Ah right that makes sense. Yes you would want to run it before the consequence filters to grab synonymous changes and extended splice site variants.
Ok! there's a draft currently on #50
SpliceAI annotations as currently applied consist of a String consequence with the highest Delta (if any, so values could be No Consequence, or Acceptor/Donor Gain/Loss), and the corresponding Delta scores. Threshold values here relate to the Delta score annotation.
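As a rough illustration of that annotation shape, the sketch below picks the consequence with the highest delta from SpliceAI's four delta scores (DS_AG, DS_AL, DS_DG, DS_DL). The function name `summarise_spliceai` and the rule that an all-zero result maps to "No Consequence" are assumptions; the draft on #50 may decide this differently.

```python
from typing import Dict, Tuple

# SpliceAI reports four delta scores per variant; labels follow the wording in this thread.
LABELS: Dict[str, str] = {
    "DS_AG": "Acceptor Gain",
    "DS_AL": "Acceptor Loss",
    "DS_DG": "Donor Gain",
    "DS_DL": "Donor Loss",
}


def summarise_spliceai(scores: Dict[str, float]) -> Tuple[str, float]:
    """Return (consequence label, delta) for the highest-scoring SpliceAI field."""
    # Missing fields are treated as 0.0 (an assumption for this sketch).
    best_key = max(LABELS, key=lambda key: scores.get(key, 0.0))
    best_delta = scores.get(best_key, 0.0)
    if best_delta <= 0.0:
        # Assumed rule: no positive delta means no predicted splice consequence.
        return "No Consequence", 0.0
    return LABELS[best_key], best_delta
```

The returned delta is the value that a threshold would then be compared against.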
Changes (that I can remember off hand):
Two entries (spliceai_full, spliceai_support) which hold float values for the spliceAI Delta. Initially both 0.5.
spliceai_support is a placeholder for a lower threshold contributing to the support category, but that's not yet implemented.
Category 1 explicitly tests that the ClinVar annotation doesn't contain ~Benign. This means that de novo/splice variants which are ClinVar benign could still be flagged... I think this is defensible as each category should be based on completely independent information, but happy to re-visit.
We would like to include splice prediction as one of the facets when determining if a variant is plausibly relevant to diagnosis. This will be a little more complicated than just adding a new category, as the current filters will remove variants without high-consequence impacts annotated against them. Discussion:
How to implement
Support category labelling. This would treat the annotations like all the other in silico tools, and passing the threshold test would mean that variants could support other 'full category' variants in compound het. formations.
category_5. This would be a category whose sole purpose is to highlight variants with compelling in silico splice prediction results. This would be treated as a full category, i.e. het. variants with this category assigned could make it into the report.
Considerations
Thresholds
Re-Working
or
spliceAI flag assigned