broadinstitute / SpliceAI-lookup

Website for checking SpliceAI and Pangolin scores:
https://spliceailookup.broadinstitute.org
MIT License

Is masking working for this variant? #59

Closed DELAMHACH closed 3 months ago

DELAMHACH commented 10 months ago

Hi -

For the variant 17-41256878-C-T, with or without masking and with Max distance >= 95, SpliceAI shows Acceptor Gain | 0.21 | 95 bp. 95 bp is the distance to the nearest native acceptor site: https://grch37.ensembl.org/Homo_sapiens/Location/View?db=core;g=ENSG00000012048;r=17:41256878-41256974

I am struggling to understand if this is a true gain of strength at the native acceptor or if masking is failing in this instance.

I have not yet found a second variant with this issue, and at least one other variant I tested on a different exon behaved as expected. However, Pangolin is not showing such a gain.

VARIANT: https://spliceailookup.broadinstitute.org/#variant=17-41256878-C-T&hg=37&distance=500&mask=1&ra=0
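For anyone reproducing this, the settings used are encoded in the URL fragment above. A minimal sketch of reading them back in Python follows; the helper name `parse_lookup_url` is hypothetical, it is not part of SpliceAI-lookup, and the meaning of each parameter is assumed from its name only (`ra` is left uninterpreted).

```python
from urllib.parse import urlsplit, parse_qs

URL = ("https://spliceailookup.broadinstitute.org/"
       "#variant=17-41256878-C-T&hg=37&distance=500&mask=1&ra=0")

def parse_lookup_url(url):
    """Hypothetical helper: split the '#...' fragment into typed fields."""
    fragment = urlsplit(url).fragment
    # parse_qs returns lists of values; keep the first value for each key.
    params = {key: values[0] for key, values in parse_qs(fragment).items()}
    chrom, pos, ref, alt = params["variant"].split("-")
    return {
        "chrom": chrom,
        "pos": int(pos),
        "ref": ref,
        "alt": alt,
        "genome": params["hg"],          # assumed: "37" means GRCh37
        "distance": int(params["distance"]),
        "mask": params["mask"] == "1",   # assumed: "1" means masking is on
    }

parsed = parse_lookup_url(URL)
print(parsed)
```

So this link requests masked scores (`mask=1`) with a 500 bp window, which is wider than the 95 bp needed to reach the native acceptor.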

[Screenshot 2023-11-16 at 5:42:21 PM]

I am mostly flagging this to make sure it is not a systematic issue.

bw2 commented 10 months ago

I think the donor loss score is being masked correctly because the +16 bp position is not at an annotated splice site. However, the masking of the donor and acceptor gain scores seems inconsistent, which is the same problem as in https://github.com/broadinstitute/SpliceAI-lookup/issues/58


Either both of them should be masked or neither. This appears to be a common bug for SpliceAI gain scores whose positions fall on the last base of an exon. I haven't yet tracked it down in the code.
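To make the expected behavior concrete, here is a rough sketch of the masking rule as I understand it: gain scores at annotated splice sites and loss scores at unannotated positions are set to 0. This is not the SpliceAI or SpliceAI-lookup code. The function `apply_mask`, the score layout, the donor loss value of 0.10, and the sign of the offsets are all assumptions for illustration; only the 0.21 acceptor gain at 95 bp and the +16 bp donor loss position come from this thread.

```python
from typing import Dict, Set, Tuple

# (delta score, offset in bp relative to the variant); layout is an assumption
Score = Tuple[float, int]

def apply_mask(scores: Dict[str, Score],
               annotated_acceptors: Set[int],
               annotated_donors: Set[int]) -> Dict[str, Score]:
    """Hypothetical masking rule: zero gains at annotated sites and
    losses at unannotated positions; leave everything else unchanged."""
    masked = {}
    for name, (value, offset) in scores.items():
        site_type, change = name.split("_")  # e.g. "acceptor", "gain"
        annotated = annotated_acceptors if site_type == "acceptor" else annotated_donors
        at_annotated_site = offset in annotated
        if change == "gain" and at_annotated_site:
            value = 0.0  # strengthening an existing site is not a new site
        elif change == "loss" and not at_annotated_site:
            value = 0.0  # nothing annotated here to lose
        masked[name] = (value, offset)
    return masked

raw = {
    "acceptor_gain": (0.21, 95),  # values reported in this issue
    "donor_loss": (0.10, 16),     # score is made up; +16 bp is from the thread
}
# Assumption: the native acceptor 95 bp away is annotated, no donor at +16.
masked = apply_mask(raw, annotated_acceptors={95}, annotated_donors=set())
print(masked)
```

Under these assumptions both scores come out as 0, so an unmasked Acceptor Gain of 0.21 at the native acceptor would point to the annotated position being off by one base rather than to a real gain.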