Ensembl / VEP_plugins

Plugins for the Ensembl Variant Effect Predictor (VEP)
Apache License 2.0

SpliceAI scores #692

Closed jml96 closed 2 months ago

jml96 commented 4 months ago

Hi, the scores I get from running the SpliceAI script match those shown at https://spliceailookup.broadinstitute.org/. However, the scores I get from the SpliceAI plugin for VEP, using the file spliceai_scores.raw.snv.hg38.vcf.gz downloaded from Illumina BaseSpace, are different. I cannot find the reason for these differences. Thank you.

dglemos commented 4 months ago

Hi @jml96, when you say you are running the "spliceai script", do you mean the VEP plugin SpliceAI? What command do you run with SpliceAI, and with which file? Illumina BaseSpace provides two types of files: raw and masked. Are you always using the raw scores when comparing results?

dglemos commented 4 months ago

Can you please post your commands here? The command you use to run the VEP plugin and the SpliceAI script.

jml96 commented 4 months ago

Hi @dglemos, I am running the SpliceAI script according to the instructions at https://github.com/Illumina/SpliceAI. I assume that I am running it correctly because I get the same SpliceAI scores as those on the website (https://spliceailookup.broadinstitute.org/). However, these scores differ from the ones in the spliceai_scores.raw.snv.hg38.vcf.gz file required to run the SpliceAI plugin for VEP (--plugin SpliceAI,snv=...). I am following the instructions provided here: https://github.com/Ensembl/VEP_plugins/blob/release/111/SpliceAI.pm.

Thank you.

dglemos commented 4 months ago

Can you please provide an example of a variant with different scores?

jml96 commented 4 months ago

From the SpliceAI lookup website: https://spliceailookup.broadinstitute.org/#variant=NM_001005484.2%3Ac.68T%3EG&hg=38&distance=4999&mask=0&ra=0

From the SpliceAI script:

CHROM POS ID REF ALT QUAL FILTER INFO

1 69095 NM_001005484.2_c.68T>G T G . PASS SpliceAI=G|RefSeqTx-NM_001005484.2|0.00|0.00|0.00|0.02|-48|-58|-59|-2

From spliceai_scores.raw.snv.hg38.vcf.gz, and consequently from VEP:

1 69095 . T G . . SpliceAI=G|OR4F5|0.00|0.05|0.01|0.40|47|38|20|-2
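To see exactly which fields disagree between the two annotations above, a minimal Python sketch follows. It assumes the field order documented in the SpliceAI README (ALLELE|SYMBOL|DS_AG|DS_AL|DS_DG|DS_DL|DP_AG|DP_AL|DP_DG|DP_DL) and a single annotation per record. The helper names `parse_spliceai` and `diff_scores` are hypothetical and are not part of SpliceAI or VEP.

```python
# Field order as documented in the SpliceAI README (assumed here).
FIELDS = ["ALLELE", "SYMBOL", "DS_AG", "DS_AL", "DS_DG", "DS_DL",
          "DP_AG", "DP_AL", "DP_DG", "DP_DL"]


def parse_spliceai(info):
    """Parse one 'SpliceAI=...' INFO value into a dict (hypothetical helper).

    Assumes exactly one annotation; comma-separated multi-gene
    annotations are not handled in this sketch.
    """
    value = info.split("=", 1)[1]
    record = dict(zip(FIELDS, value.split("|")))
    for key in FIELDS[2:6]:   # delta scores
        record[key] = float(record[key])
    for key in FIELDS[6:]:    # delta positions
        record[key] = int(record[key])
    return record


def diff_scores(a, b):
    """Return {field: (a_value, b_value)} for differing fields, ignoring SYMBOL."""
    return {k: (a[k], b[k]) for k in FIELDS if k != "SYMBOL" and a[k] != b[k]}


# The two annotations quoted in this comment.
script = parse_spliceai(
    "SpliceAI=G|RefSeqTx-NM_001005484.2|0.00|0.00|0.00|0.02|-48|-58|-59|-2")
precomputed = parse_spliceai(
    "SpliceAI=G|OR4F5|0.00|0.05|0.01|0.40|47|38|20|-2")

differences = diff_scores(script, precomputed)
for field, (s, p) in differences.items():
    print(f"{field}: script={s} precomputed={p}")
```

For this variant the sketch reports differences in DS_AL, DS_DG, DS_DL, DP_AG, DP_AL and DP_DG, while DS_AG and DP_DL agree. Note also that the two annotations are labelled with different identifiers (a RefSeq transcript versus the gene symbol OR4F5).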

dglemos commented 4 months ago

Thanks for sending the example.

Checking the SpliceAI file downloaded from Illumina Basespace in October 2019, these are the scores for your variant:

> tabix spliceai_scores.raw.snv.hg38.vcf.gz 1:69095-69095
1   69095   .   T   A   .   .   SpliceAI=A|OR4F5|0.00|0.04|0.02|0.40|47|38|20|-2
1   69095   .   T   C   .   .   SpliceAI=C|OR4F5|0.00|0.04|0.02|0.40|47|38|20|-2
1   69095   .   T   G   .   .   SpliceAI=G|OR4F5|0.00|0.05|0.01|0.40|47|38|20|-2

The VEP plugin only attaches the scores from this file to the VEP output; it does not compute them. You should contact the team responsible for https://spliceailookup.broadinstitute.org/ to find out whether their website uses the same scores as the file spliceai_scores.raw.snv.hg38.vcf.gz.
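As a rough illustration of what looking up a precomputed score involves, the Python sketch below picks the record for one variant out of the tabix output shown above by matching chromosome, position, REF and ALT. It is an assumption-laden sketch, not the plugin's actual implementation, and `find_record` is a hypothetical helper name.

```python
def find_record(lines, chrom, pos, ref, alt):
    """Return the INFO column of the first VCF line matching the variant, or None.

    Hypothetical helper: matches on CHROM, POS, REF and ALT only.
    """
    for line in lines:
        cols = line.split()
        if len(cols) < 8:
            continue  # skip blank or malformed lines
        if (cols[0], int(cols[1]), cols[3], cols[4]) == (chrom, pos, ref, alt):
            return cols[7]
    return None


# Output of the tabix query above, copied verbatim.
tabix_output = """\
1   69095   .   T   A   .   .   SpliceAI=A|OR4F5|0.00|0.04|0.02|0.40|47|38|20|-2
1   69095   .   T   C   .   .   SpliceAI=C|OR4F5|0.00|0.04|0.02|0.40|47|38|20|-2
1   69095   .   T   G   .   .   SpliceAI=G|OR4F5|0.00|0.05|0.01|0.40|47|38|20|-2
""".splitlines()

info = find_record(tabix_output, "1", 69095, "T", "G")
print(info)
```

Whatever this lookup returns is taken as-is from the file, so any disagreement with another source reflects how that file was generated rather than anything done at annotation time.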

jml96 commented 4 months ago

Thank you for the suggestion.

dglemos commented 2 months ago

I'm going to close this issue. Please feel free to re-open it if you have further questions.

Best wishes, Diana