Ensembl / VEP_plugins

Plugins for the Ensembl Variant Effect Predictor (VEP)
Apache License 2.0

GWAS plug in Assumptions related to SNPS column #684

Closed Bobsimonoff closed 3 months ago

Bobsimonoff commented 5 months ago

I think the regex in the plugin assumes the values in the SNPS column are of the form: rs[0-9]+

However, not all are. By running: cut -d $'\t' -f 22 gwas_catalog_v1.0.2-associations_e110_r2023-12-20.tsv | grep -v -E 'rs[0-9]+'

We see a wide variety of values in this column, including ones that follow these formats, to list a few: chr12:59581708, chr19:19393890:I, chr19:19393890:D, exm474728, kgp21281797, HLA-A*02:01, X:146986184:A_AAA, 7:120812727_G_C

This results in almost 40k warnings of the following form on the current version of the file: WARNING: Could not parse any rsIds from string 'chrX:66510909'
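To get a feel for which non-rsID formats dominate, the check above can be sketched in Python. This is only an illustration: the format labels and their regexes are hypothetical, derived from the example values in this report, and the rsID test mirrors the grep pattern rather than the plugin's actual regex. The zero-based column index 21 is assumed to match field 22 from the cut command.

```python
import csv
import re
from collections import Counter

# Assumption: an rsID anywhere in the value counts as parseable, mirroring
# the grep -E 'rs[0-9]+' check above (not necessarily the plugin's regex).
RSID = re.compile(r"rs[0-9]+")

# Hypothetical labels, based only on the example values in this report.
FORMATS = [
    ("position", re.compile(r"^(chr)?([0-9]{1,2}|X|Y|MT?):[0-9]+")),
    ("exm_id", re.compile(r"^exm[0-9]+")),
    ("kgp_id", re.compile(r"^kgp[0-9]+")),
    ("hla", re.compile(r"^HLA-")),
]


def classify(value):
    """Return 'rsid' if the value contains an rsID, else a format label."""
    if RSID.search(value):
        return "rsid"
    for label, pattern in FORMATS:
        if pattern.match(value):
            return label
    return "other"


def summarize(values):
    """Count how many values fall into each format label."""
    return Counter(classify(value) for value in values)


def read_column(path, index=21):
    """Read one column from a TSV file, skipping the header row."""
    with open(path, newline="", encoding="utf-8") as handle:
        reader = csv.reader(handle, delimiter="\t", quoting=csv.QUOTE_NONE)
        next(reader, None)  # skip the header row
        return [row[index] for row in reader if len(row) > index]
```

Calling summarize(read_column(path)) on the associations file would then give a per-format count, which may help decide which entries are worth supporting beyond rsIDs.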

I am not sure whether anything can be done about this on the plugin side, but in case there is, I thought I'd report it.

nakib103 commented 5 months ago

Hello @Bobsimonoff,

Thanks for your query and for providing feedback on the GWAS plugin!

Yes, currently the plugin filters out variants that do not have variant accession IDs. In the future, we can look into annotating other variant entries that do not have rs IDs but at least have a risk allele.

Nonetheless, the warnings are numerous and that does not look right. I will add a quiet option to optionally turn off the warning. In the meantime, you can just comment out the line that emits the warning.

Best regards, Nakib

Bobsimonoff commented 5 months ago

Thanks, I will watch for the update. However, if performance can't be improved dramatically, I may just do the annotation in Python using multithreading, since I can process the entire GWAS in about 1-2 hours that way.

nakib103 commented 3 months ago

Hello @Bobsimonoff,

We have added a verbose option to the plugin. With verbose=1 you will see the warning messages; otherwise they are suppressed, as the plugin can be quite noisy.
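The behaviour described here, warnings gated behind a verbose flag, can be illustrated with a small Python sketch. This is not the plugin's Perl code: the function name parse_rsids and its parameters are hypothetical, and only the warning text and the verbose=1 semantics are taken from this thread.

```python
import re
import sys

# Assumption: rsIDs are matched with the same pattern as the grep check
# earlier in this thread, not necessarily the plugin's own regex.
RSID = re.compile(r"rs[0-9]+")


def parse_rsids(snps_value, verbose=False, stream=sys.stderr):
    """Return all rsIDs in the value; warn about failures only if verbose."""
    ids = RSID.findall(snps_value)
    if not ids and verbose:
        print(
            f"WARNING: Could not parse any rsIds from string '{snps_value}'",
            file=stream,
        )
    return ids
```

With verbose left at its default, an entry such as 'chrX:66510909' is skipped silently; passing verbose=True restores the warning for debugging.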

This update will be available in the next Ensembl release, 112. I will close this issue. If you face further problems, feel free to open a new one.

Best regards, Nakib