EBISPOT / goci

GWAS Catalog Ontology and Curation Infrastructure
Apache License 2.0

Mapping pipeline failure #1329

Closed ljwh2 closed 2 weeks ago

ljwh2 commented 1 month ago

Very few associations were mapped between 14 and 16 May. In the most recent data release, only 14 of ~5000 rsIDs have mappings for publications published 14-16 May (9 publications affected). Since then, associations appear to have been mapped correctly. A publication published a few days earlier (10 May, PMID 38671320) also has many blank mappings: 74 of its 193 valid SNPs are missing mappings. From the emails I can see that a new Ensembl version was released around 13/14 May and that there was an Ensembl REST API error on 13 May, but neither seems to account for the earlier publication.

@sajo to rerun the mapping pipeline for affected SNPs and implement additional logging to capture this.

Attached is a list of ~4K SNPs without mappings. I have checked that all of them are valid and have a genomic location in Ensembl, so if any return errors, please re-run them.
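A minimal sketch of the "re-run any that return errors" step, for illustration only. It is not the goci pipeline code: `remap_with_retries` and `fetch_mapping` are hypothetical names, and `fetch_mapping` stands in for whatever Ensembl lookup the pipeline actually performs. The idea is to retry only the rsIDs that failed and to report what is still unmapped at the end.

```python
from typing import Callable, Dict, Iterable, List, Optional, Tuple


def remap_with_retries(
    rs_ids: Iterable[str],
    fetch_mapping: Callable[[str], Optional[dict]],  # hypothetical lookup function
    max_attempts: int = 3,
) -> Tuple[Dict[str, dict], List[str]]:
    """Try each rsID up to max_attempts times; return (mapped, still_missing)."""
    mapped: Dict[str, dict] = {}
    pending = list(dict.fromkeys(rs_ids))  # de-duplicate while keeping order
    for _ in range(max_attempts):
        still_failing = []
        for rs_id in pending:
            try:
                result = fetch_mapping(rs_id)
            except Exception:
                result = None  # treat any error as "try again on the next pass"
            if result:
                mapped[rs_id] = result
            else:
                still_failing.append(rs_id)
        pending = still_failing
        if not pending:
            break
    return mapped, pending
```

The second return value is the list of rsIDs that never produced a mapping, which is the set worth reporting back on the ticket.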

Valid_SNPs_23_05_24.txt

sajo-ebi commented 1 month ago

@ljwh2 the mapping has finished for the rsIDs in the list. I couldn't find any mapping errors, but I have added logic to retain the logging information for at least 60 days, which is the limit for retaining logs on the Codon cluster. In future, if an rsID mapping is missed, we can look at the logs to determine the root cause for that rsID.
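For illustration, one way a 60-day log retention window could look, sketched with Python's standard `logging` module. This is an assumption-laden example, not the change made in goci: the logger name, file path, and `build_mapping_logger` helper are hypothetical.

```python
import logging
from logging.handlers import TimedRotatingFileHandler


def build_mapping_logger(log_path: str, retention_days: int = 60) -> logging.Logger:
    """Return a logger that rotates daily and keeps retention_days old files."""
    logger = logging.getLogger("mapping_pipeline")  # hypothetical logger name
    logger.setLevel(logging.INFO)
    if not logger.handlers:  # avoid attaching duplicate handlers on repeat calls
        handler = TimedRotatingFileHandler(
            log_path,
            when="midnight",             # start a new file each day
            backupCount=retention_days,  # older files are deleted on rollover
            encoding="utf-8",
        )
        handler.setFormatter(
            logging.Formatter("%(asctime)s %(levelname)s %(message)s")
        )
        logger.addHandler(handler)
    return logger
```

Logging the rsID alongside each failed lookup, for example `logger.warning("no mapping returned for %s", rs_id)`, is what would make the retained logs searchable later.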

ljwh2 commented 1 month ago

@ljwh2 to verify after the data release (DR).

sajo-ebi commented 3 weeks ago

@sajo-ebi to create a ticket to store the mapped variant responses in a history table and to fetch responses from that table instead of making an Ensembl call.
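A rough sketch of the history-table idea, using an in-memory SQLite database for illustration. The table name `mapping_history`, its columns, and the `fetch_from_ensembl` callable are all assumptions; the real schema and lookup would be defined in the follow-up ticket.

```python
import json
import sqlite3
from typing import Callable, Optional


def init_history(conn: sqlite3.Connection) -> None:
    """Create the (hypothetical) history table if it does not exist yet."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS mapping_history ("
        "rs_id TEXT PRIMARY KEY, "
        "response TEXT NOT NULL, "
        "fetched_at TEXT DEFAULT CURRENT_TIMESTAMP)"
    )


def get_mapping(
    conn: sqlite3.Connection,
    rs_id: str,
    fetch_from_ensembl: Callable[[str], Optional[dict]],  # hypothetical remote call
) -> Optional[dict]:
    """Return the stored response if present; otherwise fetch and store it."""
    row = conn.execute(
        "SELECT response FROM mapping_history WHERE rs_id = ?", (rs_id,)
    ).fetchone()
    if row is not None:
        return json.loads(row[0])
    response = fetch_from_ensembl(rs_id)
    if response:  # store only successful lookups so failures are retried later
        conn.execute(
            "INSERT INTO mapping_history (rs_id, response) VALUES (?, ?)",
            (rs_id, json.dumps(response)),
        )
        conn.commit()
    return response
```

Skipping the insert for empty responses is a deliberate choice in this sketch: a blank result caused by a transient API error would otherwise be cached and keep the rsID unmapped.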

sajo-ebi commented 2 weeks ago

https://github.com/EBISPOT/goci/issues/1348 Ticket for Dev changes