opentargets / issues

Issue tracker for Open Targets Platform and Open Targets Genetics Portal
https://platform.opentargets.org https://genetics.opentargets.org
Apache License 2.0

Ontology review: amyotrophic lateral sclerosis ontology mapping from data providers #118

Closed ElaineMcA closed 5 years ago

ElaineMcA commented 6 years ago

@gkos-bio commented on Wed Feb 08 2017

None of the C9orf72 ClinVar reports (https://www.ncbi.nlm.nih.gov/clinvar/?term=C9orf72%5Bgene%5D) show up in our data.

On our platform, https://www.targetvalidation.org/disease/Orphanet_803 gives an empty page, even though we have evidence data for Orphanet_803 (in the 16.12 and 17.02 releases).

In Zooma, ALS was given the EFO accession EFO_0000253, even though there is also an Orphanet ID for it.

In our JSON strings, we have evidence associated with Orphanet_803, but it does not appear on our platform, whereas evidence mapped to EFO_0000253 does. Looking at the graph below, should Orphanet_803 and EFO_0000253 be replaced by their parent term?

http://www.ebi.ac.uk/ols/ontologies/efo/terms/graph?iri=http://www.ebi.ac.uk/efo/EFO_0001357

Suggestion: add a cross-reference (x-ref) to the ORDO IRI in the future (just for bioinformatics convenience) and ask EVA to map using the two terms in EFO.


@gkos-bio commented on Mon Feb 13 2017

Most of the variants in the C9orf72 gene are not pathogenic, so they are not integrated in our system. Only two of the short mutations in ClinVar (rather than large deletions) are annotated as pathogenic: https://www.ncbi.nlm.nih.gov/clinvar/variation/183034/ and https://www.ncbi.nlm.nih.gov/clinvar/variation/31151/. We should be receiving these two.


@ckongEbi commented on Tue Feb 14 2017

@gkos-bio added/moved data-related issues to "data-providers-docs" repo https://github.com/opentargets/data-providers-docs/issues/10


@ckongEbi commented on Tue Jul 10 2018

@gkos-bio have we highlighted this to SPOT, or does it need following up?


@ElaineMcA commented on Tue Sep 11 2018

Need to follow up with EVA.

afaulconbridge commented 6 years ago

Is this an accurate summary of the current status of this issue?

We expect to see https://www.ncbi.nlm.nih.gov/clinvar/variation/183034/ and https://www.ncbi.nlm.nih.gov/clinvar/variation/31151/ at https://www.targetvalidation.org/evidence/ENSG00000147894/EFO_0000253 but they aren't present. The action is to discuss with the EVA curators to work out where they are being dropped in the pipelines between ClinVar and Open Targets.

ElaineMcA commented 6 years ago

Feedback received from Cristina @ EVA on where this term is being lost from the pipeline: I have searched all the files from submissions during the past 2 years and haven't found those records in any of them.

The only good result returned by OLS (one of the tools we use for automated mapping) belongs to OMIM, not to EFO/ORDO/HPO. See https://www.ebi.ac.uk/ols/search?q=Amyotrophic+lateral+sclerosis+and%2For+frontotemporal+dementia+1

That is most probably the stage at which the term was filtered out.
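
To make the suspected filtering step concrete, here is a minimal sketch. It is not EVA's actual pipeline code: the function names, the list-of-IDs input, and the exact set of accepted prefixes are assumptions based only on the description above (hits are kept only if they come from EFO, ORDO or HPO).

```python
# Hypothetical sketch of a "keep only accepted ontologies" filter.
# ALLOWED_PREFIXES is an assumption derived from the comment above
# (EFO / ORDO / HPO); it is not taken from any real pipeline config.
ALLOWED_PREFIXES = ("EFO", "Orphanet", "HP")


def prefix_of(term_id):
    """Return the ontology prefix of an ID such as 'OMIM:105550' or 'EFO_0000253'."""
    for sep in (":", "_"):
        if sep in term_id:
            return term_id.split(sep, 1)[0]
    return term_id


def keep_allowed(hits, allowed=ALLOWED_PREFIXES):
    """Drop every search hit whose ontology is not in the accepted set."""
    return [hit for hit in hits if prefix_of(hit) in allowed]


# If the only good hit for a trait name is an OMIM term, nothing survives
# the filter and the record is silently left unmapped.
print(keep_allowed(["OMIM:105550"]))
print(keep_allowed(["OMIM:105550", "EFO_0000253", "Orphanet_803"]))
```

With a filter of this shape, a trait whose only match is an OMIM term ends up with an empty result list, which would be consistent with the records never reaching the submission files.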

iandunham commented 6 years ago

Does this answer explain the problem? I'm not sure. It may require an in-depth look.

afaulconbridge commented 6 years ago

It looks like ClinVar uses OMIM terms. In this case OMIM:105550 can be mapped to ORDO, e.g. via OxO (https://www.ebi.ac.uk/spot/oxo/terms/OMIM:105550), but not by direct text matching.

It does also exist in Mondo (https://www.ebi.ac.uk/ols/ontologies/MONDO/terms?iri=http://purl.obolibrary.org/obo/MONDO_0007105) so maybe this is something to revisit once we're using EFO3 fully. It is probably good to revisit the mapping pipelines the data providers are running at that point too, to check they are consistent and up-to-date.
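
As a rough illustration of the difference between text matching and cross-reference mapping, the sketch below tries an exact label match first and falls back to an x-ref table keyed by source ID. Everything here is hypothetical: `map_term`, the two dictionaries, and in particular the target value `ORDO:PLACEHOLDER`, which stands in for whatever OxO actually returns for OMIM:105550. It does not call OxO or OLS.

```python
def map_term(label, source_id, label_index, xrefs):
    """Map a trait to a target ontology ID.

    First try an exact (case-insensitive) label match; if that fails,
    fall back to a cross-reference lookup on the source ID.
    Returns a (target_id, method) tuple.
    """
    target = label_index.get(label.strip().lower())
    if target is not None:
        return target, "label"
    target = xrefs.get(source_id)
    if target is not None:
        return target, "xref"
    return None, "unmapped"


# Illustrative inputs only. The label index holds the one mapping
# mentioned in this thread; the x-ref target is a placeholder, not a
# real ORDO identifier.
label_index = {"amyotrophic lateral sclerosis": "EFO_0000253"}
xrefs = {"OMIM:105550": "ORDO:PLACEHOLDER"}

trait = "Amyotrophic lateral sclerosis and/or frontotemporal dementia 1"
print(map_term(trait, "OMIM:105550", label_index, xrefs))
print(map_term(trait, "OMIM:105550", label_index, {}))
```

The second call shows what happens without the x-ref table: the ClinVar trait name does not match the ontology label exactly, so the record stays unmapped.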

AsierGonzalez commented 5 years ago

This issue is actually a combination of two separate problems, missing variants on the one hand and ontology mapping on the other:

AsierGonzalez commented 5 years ago

Closing this ticket as there is nothing to be done for now. If those two variants are considered by the pipeline in the future we will need to monitor the disease term they are annotated with.