monarch-initiative / monarch-legacy

Monarch web application and API
BSD 3-Clause "New" or "Revised" License

Standard behavior for unknown identifiers (incl. orphanet) #622

Open cmungall opened 9 years ago

cmungall commented 9 years ago

Occasionally a user will end up on a page for a disease, phenotype, etc. that our SciGraph instance knows nothing about, for a variety of reasons. For example, we don't currently have Orphanet loaded in the SciGraph (SG) instance.

In these cases, the page should not return a 500 error. We should have a generic "ID not known" page. We can also provide a link to the relevant external page with verbiage like "try searching on URL". This would use a generic ID->URL expansion mechanism.
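
A minimal sketch of what such an ID->URL expansion plus an "ID not known" response might look like. Everything here is an assumption for illustration: the `URL_TEMPLATES` table, the placeholder `example.org` URLs, the function names, and the choice of a 404 status are hypothetical, not part of the Monarch code base.

```python
from typing import Optional, Tuple

# Hypothetical prefix -> URL template table. The URLs are placeholders,
# not the real addresses of the external resources.
URL_TEMPLATES = {
    "ORPHANET": "https://example.org/orphanet/{local_id}",
    "KEGG": "https://example.org/kegg/{local_id}",
}


def expand_id(curie: str) -> Optional[str]:
    """Expand a PREFIX:LOCAL identifier to an external URL.

    Returns None when the identifier has no prefix, no local part,
    or a prefix that is missing from URL_TEMPLATES.
    """
    prefix, sep, local_id = curie.partition(":")
    if not sep or not prefix or not local_id:
        return None
    template = URL_TEMPLATES.get(prefix.upper())
    if template is None:
        return None
    return template.format(local_id=local_id)


def unknown_id_response(curie: str) -> Tuple[int, str]:
    """Build a (status, message) pair for an ID the graph does not know.

    Assumes a 404 is the desired status; the point is only that the
    page degrades to a message instead of a 500.
    """
    message = f"ID not known: {curie}."
    url = expand_id(curie)
    if url is not None:
        message += f" Try searching on {url}"
    return 404, message
```

With this table, `unknown_id_response("ORPHANET:100")` yields a 404 with a "try searching on" link, while an identifier with an unrecognised prefix still gets the plain "ID not known" message.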

cc @nlwashington @kltm

kshefchek commented 9 years ago

It looks like Orphanet is in SciGraph (http://geoffrey.crbs.ucsd.edu:9000/scigraph/graph/neighbors/ORPHANET:100.json). This may have just been a bug in my null/undefined checking; this now works: http://tartini.crbs.ucsd.edu/disease/ORPHANET:100

cmungall commented 9 years ago

May only be for a subset (those mapped to DO)

On 4 Dec 2014, at 14:13, Kent Shefchek wrote:

> It looks like Orphanet is in scigraph: http://geoffrey.crbs.ucsd.edu:9000/scigraph/graph/neighbors/ORPHANET:100.json This may have just been a bug in my null/undefined checking, this now works: http://tartini.crbs.ucsd.edu/disease/ORPHANET:100



kshefchek commented 9 years ago

Do we have an example ID to test, or do we know where this originally failed?

nlwashington commented 9 years ago

Some of the ones that failed but now seem OK: http://tartini.crbs.ucsd.edu/disease/ORPHANET:85174 (although it has no data on it; since this was referenced from http://stage.monarchinitiative.org/phenotype/HP:0010669, it should have at least one phenotype)

Similarly, http://tartini.crbs.ucsd.edu/disease/ORPHANET:98880 (seems to be missing the phenotypes, but was referenced on http://stage.monarchinitiative.org/phenotype/MP:0001914)

This one still has a stack trace (the CORS one): http://tartini.crbs.ucsd.edu/variant/ZFIN:ZDB-ALT-120411-225

(see #448 for others with 500 errors)

nlwashington commented 9 years ago

Also: http://tartini.crbs.ucsd.edu/disease/KEGG:ds-H00078

KEGG IDs are definitely not in the Disease Ontology yet, and thus not yet in SciGraph.
Also, I've had to do some funkiness to the KEGG identifiers anyway, so they might need to be processed specially here. Original KEGG identifiers (the ones used on their own website) look like ds:H00078. For us to deal with them, I've reformatted them by replacing the existing colon with a dash and prefixing with "KEGG:", giving KEGG:ds-H00078. This is terrible, I know, but it's what I had to do to get them to index properly in the NIF system (we should think of a better strategy going forward). Anyway, this one resolves on tartini now.
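
The KEGG rewrite described above can be sketched as a pair of small helpers. This is an illustrative sketch, not the code used for the NIF indexing: the function names are hypothetical, and the reverse direction assumes the KEGG database code (here `ds`) never contains a dash.

```python
KEGG_PREFIX = "KEGG:"


def kegg_to_internal(kegg_id: str) -> str:
    """Turn an original KEGG ID such as 'ds:H00078' into 'KEGG:ds-H00078'."""
    db, sep, accession = kegg_id.partition(":")
    if not sep or not db or not accession:
        raise ValueError(f"not a KEGG-style identifier: {kegg_id!r}")
    # Replace the original colon with a dash, then add the KEGG prefix.
    return f"{KEGG_PREFIX}{db}-{accession}"


def internal_to_kegg(internal_id: str) -> str:
    """Undo kegg_to_internal, e.g. 'KEGG:ds-H00078' -> 'ds:H00078'.

    Assumes the database code contains no dash, so the first dash
    after the prefix is the one that replaced the colon.
    """
    if not internal_id.startswith(KEGG_PREFIX):
        raise ValueError(f"missing {KEGG_PREFIX} prefix: {internal_id!r}")
    db, sep, accession = internal_id[len(KEGG_PREFIX):].partition("-")
    if not sep or not db or not accession:
        raise ValueError(f"cannot recover KEGG identifier from {internal_id!r}")
    return f"{db}:{accession}"
```

Keeping both directions in one place would let a page link back to the KEGG site with the original form while still using the reformatted ID internally.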

kshefchek commented 9 years ago

Would we expect ORPHANET:98880 to generate any phenotype data if we can't get its equivalency with SciGraph?

cmungall commented 9 years ago

It's possible

jmcmurry commented 9 years ago

All manner of wacky ID syntax could be pasted by users.

In the elusive "fullness of time", it might be great if we could ping another service to fetch all possible known variations on a database prefix, e.g. HPO vs. HP. That should be broken out as a separate ticket that we may never get to.
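
As a rough illustration of the prefix-variation idea, here is a sketch that uses a local lookup table as a stand-in for the external service. The `PREFIX_SYNONYMS` table and the function name are hypothetical, the only mapping taken from this thread is HPO -> HP, and upper-casing every prefix is itself an assumption that may not hold for all databases.

```python
# Hypothetical stand-in for a prefix-synonym service.
# Only the HPO -> HP pair comes from the discussion above.
PREFIX_SYNONYMS = {
    "HPO": "HP",
}


def normalize_curie(raw: str) -> str:
    """Normalize a user-pasted identifier to a canonical PREFIX:LOCAL form.

    Strips surrounding whitespace, upper-cases the prefix, and maps
    known prefix variants to their canonical spelling. Input without
    a colon is returned stripped but otherwise unchanged.
    """
    text = raw.strip()
    prefix, sep, local_id = text.partition(":")
    if not sep:
        return text
    canonical = prefix.strip().upper()
    canonical = PREFIX_SYNONYMS.get(canonical, canonical)
    return f"{canonical}:{local_id.strip()}"
```

Swapping the dictionary lookup for a call to a real service would keep the same function shape, which is one reason to isolate the normalization step.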