biothings / mydisease.info


improve handling of colons in query search terms #6

Closed andrewsu closed 3 years ago

andrewsu commented 4 years ago

This record http://mydisease.info/v1/disease/MONDO:0011613 links to DOID:0060369 through mondo.xrefs.doid. However, the query service requires escaping the colon:

mygene.info somehow manages to handle the colons better for GO annotations:

In an ideal world, each pair of queries would be parsed the same. However, I recognize this gets a little odd with fielded queries:

newgene commented 4 years ago

@andrewsu In general, the : in a query term needs to be escaped. Otherwise it is interpreted as a fielded query (field:value) and may cause an error if the result does not parse correctly.
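For illustration, a client can do the escaping itself before sending the query. The following is only a minimal sketch: `escape_colons` and `build_query_url` are hypothetical helper names, and the `/v1/query` endpoint is assumed by analogy with the mygene.info `/v3/query` URL used elsewhere in this thread.

```python
import re
from typing import Optional
from urllib.parse import urlencode

# Assumed endpoint, by analogy with the mygene.info /v3/query URL in this thread.
QUERY_URL = "http://mydisease.info/v1/query"


def escape_colons(term: str) -> str:
    """Backslash-escape every colon that is not already escaped."""
    return re.sub(r"(?<!\\):", r"\\:", term)


def build_query_url(term: str, field: Optional[str] = None) -> str:
    """Build a query URL; only the colon separating field and value stays unescaped."""
    escaped = escape_colons(term)
    q = f"{field}:{escaped}" if field else escaped
    # urlencode percent-encodes the backslash and the colons for transport.
    return f"{QUERY_URL}?{urlencode({'q': q})}"


if __name__ == "__main__":
    print(build_query_url("DOID:0060369"))
    print(build_query_url("DOID:0060369", field="mondo.xrefs.doid"))
```

Keeping the field name as a separate argument sidesteps the ambiguity with fielded queries: the helper never has to guess which colon separates the field from the value.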

In the MyGene.info case, ?q=GO:0000082 works because the GO: prefix is specifically handled in the query term (the escaping is added behind the scenes). That said, the 500 error on your second MyGene.info link needs to be investigated.

So basically we will need to handle the individual prefixes (DOID, MONDO, etc.) specifically.

andrewsu commented 4 years ago

Makes sense. I wonder whether that feature is generic enough to become part of the SDK. Perhaps a config file could list the CURIE prefixes that should be escaped in search queries?
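A rough sketch of what such a config-driven approach might look like is below. This is not the biothings SDK API: `CURIE_PREFIXES`, `build_prefix_pattern`, and `escape_curies` are hypothetical names chosen for illustration, and matching prefixes case-sensitively is an assumption of the sketch.

```python
import re

# Hypothetical config value: CURIE prefixes whose colon should be escaped.
CURIE_PREFIXES = ["DOID", "MONDO", "GO"]


def build_prefix_pattern(prefixes):
    """Compile a regex that matches any listed prefix followed by a colon."""
    alternation = "|".join(re.escape(p) for p in prefixes)
    # Case-sensitive on purpose (an assumption of this sketch), so a lowercase
    # field name such as "mondo.xrefs.doid:" is not mistaken for a CURIE prefix.
    return re.compile(rf"\b({alternation}):")


PREFIX_PATTERN = build_prefix_pattern(CURIE_PREFIXES)


def escape_curies(query: str) -> str:
    """Insert a backslash before the colon of each configured CURIE prefix."""
    return PREFIX_PATTERN.sub(r"\1\\:", query)


if __name__ == "__main__":
    for q in ["DOID:0060369", "mondo.xrefs.doid:DOID:0060369", "name:diabetes"]:
        print(q, "->", escape_curies(q))
```

With this approach, both members of each query pair (bare and fielded) end up with the same escaped CURIE, while ordinary fielded queries such as name:diabetes pass through unchanged. A term that is already escaped is left alone, because the backslash sits between the prefix and the colon.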

namespacestd0 commented 3 years ago

@newgene http://mygene.info/v3/query?q=GO:%5C0000082 no longer throws a 500 error.

@andrewsu It is a generic feature in biothings.web. It is processed here: https://github.com/biothings/biothings.api/blob/f4cd5880033de12e57fe6fa09ac5388802888508/biothings/web/options/manager.py#L114. For an example, see the translations settings in mygene: https://github.com/biothings/mygene.info/blob/e94ea4597319c259a0c31e799be267d560d79f22/src/config_web.py#L143