biothings / mygene.info

MyGene.info: A BioThings API for gene annotations
http://mygene.info

problems querying with POST (all ids with "notfound": true) #34

Closed by emepyc 3 years ago

emepyc commented 6 years ago

I'm trying to get the symbols for a batch of UniProt IDs, but I'm getting "notfound": true on all POST queries (using UniProt or any other IDs).

For example, for me this works:

curl 'http://mygene.info/v3/query?q=TP53'
{
  "max_score": 448.4826,
  "took": 120,
  "total": 2973,
  "hits": [
    {
      "_id": "7157",
      "_score": 448.4826,
      "entrezgene": 7157,
      "name": "tumor protein p53",
      "symbol": "TP53",
      "taxid": 9606
    },
   ...

but this doesn't:

curl -XPOST -d 'q=TP53' -H "Content-Type: application/x-www-form-urlencoded" 'http://mygene.info/v3/query'
[
  {
    "query": "TP53",
    "notfound": true
  }
]

What would be the correct way of making these POST queries?

emepyc commented 6 years ago

Ok, explicitly including the scopes works...

curl -XPOST -d 'q=P53&scopes=all' -H "Content-Type: application/x-www-form-urlencoded" 'http://mygene.info/v3/query'

But that is not obvious from the documentation IMHO: http://mygene.info/tryapi/

And the docs here say that scopes are optional (at least for the Python client): http://docs.mygene.info/en/latest/doc/query_service.html#scopes

namespacestd0 commented 3 years ago

By default, POST queries on mygene.info use scopes = [ "_id", "entrezgene", "ensembl.gene", "retired" ], which explains your findings: a gene symbol such as TP53 is not in any of those fields, so it comes back as "notfound". Explicit scopes are intended to give the most relevant results for programmatic retrieval, especially when the user is looking up by a specific type of ID. GET requests, by contrast, are somewhat tailored towards data exploration, so by default they search a wider range of IDs and support complex query string parsing.
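
For anyone scripting this without the Python client, here is a minimal sketch of the same POST with explicit scopes, using only the standard library. The helper names (build_batch_query, run_batch_query) are made up for illustration, and the "symbol" and "uniprot" scope names plus the "fields" parameter are assumptions taken from the query service docs linked above, so check them against the current documentation before relying on this.

```python
import json
from urllib.parse import urlencode
from urllib.request import Request, urlopen

QUERY_URL = "http://mygene.info/v3/query"  # endpoint used in the curl examples above


def build_batch_query(terms, scopes, fields=("symbol",)):
    """Build (but do not send) a form-encoded POST request for a batch lookup.

    Hypothetical helper: terms, scopes and fields are each joined with commas,
    which is how the batch query parameters are assumed to be passed.
    """
    body = urlencode({
        "q": ",".join(terms),
        "scopes": ",".join(scopes),  # explicit scopes instead of the POST defaults
        "fields": ",".join(fields),
    }).encode("ascii")
    return Request(
        QUERY_URL,
        data=body,
        headers={"Content-Type": "application/x-www-form-urlencoded"},
        method="POST",
    )


def run_batch_query(request):
    """Send the request and decode the JSON response (needs network access)."""
    with urlopen(request) as response:
        return json.load(response)


# Build the request for the symbol lookup from the original question.
req = build_batch_query(["TP53"], scopes=["symbol"])
print(req.get_method(), req.full_url)
print(req.data.decode("ascii"))
```

Building the request separately from sending it makes it easy to inspect the encoded body first. For the original UniProt use case, something like run_batch_query(build_batch_query(["P04637"], scopes=["uniprot"])) should then return one entry per query term, assuming "uniprot" is accepted as a scope.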