biothings / mygene.info

MyGene.info: A BioThings API for gene annotations
http://mygene.info

POST searching complex fields with uppercase field names #35

Closed greg-k-taylor closed 6 years ago

greg-k-taylor commented 6 years ago

Consider the following biothings_client code snippet:

import biothings_client

client = biothings_client.get_client('gene')
qr = client.querymany(['P24941'], scopes='uniprot.Swiss-Prot', fields='uniprot.Swiss-Prot', as_generator=True, returnall=True)
print(qr)

No hits are returned: {'out': [{'query': 'P24941', 'notfound': True}], 'dup': [], 'missing': ['P24941']}

However, hits are returned when using the biothings_client query call instead.

There appears to be a problem with using scopes on complex (nested) fields whose names contain uppercase letters.

greg-k-taylor commented 6 years ago

The code below is a better representation of the problem. A hit is returned by the biothings_client query function but not by the biothings_client querymany function.

import biothings_client

client = biothings_client.get_client('gene')
qr = client.querymany(['P19012'])
print(qr)

1 input query terms found no hit: ['P19012']

qr = client.query('P19012')
print(qr)

{'max_score': 13.943995, 'took': 71, 'total': 1, 'hits': [{'_id': '3866', '_score': 13.943995, 'entrezgene': 3866, 'name': 'keratin 15', 'symbol': 'KRT15', 'taxid': 9606}]}

cyrus0824 commented 6 years ago

For the second post: when no 'scopes' is given, querymany searches only against the _id field (for mygene, mostly either the Entrez gene ID or the Ensembl gene ID), so I wouldn't expect any results for an ID like P19012.
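To make the role of 'scopes' concrete, here is a minimal sketch of how a batch (POST) query body might be assembled with and without it. The helper name build_batch_query is hypothetical, and the parameter names q, scopes and fields are an assumption based on the keyword arguments used in this thread; this is not the client's actual implementation.

```python
from urllib.parse import urlencode


def build_batch_query(ids, scopes=None, fields=None):
    """Hypothetical helper: form-encode a batch query body.

    Assumes the service accepts 'q' (comma-separated IDs) plus
    optional 'scopes' and 'fields' parameters.
    """
    payload = {"q": ",".join(ids)}
    if scopes:
        # Without this key the server falls back to its default scope.
        payload["scopes"] = scopes
    if fields:
        payload["fields"] = fields
    return urlencode(payload)


# No scopes: the ID is matched only against the default field(s).
body_without_scopes = build_batch_query(["P19012"])

# Explicit scopes: the ID is matched against the named nested field.
body_with_scopes = build_batch_query(
    ["P19012"], scopes="uniprot.Swiss-Prot", fields="symbol,name"
)

print(body_without_scopes)
print(body_with_scopes)
```

Passing scopes explicitly is what tells the batch query to look up the ID in a field other than the default.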

For the first post, there was a bug in our handling of the 'scopes' field for certain nested fields, like uniprot.Swiss-Prot (and some others). A recent server fix was applied and the code now works as it should:

In [1]: from biothings_client import get_client

In [2]: mg = get_client("gene")

In [3]: mg.querymany(["P24941"], scopes='uniprot.Swiss-Prot', fields='uniprot.Swiss-Prot')
querying 1-1...done.
Finished.
Out[3]:
[{'_id': '1017',
  '_score': 15.062099,
  'query': 'P24941',
  'uniprot': {'Swiss-Prot': 'P24941'}}]
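When checking a batch of IDs it can help to separate hits from misses. The sketch below assumes the list-of-dicts result shape shown in this thread, where a miss carries 'notfound': True. The helper name split_results is hypothetical, and the sample list is assembled by hand from records quoted above rather than taken from a live response.

```python
def split_results(results):
    """Hypothetical helper: split querymany-style records into hits and misses.

    Assumes each record is a dict with a 'query' key, and that a miss
    is marked with 'notfound': True (the shape quoted in this thread).
    """
    hits = [r for r in results if not r.get("notfound")]
    missing = [r["query"] for r in results if r.get("notfound")]
    return hits, missing


# Hand-assembled sample in the shape shown above; not a live response.
sample = [
    {
        "_id": "1017",
        "_score": 15.062099,
        "query": "P24941",
        "uniprot": {"Swiss-Prot": "P24941"},
    },
    {"query": "P19012", "notfound": True},
]

hits, missing = split_results(sample)
print("hits:", [h["_id"] for h in hits])
print("missing:", missing)
```

With real data, the same function could be applied to the list returned by mg.querymany(...) to flag IDs that need a different scope.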