biothings / mygene.info

MyGene.info: A BioThings API for gene annotations
http://mygene.info

Restrict query to protein-coding genes #24

Closed dhimmel closed 6 years ago

dhimmel commented 6 years ago

We'd like a way to restrict our query to Entrez genes where type_of_gene="protein-coding". Essentially, we want to query for human protein-coding Entrez genes. The closest we could get is:

https://mygene.info/v3/query?q=TP53&fields=symbol%5E2%2Calias%2Ctype_of_gene&species=human&size=2&facets=type_of_gene&entrezonly=true

which returned:

{
  "facets": {
    "type_of_gene": {
      "terms": [
        {
          "term": "protein-coding",
          "count": 29
        },
        {
          "term": "pseudo",
          "count": 6
        },
        {
          "term": "ncRNA",
          "count": 1
        }
      ],
      "_type": "terms",
      "total": 36,
      "missing": 0,
      "other": 0
    }
  },
  "max_score": 459.45786,
  "took": 23,
  "total": 43,
  "hits": [
    {
      "_id": "7157",
      "_score": 459.45786,
      "alias": [
        "BCC7",
        "LFS1",
        "P53",
        "TRP53"
      ],
      "type_of_gene": "protein-coding"
    },
    {
      "_id": "653550",
      "_score": 23.245737,
      "alias": [
        "TP53TG3",
        "TP53TG3E",
        "TP53TG3F"
      ],
      "type_of_gene": "protein-coding"
    }
  ]
}

So we could filter by type_of_gene after receiving the response. But is there a way to filter at query time?
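
For reference, a minimal sketch of that post-response filtering in Python. The helper name filter_hits_by_type is hypothetical, and the JSON is a trimmed copy of the two hits shown above rather than a live API call.

```python
import json

# Trimmed copy of the response above: only the fields this sketch needs.
response_text = """
{
  "hits": [
    {"_id": "7157", "type_of_gene": "protein-coding"},
    {"_id": "653550", "type_of_gene": "protein-coding"}
  ]
}
"""


def filter_hits_by_type(response, gene_type="protein-coding"):
    """Hypothetical helper: keep hits whose type_of_gene equals gene_type."""
    return [
        hit
        for hit in response.get("hits", [])
        if hit.get("type_of_gene") == gene_type
    ]


response = json.loads(response_text)
protein_coding_hits = filter_hits_by_type(response)
print([hit["_id"] for hit in protein_coding_hits])  # ['7157', '653550']
```

The drawback is that size applies before this filter runs, so a page of results can shrink after filtering.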

newgene commented 6 years ago

@dhimmel yes, you can actually make boolean queries:

https://mygene.info/v3/query?q=TP53%20AND%20type_of_gene:protein-coding&fields=symbol%5E2%2Calias%2Ctype_of_gene&species=human&size=2&facets=type_of_gene&entrezonly=true

That is, with the URL encoding removed: q=TP53 AND type_of_gene:protein-coding

Then you don't need facets anymore.
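
A rough sketch of building such a URL in Python, using only the standard library. The function name build_query_url and its defaults are assumptions for illustration; the parameters are the ones from the URL above, minus facets. Note that urlencode writes the colon as %3A, which decodes to the same query string.

```python
from urllib.parse import quote, urlencode

BASE_URL = "https://mygene.info/v3/query"


def build_query_url(term, gene_type="protein-coding", species="human", size=2):
    """Hypothetical helper: AND the search term with a type_of_gene filter."""
    params = {
        # Boolean query: both clauses must match.
        "q": f"{term} AND type_of_gene:{gene_type}",
        "fields": "symbol^2,alias,type_of_gene",
        "species": species,
        "size": size,
        "entrezonly": "true",
    }
    # quote_via=quote encodes spaces as %20 rather than "+".
    return f"{BASE_URL}?{urlencode(params, quote_via=quote)}"


url = build_query_url("TP53")
print(url)
```

The resulting URL can be passed to any HTTP client; the request itself is left out so the sketch runs without network access.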

dhimmel commented 6 years ago

Thanks @newgene. Also confirming that order doesn't appear to matter, e.g. the following works:

https://mygene.info/v3/query?q=type_of_gene:protein-coding%20AND%20TP53&fields=symbol%5E2%2Calias%2Ctype_of_gene&species=human&size=2&facets=type_of_gene&entrezonly=true

newgene commented 6 years ago

Yes, the order does not matter, just like regular boolean syntax.