monarch-initiative / biolink-api

API for linked biological knowledge
https://api.monarchinitiative.org/api/
BSD 3-Clause "New" or "Revised" License

Cache gene sets and allow for set operations #336

Open kshefchek opened 4 years ago

kshefchek commented 4 years ago

We are often asked to generate gene sets based on filters, which amounts to generating gene sets and performing set operations on them. We could, in theory, add this functionality to biolink by caching the sets and then passing the operations as a REST param. As an example:

We generate and cache the following sets:
A. All human protein coding genes
B. Human protein coding genes with mouse ortholog
C. Genes with ortholog-phenotypes in any species
D. Genes with phenotypes in human
E. Genes with ortholog-phenotypes in mouse

An example query: get the list of genes that have a mouse ortholog and no phenotypes in human or mouse, but have a phenotype in some other species

This would be: A & B & C - D - E
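To make the idea concrete, here is a minimal Python sketch of how such an expression could be evaluated against cached sets. Everything in it is an assumption for illustration: the `CACHED_SETS` dict, the `evaluate` helper, the whitespace-separated expression format, the left-to-right evaluation order, and the placeholder gene IDs are all hypothetical and not part of biolink.

```python
# Hypothetical cached gene sets keyed by the labels used above.
# The gene identifiers are placeholders, not real query results.
CACHED_SETS = {
    "A": frozenset({"g1", "g2", "g3", "g4", "g5"}),  # all human protein coding genes
    "B": frozenset({"g1", "g2", "g3", "g4"}),        # ... with mouse ortholog
    "C": frozenset({"g1", "g2", "g3"}),              # ortholog-phenotypes in any species
    "D": frozenset({"g2"}),                          # phenotypes in human
    "E": frozenset({"g3"}),                          # ortholog-phenotypes in mouse
}


def evaluate(expression, sets=CACHED_SETS):
    """Evaluate a flat, whitespace-separated expression left to right.

    Supported operators (an assumption of this sketch):
    & (intersection), | (union), - (difference).
    """
    tokens = expression.split()
    # A valid expression alternates set names and operators: name (op name)*
    if len(tokens) % 2 == 0:
        raise ValueError(f"malformed expression: {expression!r}")

    result = set(sets[tokens[0]])
    for op, name in zip(tokens[1::2], tokens[2::2]):
        operand = sets[name]
        if op == "&":
            result &= operand
        elif op == "|":
            result |= operand
        elif op == "-":
            result -= operand
        else:
            raise ValueError(f"unknown operator: {op!r}")
    return result


genes = evaluate("A & B & C - D - E")
```

With the placeholder data above, `genes` contains only `g1`. A real endpoint would also need to decide on operator precedence and grouping (for example, parentheses), which this sketch sidesteps by evaluating strictly left to right.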