biothings / mygene.info

MyGene.info: A BioThings API for gene annotations
http://mygene.info

Latest update breaks/alters search by Ensembl ID behavior #133

Closed jaclynbeck-sage closed 2 years ago

jaclynbeck-sage commented 2 years ago

I've been using mygene to query by Ensembl ID to get the fields "name", "symbol", "summary", and "alias". After the update on October 31, this functionality seems to be broken or seriously altered from what is expected. As an example, take these two Ensembl IDs: ENSG00000186092 (maps to Entrez ID 79501) and ENSG00000188976 (maps to Entrez ID 26155).

Last week when querying on the Ensembl IDs, I got the following information:

{
    "_id": "ENSG00000186092",
    "name": "olfactory receptor family 4 subfamily F member 5",
    "summary": "Olfactory receptors interact with odorant molecules in the nose, to initiate a neuronal response that triggers the perception of a smell. The olfactory receptor proteins are members of a large family of G-protein-coupled receptors (GPCR) arising from single coding-exon genes. Olfactory receptors share a 7-transmembrane domain structure with many neurotransmitter and hormone receptors and are responsible for the recognition and G protein-mediated transduction of odorant signals. The olfactory receptor gene family is the largest in the genome. The nomenclature assigned to the olfactory receptor genes and proteins for this organism is independent of other organisms. [provided by RefSeq, Jul 2008].",
    "symbol": "OR4F5"
}
{
    "_id": "ENSG00000188976",
    "name": "NOC2 like nucleolar associated transcriptional repressor",
    "summary": "Histone modification by histone acetyltransferases (HAT) and histone deacetylases (HDAC) can control major aspects of transcriptional regulation. NOC2L represents a novel HDAC-independent inhibitor of histone acetyltransferase (INHAT) (Hublitz et al., 2005 [PubMed 16322561]).[supplied by OMIM, Mar 2008].",
    "symbol": "NOC2L",
    "alias": [
      "NET15",
      "NET7",
      "NIR",
      "PPP1R112"
    ]
}

Today, however, querying the same two Ensembl IDs returns the following for both: {"code":404,"success":false,"error":"Not Found."}

If I query on the Entrez ID instead, I get all of the above information as expected, and the "Ensembl" field matches the expected Ensembl ID.
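For anyone wanting to repeat this cross-check, here is a minimal sketch of comparing the Ensembl ID stored on an Entrez record against the expected one. It assumes the record carries a lowercase "ensembl" field that is either a single object with a "gene" key or a list of such objects; the helper names `ensembl_gene_ids` and `maps_to` are hypothetical, and the sample records below are made up for illustration rather than copied from the API.

```python
def ensembl_gene_ids(record):
    """Collect Ensembl gene IDs from one gene record (a parsed JSON dict).

    Assumption: "ensembl" is either one object with a "gene" key or a
    list of such objects. Adjust if the real response shape differs.
    """
    ensembl = record.get("ensembl", [])
    if isinstance(ensembl, dict):
        # Normalize the single-mapping case to a list.
        ensembl = [ensembl]
    return [entry["gene"] for entry in ensembl if "gene" in entry]


def maps_to(record, expected_ensembl_id):
    """Return True if the record lists the expected Ensembl gene ID."""
    return expected_ensembl_id in ensembl_gene_ids(record)


# Illustrative records only (hypothetical shapes, not real API output).
single = {"_id": "26155", "ensembl": {"gene": "ENSG00000188976"}}
multi = {"_id": "0", "ensembl": [{"gene": "ENSG_A"}, {"gene": "ENSG_B"}]}
```

With records fetched by Entrez ID, `maps_to(record, "ENSG00000188976")` would confirm that the mapping itself is intact even while the Ensembl ID lookup returns 404.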

For a third ID that is found (ENSG00000000003), the information returned is missing the "alias" and "summary" fields, even though they were returned last week: https://mygene.info/v3/gene/ENSG00000000003

Last week the response included:

    "summary": "The protein encoded by this gene is a member of the transmembrane 4 superfamily, also known as the tetraspanin family. Most of these members are cell-surface proteins that are characterized by the presence of four hydrophobic domains. The proteins mediate signal transduction events that play a role in the regulation of cell development, activation, growth and motility. The protein encoded by this gene is a cell surface glycoprotein and is highly similar in sequence to the transmembrane 4 superfamily member 2 protein. It functions as a negative regulator of retinoic acid-inducible gene I-like receptor-mediated immune signaling via its interaction with the mitochondrial antiviral signaling-centered signalosome. This gene uses alternative polyadenylation sites, and multiple transcript variants result from alternative splicing. [provided by RefSeq, Jul 2013].",
    "alias": [
      "T245",
      "TM4SF6",
      "TSPAN-6"
    ],

However, today both of these fields are missing.

I have tried this both through the Python library's getgenes() function and through the web interface (i.e. https://mygene.info/v3/gene/ENSG00000188976), with the same results.
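A rough sketch of how the web-interface check can be scripted, using only the Python standard library rather than the mygene client. It builds the same /v3/gene/ URL quoted above, adds the `fields` query parameter to limit the response, and reports whether an ID returns 404 and which of the four fields are absent. The helper names (`build_url`, `fetch_json`, `check_gene`) are hypothetical, and the fetch function is passed in so the logic can be exercised without network access.

```python
import json
import urllib.error
import urllib.request

BASE_URL = "https://mygene.info/v3/gene/"  # endpoint quoted in this report
EXPECTED_FIELDS = ("name", "symbol", "summary", "alias")


def build_url(gene_id, fields=EXPECTED_FIELDS):
    """Build the annotation URL for one gene ID, limited to the given fields."""
    return f"{BASE_URL}{gene_id}?fields={','.join(fields)}"


def fetch_json(url):
    """Fetch a URL and return the parsed JSON body, or None on HTTP 404."""
    try:
        with urllib.request.urlopen(url, timeout=30) as resp:
            return json.load(resp)
    except urllib.error.HTTPError as err:
        if err.code == 404:
            return None
        raise  # any other HTTP error is unexpected here


def check_gene(gene_id, fetch=fetch_json, fields=EXPECTED_FIELDS):
    """Return (found, missing_fields) for one gene ID."""
    doc = fetch(build_url(gene_id, fields))
    if doc is None:
        # Not found: every requested field is unavailable.
        return False, list(fields)
    return True, [name for name in fields if name not in doc]


# Example usage (performs live requests, so it is left commented out):
# for gid in ("ENSG00000186092", "ENSG00000188976", "ENSG00000000003"):
#     print(gid, check_gene(gid))
```

Note that a field can be legitimately absent for a given gene (the earlier ENSG00000186092 response had no "alias"), so a missing field is only a regression if it was present before.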

newgene commented 2 years ago

@jaclynbeck-sage We are aware of this issue; a rollback release will be out shortly.

The cause is that today's data release includes an update from the latest Ensembl v108, and it looks like there is some issue with this release (it is not yet clear whether the problem is on the Ensembl side or on our side). We are rolling back this release right now and will update here once the rollback is in place.

newgene commented 2 years ago

@jaclynbeck-sage we have rolled back the data release. MyGene.info API should be back to normal now.

Closing this issue now. Let us know if you are still experiencing the problem.

jaclynbeck-sage commented 2 years ago

Thank you so much for the fast response!