biothings / mygene.info

MyGene.info: A BioThings API for gene annotations
http://mygene.info

include "gene_type" from Ensembl #28

Closed newgene closed 6 years ago

newgene commented 6 years ago

Ensembl provides "gene_type" annotations for its Ensembl genes, which can be retrieved via a BioMart query (the BioMart attribute is named "gene_biotype"). Here is an example query that returns the mapping from Ensembl Gene ID to "gene_type":

<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE Query>
<Query  virtualSchemaName = "default" formatter = "TSV" header = "0" uniqueRows = "0" count = "" datasetConfigVersion = "0.6" >

    <Dataset name = "hsapiens_gene_ensembl" interface = "default" >
        <Attribute name = "ensembl_gene_id" />
        <Attribute name = "gene_biotype" />
    </Dataset>
</Query>

We can integrate "gene_type" values into the current Ensembl data ingestion process.
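
A minimal sketch of how the query above could be fetched and parsed in Python. The martservice URL and the function names (build_query, parse_gene_types, fetch_gene_types) are assumptions for illustration, not the actual mygene.info ingestion code.

```python
import urllib.parse
import urllib.request

# Assumed public BioMart endpoint; the real ingestion code may use a different one.
BIOMART_URL = "http://www.ensembl.org/biomart/martservice"

# Same query as above, with the dataset name left as a placeholder.
QUERY_TEMPLATE = """<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE Query>
<Query virtualSchemaName="default" formatter="TSV" header="0" uniqueRows="0" count="" datasetConfigVersion="0.6">
    <Dataset name="{dataset}" interface="default">
        <Attribute name="ensembl_gene_id" />
        <Attribute name="gene_biotype" />
    </Dataset>
</Query>"""


def build_query(dataset="hsapiens_gene_ensembl"):
    """Return the BioMart XML query for the given dataset."""
    return QUERY_TEMPLATE.format(dataset=dataset)


def parse_gene_types(tsv_text):
    """Parse 'gene_id<TAB>gene_biotype' lines into a dict, skipping malformed rows."""
    mapping = {}
    for line in tsv_text.splitlines():
        parts = line.split("\t")
        if len(parts) != 2 or not parts[0] or not parts[1]:
            continue
        mapping[parts[0]] = parts[1]
    return mapping


def fetch_gene_types(dataset="hsapiens_gene_ensembl"):
    """Send the query to BioMart and return {ensembl_gene_id: gene_biotype}."""
    params = urllib.parse.urlencode({"query": build_query(dataset)})
    with urllib.request.urlopen(BIOMART_URL + "?" + params) as response:
        return parse_gene_types(response.read().decode("utf-8"))
```

Calling fetch_gene_types() needs network access. parse_gene_types() can be exercised on its own with any two-column TSV text.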

newgene commented 6 years ago

We can store this new field as "ensembl.type_of_gene". The existing root-level "type_of_gene" stays unchanged and keeps the NCBI "type_of_gene" values.
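
A rough sketch of that layout, assuming a gene document keeps its Ensembl gene ID under "ensembl.gene" and that "ensembl" may be either a single dict or a list of dicts. The helper name add_ensembl_gene_type is hypothetical.

```python
def add_ensembl_gene_type(doc, gene_types):
    """Set ensembl.type_of_gene from gene_types; the root-level type_of_gene is not touched."""
    ensembl = doc.get("ensembl")
    if ensembl is None:
        return doc
    # Assumption: "ensembl" is a dict for one Ensembl gene, or a list of dicts for several.
    entries = ensembl if isinstance(ensembl, list) else [ensembl]
    for entry in entries:
        gene_type = gene_types.get(entry.get("gene"))
        if gene_type is not None:
            entry["type_of_gene"] = gene_type
    return doc


# Usage: the NCBI value stays at the root, the Ensembl value goes under "ensembl".
doc = {
    "_id": "675",
    "type_of_gene": "protein-coding",
    "ensembl": {"gene": "ENSG00000139618"},
}
add_ensembl_gene_type(doc, {"ENSG00000139618": "protein_coding"})
```

After the call, doc["type_of_gene"] still holds the NCBI value and doc["ensembl"]["type_of_gene"] holds the Ensembl one.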

sirloon commented 6 years ago

Fixed as of commit e988bb3ffbab6be51fc72f1b22a6655bec5b8296. Data including type_of_gene should be available in the next release this weekend (it is included with the upcoming Ensembl release 91).

Mapping is the same as type_of_gene for Entrez:

    "type_of_gene" : {
        "include_in_all" : false,
        "index" : "not_analyzed",
        "type" : "string"
    },

sirloon commented 6 years ago

Fixed type_of_gene under "/ensembl" in commit 08ec1e903111e03fb55ae306031b566f3adf4001
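
Because the field is mapped as a "not_analyzed" string, its values are indexed as whole terms, so a query on it should match exact values only. A small sketch of building such a query URL follows. The v3 query endpoint, the chosen fields, and the helper name build_type_query_url are assumptions, not taken from this thread.

```python
import urllib.parse

# Assumed query endpoint of the public mygene.info API.
MYGENE_QUERY_URL = "http://mygene.info/v3/query"


def build_type_query_url(gene_type, size=10):
    """Build a query URL for genes whose ensembl.type_of_gene equals gene_type."""
    params = {
        "q": "ensembl.type_of_gene:" + gene_type,
        "fields": "ensembl.gene,ensembl.type_of_gene,type_of_gene",
        "size": size,
    }
    return MYGENE_QUERY_URL + "?" + urllib.parse.urlencode(params)


url = build_type_query_url("protein_coding")
```

Requesting both "ensembl.type_of_gene" and the root-level "type_of_gene" in the fields list makes it easy to compare the Ensembl and NCBI values side by side.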