jrderuiter / pybiomart

A simple pythonic interface to biomart.
MIT License
55 stars 11 forks

Using entrez gene id as filter #15

Open DeepaMahm opened 5 years ago

DeepaMahm commented 5 years ago

I tried the following to filter by Entrez gene ID:

from pprint import pprint

from pybiomart import Server

server = Server(host='http://www.ensembl.org')

dataset = (server.marts['ENSEMBL_MART_ENSEMBL']
                 .datasets['hsapiens_gene_ensembl'])
filters = dataset.list_filters()
pprint(filters)
# This query is the one that raises the exception quoted below.
out = dataset.query(attributes=['ensembl_gene_id', 'external_gene_name'],
                    filters={'entrezgene': ['3098']})
pprint(out)

This raises the following error: pybiomart.base.BiomartException: Unknown filter entrezgene, check dataset filters for a list of valid filters.

The Entrez gene filter is not listed in dataset.list_filters() either.

Using gene_id doesn't work either. Are there any alternatives?
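To check which filters are exposed, the table returned by dataset.list_filters() can be searched for a keyword instead of reading the whole printout. A minimal sketch, assuming list_filters() returns a pandas DataFrame; find_rows is a hypothetical helper, not part of pybiomart:

```python
import pandas as pd


def find_rows(table: pd.DataFrame, keyword: str) -> pd.DataFrame:
    """Return the rows in which any column contains `keyword` (case-insensitive).

    Hypothetical helper: it makes no assumption about the column names of
    the table, it simply searches every column as text.
    """
    text = table.astype(str)
    mask = text.apply(
        lambda col: col.str.contains(keyword, case=False, regex=False)
    ).any(axis=1)
    return table[mask]


# Intended usage with the dataset object from the snippet above
# (needs network access, so it is left commented out here):
# print(find_rows(dataset.list_filters(), 'entrez'))
```

An empty result means no exposed filter mentions the keyword, which matches the error above.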

DeepaMahm commented 5 years ago

In R, this works:

library("biomaRt")
listMarts()
ensembl <- useMart("ensembl", dataset="hsapiens_gene_ensembl")
filters = listFilters(ensembl)
entrezgene = c("3098", "728642")
genes <- getBM(filters="entrezgene_id",
               attributes=c("ensembl_gene_id", "entrezgene_id"),
               values=entrezgene, mart=ensembl)
print(genes)

I would like to do the same in Python.

vitkl commented 4 years ago

Same problem here! It looks like only Ensembl IDs (link_ensembl_gene_id) can be used as a filter. You can work around that by defining no filters, requesting just the attributes you need, and then filtering the result yourself.
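A rough sketch of that workaround, under a few assumptions: the attribute is named entrezgene_id (taken from the R call above; it may differ between Ensembl releases), query() accepts use_attr_names=True so that columns carry attribute names, and the Entrez column may come back as floats with NaN for unmapped genes. filter_by_entrez and query_by_entrez are hypothetical helper names, not pybiomart functions.

```python
import pandas as pd


def filter_by_entrez(table, entrez_ids, column='entrezgene_id'):
    """Keep the rows whose Entrez ID is in `entrez_ids` (strings or ints).

    The column is coerced to float because a column containing missing
    values is typically parsed as float with NaN.
    """
    wanted = {float(i) for i in entrez_ids}
    values = pd.to_numeric(table[column], errors='coerce').astype('float64')
    return table[values.isin(wanted)]


def query_by_entrez(entrez_ids):
    """Download the unfiltered attribute table, then filter it locally."""
    # Imported lazily so the helper above can be used without pybiomart.
    from pybiomart import Server

    server = Server(host='http://www.ensembl.org')
    dataset = (server.marts['ENSEMBL_MART_ENSEMBL']
                     .datasets['hsapiens_gene_ensembl'])
    # No filters: every gene is fetched, so this can take a while.
    table = dataset.query(
        attributes=['ensembl_gene_id', 'external_gene_name', 'entrezgene_id'],
        use_attr_names=True,  # assumption: columns named after attributes
    )
    return filter_by_entrez(table, entrez_ids)


# Intended usage (needs network access):
# print(query_by_entrez(['3098', '728642']))
```

The trade-off is that the whole table is downloaded on every call, so it may be worth caching the unfiltered result if you look up IDs repeatedly.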