NAN's lead entrez ids to be considered as floats

jrderuiter / pybiomart

A simple pythonic interface to biomart.

MIT License

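@samwindels, just a comment on how the parsing works: ENSG00000285114 maps to 56169. The trailing `.0` only shows that the column was parsed as a float, so no digit is being appended to the ID. If it mapped to 561690, you would get a float that looks like 561690.0.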

Here's an example:

In [5]: dataset.query(attributes=['ensembl_gene_id', 'entrezgene'], filters={'link_ensembl_gene_id':
   ...:  ['ENSG00000099725','ENSG00000185115', 'ENSG00000285363']})                                 
Out[5]: 
    Gene stable ID  NCBI gene ID
0  ENSG00000099725        5616.0
1  ENSG00000185115       56160.0
2  ENSG00000285363           NaN

As a workaround, you should be able to get to the correct identifiers (as strings) with:

result["entrezgene"].apply(lambda x: "{:.0f}".format(x))
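
To make the effect of that one-liner concrete, here is a minimal sketch that needs no BioMart call. The `result` frame is a hand-built stand-in for the output above (an assumption, not real query output), and it uses the displayed column name `NCBI gene ID`. The nullable `Int64` conversion is an alternative I am suggesting, not something pybiomart does for you.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for the query result shown above; no BioMart call is made.
result = pd.DataFrame({
    "Gene stable ID": ["ENSG00000099725", "ENSG00000185115", "ENSG00000285363"],
    "NCBI gene ID": [5616.0, 56160.0, np.nan],
})

# The formatting workaround: every value becomes a string, including the missing one.
as_text = result["NCBI gene ID"].apply(lambda x: "{:.0f}".format(x))

# Alternative: pandas' nullable integer dtype keeps whole numbers and marks gaps as <NA>.
as_int = result["NCBI gene ID"].astype("Int64")

print(as_text.tolist())
print(as_int)
```

Note that the formatting approach turns the missing value into the literal string `nan`, so you may want to filter or replace those rows afterwards. The `Int64` route keeps the gap as a proper missing value instead.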

NAN's lead entrez ids to be considered as floats #8