openvax / pyensembl

Python interface to access reference genome features (such as genes, transcripts, and exons) from Ensembl
Apache License 2.0

get_coordinates_from_exon_id() <enhancement> #160

Closed alec-djinn closed 8 years ago

alec-djinn commented 8 years ago

@iskandr Is there a way to get the coordinates of a given exon_id? If not, it would be very useful to have as an enhancement.

tavinathanson commented 8 years ago

@alec-djinn does this meet your needs?

exon = ensembl_release.exon_by_id(exon_id)
coordinates = (exon.contig, exon.start, exon.end)

alec-djinn commented 8 years ago

Yes! It's exactly what I was looking for. However, the returned exon.start and exon.end values appear to be sorted (start is always smaller than end) and do not take into account the orientation of the gene. For example:

from pyensembl import EnsemblRelease
data = EnsemblRelease(75)
exon = data.exon_by_id('ENSE00002051192')
coordinates = (exon.contig, exon.start, exon.end)

>>> coordinates
('17', 7590695, 7590799)

The Ensembl website, however, lists the start and end in the opposite order: http://grch37.ensembl.org/Homo_sapiens/Transcript/Exons?db=core;g=ENSG00000141510;r=17:7565097-7590856;t=ENST00000420246

No.  Exon/Intron      Start      End
1    ENSE00002051192  7,590,799  7,590,695

tavinathanson commented 8 years ago

@alec-djinn how about exon.strand?
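
A minimal sketch of how the strand could be used to recover the orientation shown on the Ensembl website. It assumes exon.strand is the string '+' or '-'. The names oriented_coordinates and ExonLike are hypothetical and not part of pyensembl; ExonLike is only a stand-in so the snippet runs without downloading Ensembl data. The '-' strand for this exon is inferred from the reversed start and end in the Ensembl table above.

```python
from collections import namedtuple

# Hypothetical stand-in holding the same four fields read from a pyensembl exon.
ExonLike = namedtuple("ExonLike", ["contig", "start", "end", "strand"])


def oriented_coordinates(exon):
    """Return (contig, first, last) in the direction of transcription.

    Assumes exon.start <= exon.end and exon.strand is '+' or '-'.
    """
    if exon.strand == "-":
        # Reverse strand: transcription runs from the higher coordinate down.
        return (exon.contig, exon.end, exon.start)
    return (exon.contig, exon.start, exon.end)


# Values taken from the example earlier in this thread.
exon = ExonLike(contig="17", start=7590695, end=7590799, strand="-")
print(oriented_coordinates(exon))  # ('17', 7590799, 7590695)
```

With a real exon, the object returned by exon_by_id(exon_id) could be passed in place of ExonLike, since the function only reads contig, start, end and strand.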

alec-djinn commented 8 years ago

Sure, I can use that as well. Thanks!

tavinathanson commented 8 years ago

Cool, closing this issue since it seems resolved.