chapmanb / bcbb

Incubator for useful bioinformatics code, primarily in Python and R
http://bcbio.wordpress.com

tabix access to gff? #92

Closed. danielecook closed this 9 years ago

danielecook commented 9 years ago

Is there a way to query a GFF file by region with the GFF parser that's installed with pip? I have written a small wrapper for tabix that makes use of the _gff_line_map() function and the subprocess module to return GFF records.
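For readers following along, here is a rough sketch of what such a wrapper could look like. It is not the code referred to above. The function names (`tabix_command`, `parse_gff_line`, `query_region`) are hypothetical, and a simplified hand-written parser stands in for `_gff_line_map()`, whose signature is not shown in this thread. It assumes the `tabix` executable is on the PATH and that the GFF file is bgzip-compressed and already tabix-indexed.

```python
import subprocess

# The nine tab-separated columns of a GFF3 record.
GFF_COLUMNS = ("seqid", "source", "type", "start", "end",
               "score", "strand", "phase", "attributes")


def tabix_command(gff_gz, seqid, start, end):
    """Build the tabix command line for a 1-based, inclusive region."""
    return ["tabix", gff_gz, f"{seqid}:{start}-{end}"]


def parse_gff_line(line):
    """Split one GFF3 line into a dict.

    Hypothetical stand-in for the parser's own line-mapping step;
    it does not handle URL-escaping or multi-valued attributes.
    """
    fields = line.rstrip("\n").split("\t")
    if len(fields) != len(GFF_COLUMNS):
        raise ValueError(f"expected 9 columns, got {len(fields)}")
    record = dict(zip(GFF_COLUMNS, fields))
    record["start"] = int(record["start"])
    record["end"] = int(record["end"])
    attributes = {}
    for pair in record["attributes"].split(";"):
        if "=" in pair:
            key, value = pair.split("=", 1)
            attributes[key.strip()] = value.strip()
    record["attributes"] = attributes
    return record


def query_region(gff_gz, seqid, start, end):
    """Yield parsed records for features that tabix reports in the region."""
    result = subprocess.run(
        tabix_command(gff_gz, seqid, start, end),
        capture_output=True, text=True, check=True,
    )
    for line in result.stdout.splitlines():
        if line and not line.startswith("#"):
            yield parse_gff_line(line)
```

Keeping the command construction and the line parsing in separate functions means both can be checked without tabix installed; only `query_region` needs the external binary and an indexed file.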

I did notice this: access_gff_index.py

However, it's not included in the pip-installed module unless I am missing something, and my GFF files are all indexed with tabix as well.

I can create a pull request if you think this would be beneficial.

chapmanb commented 9 years ago

Daniel, thanks for the thoughts, and sorry about the delay in getting back to you. There isn't a clean way to do this right now, and the access_gff_index.py approach is pretty old, so I agree the newer tabix approaches make more sense.

However, if you want to do this I'd suggest looking at GFFutils:

https://github.com/daler/gffutils

It puts the GFF into a SQLite database, so it should make this style of regional query much easier. Happy to accept pull requests if you think GFFutils will not do what you need. Hope this helps some.
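To illustrate why a SQLite-backed store makes regional queries straightforward, here is a minimal sketch using only Python's standard `sqlite3` module. It is not GFFutils and does not reproduce its schema or API. The table layout, column names, and function names (`build_feature_db`, `features_in_region`) are assumptions made up for this example.

```python
import sqlite3


def build_feature_db(gff_lines):
    """Load GFF lines into an in-memory SQLite table (illustrative schema)."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE features ("
        "seqid TEXT, type TEXT, feat_start INTEGER, feat_end INTEGER, line TEXT)"
    )
    # An index on the coordinates keeps region lookups from scanning every row.
    conn.execute(
        "CREATE INDEX idx_region ON features (seqid, feat_start, feat_end)"
    )
    rows = []
    for line in gff_lines:
        if not line.strip() or line.startswith("#"):
            continue  # skip blank lines, comments, and directives
        fields = line.rstrip("\n").split("\t")
        rows.append((fields[0], fields[2], int(fields[3]), int(fields[4]),
                     line.rstrip("\n")))
    conn.executemany("INSERT INTO features VALUES (?, ?, ?, ?, ?)", rows)
    conn.commit()
    return conn


def features_in_region(conn, seqid, start, end):
    """Return raw GFF lines overlapping [start, end] (1-based, inclusive)."""
    cursor = conn.execute(
        "SELECT line FROM features "
        "WHERE seqid = ? AND feat_start <= ? AND feat_end >= ? "
        "ORDER BY feat_start",
        (seqid, end, start),
    )
    return [row[0] for row in cursor]
```

The overlap test is the usual interval check: a feature overlaps the query when it starts at or before the query end and ends at or after the query start. Once the features are loaded, each regional query is a single indexed SELECT, with no external tools involved.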