databio / AIList

Augmented interval list (AIList): a novel data structure for efficient genomic interval search
http://ailist.databio.org
GNU General Public License v2.0

I am interested in maintaining a Python wrapper #7

Closed: endrebak closed this issue 5 years ago

endrebak commented 5 years ago

Hi, I am interested in maintaining a Python wrapper for your AIList C code. My main goal is to use AIList as the fundamental overlap data structure in pyranges, but I also want to give back to the Python data science community.

The C code needs to be slightly modified for this to work:

1) AIList currently knows about chromosomes. This is bad for general usage; ideally it should behave like a plain interval tree so it can be used outside of genomics, for other species such as Drosophila, or with other annotation schemes such as Ensembl (1 and X instead of chr1 and chrX).

2) The AIList code currently seems to complect querying the AIList with building it, so that it needs to be rebuilt for each query (see the AIListIntersect function).
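
To make the two points concrete, here is a minimal sketch of the kind of interface I have in mind, written in Python for brevity. It is not the AIList algorithm or its C code: `IntervalIndex` is a hypothetical name, the structure is just a start-sorted list with a running maximum of end coordinates, and intervals are assumed to be half-open `[start, end)`. The index knows nothing about chromosomes (the caller keeps one index per label), and it is built once and then queried any number of times.

```python
from bisect import bisect_left


class IntervalIndex:
    """Hypothetical chromosome-agnostic index over half-open [start, end) intervals."""

    def __init__(self, intervals):
        # Build step: sort once by start and precompute a running maximum of ends.
        self.intervals = sorted(intervals)
        self.starts = [s for s, _ in self.intervals]
        self.max_end = []
        running = float("-inf")
        for _, e in self.intervals:
            running = max(running, e)
            self.max_end.append(running)

    def query(self, start, end):
        # Query step: read-only, so the same index serves every query.
        hits = []
        i = bisect_left(self.starts, end) - 1  # last interval with start < end
        # Walk left until no earlier interval can reach past the query start.
        while i >= 0 and self.max_end[i] > start:
            s, e = self.intervals[i]
            if e > start:
                hits.append((s, e))
            i -= 1
        hits.reverse()
        return hits


# Grouping by sequence name is left to the caller, so any naming scheme works.
indexes = {
    "chr1": IntervalIndex([(1, 5), (3, 8), (10, 12)]),
    "X": IntervalIndex([(2, 4)]),
}
print(indexes["chr1"].query(4, 11))
print(indexes["chr1"].query(8, 10))
```

Keeping the label-to-index mapping outside the structure is what lets the same code serve chr1/chrX, 1/X, or data that has no chromosomes at all.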

Is this something you might be interested in changing?

jf2016GH commented 5 years ago

Thank you for your suggestions. I just re-implemented AIList and it is now more general and significantly faster. A Python wrapper is also included in the src_py folder.

nsheff commented 5 years ago

Hi @endrebak we're definitely interested in helping you use ailist in pyranges -- please reopen this issue if these updates don't solve your concerns. thanks!

endrebak commented 5 years ago

Thanks, I will look at them when I get back from vacation :)