databio / gtars

Performance-critical tools to manipulate, analyze, and process genomic interval data. Primarily focused on building tools for geniml - our genomic machine learning python package.
2 stars 1 forks source link

Fix pything bindings tokenization performance issues #22

Closed nleroy917 closed 1 month ago

nleroy917 commented 1 month ago

For a detailed description of the issue, please see #21

In short, expensive cloning of PyUniverse structs was leading to big performance degradation in terms of speed and memory. This makes small changes to fix that.