databio / gtars

Performance-critical tools to manipulate, analyze, and process genomic interval data. Primarily focused on building tools for geniml, our genomic machine learning Python package.

Tokenizers should sort universe files before creating `region_to_id` maps #4

Open nleroy917 opened 7 months ago

nleroy917 commented 7 months ago

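Otherwise, the IDs in `region_to_id` depend on the order of lines in the universe file, so the same set of regions can tokenize differently from one file to the next. That hinders reproducibility.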
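To illustrate the idea, here is a minimal Python sketch of building the map from sorted regions. It is not the gtars implementation: the names `parse_bed` and `build_region_to_id` are hypothetical, the universe is assumed to be a whitespace-separated BED-like file with chrom, start, and end in the first three columns, and the plain lexicographic sort on chromosome name (and the de-duplication) are assumptions chosen for simplicity.

```python
from typing import Dict, List, Tuple

# A region is (chrom, start, end). This layout is an assumption for the sketch.
Region = Tuple[str, int, int]


def parse_bed(text: str) -> List[Region]:
    """Parse BED-like text, keeping only the first three columns."""
    regions: List[Region] = []
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        chrom, start, end = line.split()[:3]
        regions.append((chrom, int(start), int(end)))
    return regions


def build_region_to_id(regions: List[Region]) -> Dict[Region, int]:
    """Assign IDs in sorted order so they do not depend on file order.

    Tuples sort by chrom (lexicographically, so "chr10" comes before
    "chr2"), then start, then end. Duplicate regions are collapsed.
    """
    return {region: i for i, region in enumerate(sorted(set(regions)))}


# Two universe files with the same regions in a different line order.
universe_a = "chr2\t100\t200\nchr1\t50\t80\nchr1\t10\t20\n"
universe_b = "chr1\t10\t20\nchr2\t100\t200\nchr1\t50\t80\n"

map_a = build_region_to_id(parse_bed(universe_a))
map_b = build_region_to_id(parse_bed(universe_b))
print(map_a == map_b)
```

Any total ordering would work, as long as every tokenizer applies the same one. A natural chromosome ordering (chr1, chr2, ..., chr10) could be substituted for the lexicographic sort without changing the approach.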