HazyResearch / hyena-dna

Official implementation for HyenaDNA, a long-range genomic foundation model built with Hyena
https://arxiv.org/abs/2306.15794
Apache License 2.0

BUG: Using Path.iterdir() for classification labels #48

Open alancleary opened 7 months ago

alancleary commented 7 months ago

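Path.iterdir() yields paths in arbitrary order, so its output needs to be sorted before it is used to assign labels for a classifier. Otherwise the same class directory can receive a different label in different splits or on different machines.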

I got burned by this when repurposing the GenomicBenchmarkDataset class for my own data sets: we had to swap the positive and negative samples in the test data to get the correct classification labels! This PR fixes that.
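
For illustration, here is a minimal sketch of the idea, not the actual diff in this PR. It assumes a layout of one sub-directory per class inside each split directory; the helper name `label_map` and the `positive`/`negative` directory names are hypothetical and chosen only for the example.

```python
import tempfile
from pathlib import Path


def label_map(split_dir: Path) -> dict:
    """Map each class sub-directory name to an integer label.

    Sorting the result of iterdir() makes the mapping deterministic,
    because iterdir() itself yields entries in arbitrary order.
    """
    class_dirs = sorted(p for p in split_dir.iterdir() if p.is_dir())
    return {p.name: label for label, p in enumerate(class_dirs)}


# Build a throwaway directory tree (assumed layout: <split>/<class>/).
with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    for split in ("train", "test"):
        for cls in ("positive", "negative"):
            (root / split / cls).mkdir(parents=True)

    train_labels = label_map(root / "train")
    test_labels = label_map(root / "test")

# Both splits agree on the mapping regardless of filesystem order.
print(train_labels, test_labels)
```

With the sort in place, labels follow the lexicographic order of the class directory names, so the train and test splits always agree on which class is which.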