cuttlefishh closed 1 year ago
I'm going to close this as out of scope for redbiom. Redbiom is specifically tasked with storing, searching, and fetching sample data and metadata. I'm hesitant to expand the project's scope to include subsampling regimes; I'd rather encourage a downstream tool to implement that.
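For illustration only, here is a rough sketch of what such a downstream subsampling step might look like. It assumes the sample metadata has already been fetched by some other means and reduced to a plain mapping of sample ID to category (for example an `empo_3` value). The function name `even_subset`, the `per_category` parameter, and the sample IDs are all hypothetical. None of this is part of redbiom, and it is not the procedure used to build the EMP subsets.

```python
import random
from collections import defaultdict


def even_subset(sample_categories, per_category, seed=0):
    """Pick up to `per_category` sample IDs at random from each category.

    sample_categories: dict mapping sample ID -> category label
    per_category: maximum number of samples to keep per category
    seed: seed for the random generator, so the draw is repeatable
    """
    rng = random.Random(seed)

    # Group sample IDs by their category label.
    by_category = defaultdict(list)
    for sample_id, category in sample_categories.items():
        by_category[category].append(sample_id)

    chosen = []
    # Sort categories and IDs so the result depends only on the seed.
    for category in sorted(by_category):
        ids = sorted(by_category[category])
        # Categories with fewer samples than the cap contribute all they have.
        k = min(per_category, len(ids))
        chosen.extend(rng.sample(ids, k))
    return chosen


# Hypothetical metadata: sample ID -> empo_3 category.
sample_categories = {
    "s1": "Soil (non-saline)",
    "s2": "Soil (non-saline)",
    "s3": "Soil (non-saline)",
    "s4": "Soil (non-saline)",
    "s5": "Water (saline)",
    "s6": "Animal distal gut",
    "s7": "Animal distal gut",
    "s8": "Animal distal gut",
}

subset = even_subset(sample_categories, per_category=2)
```

Capping each category at the same size is the simplest way to reduce bias toward heavily sampled categories. Categories smaller than the cap stay underrepresented, so a real tool would need to decide whether to lower the cap or accept the imbalance.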
In the EMP paper, we created a subset of 2000 samples with even representation across empo_3 categories (subset_2k) to make the dataset less biased toward certain empo categories and the trading cards more meaningful. It would be nice if Redbiom could do this too. One could imagine: