JuliaML / MLDataUtils.jl

Utility package for generating, loading, splitting, and processing Machine Learning datasets
http://mldatautilsjl.readthedocs.io/

Stratified creation of batches, sampling, k-fold cross-validation #24

pevnak closed this issue 7 years ago

pevnak commented 7 years ago

Hi all, I wonder whether the batch iterators implement stratified iteration. Based on my experiments they do not, and I think this missing feature is a bottleneck in practical use, especially for highly imbalanced data. Maybe this issue can serve as a suggestion for further improvement.
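To make the request concrete, here is a minimal sketch of what stratified batching could look like. It is written in plain Python purely for illustration and is not the MLDataUtils API; the function name `stratified_batches` and its parameters are hypothetical. The idea is to group observation indices by label, shuffle within each class, and deal them out round-robin so that every batch keeps roughly the overall class proportions.

```python
import random
from collections import defaultdict


def stratified_batches(labels, n_batches, seed=0):
    """Split indices 0..len(labels)-1 into n_batches lists whose class
    proportions approximate those of the full label vector.

    Hypothetical helper for illustration only; not part of any library.
    """
    rng = random.Random(seed)

    # Group observation indices by their class label.
    by_class = defaultdict(list)
    for idx, label in enumerate(labels):
        by_class[label].append(idx)

    batches = [[] for _ in range(n_batches)]
    offset = 0  # continue the round-robin across classes to balance batch sizes
    for indices in by_class.values():
        rng.shuffle(indices)  # randomize order within the class
        for j, idx in enumerate(indices):
            batches[(offset + j) % n_batches].append(idx)
        offset += len(indices)
    return batches


# Imbalanced toy data: 90 observations of class 0 and 10 of class 1.
labels = [0] * 90 + [1] * 10
batches = stratified_batches(labels, n_batches=5)
for b in batches:
    minority = sum(1 for i in b if labels[i] == 1)
    print(len(b), minority)
```

With plain random batching, some batches of this toy data could easily contain no minority observations at all. Under the scheme above, the per-class counts in any two batches differ by at most one. The same partition could also serve as the folds of a stratified k-fold cross-validation, with each batch taking a turn as the held-out set.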

Best wishes, Tomas

Evizero commented 7 years ago

Hi! Yes, you are correct. It is already on the roadmap. Now that MLLabelUtils is stable, we have the tools to implement it, and I'll get to it as soon as time allows. The dev branch is already pretty close to getting merged.

Thanks for your feedback.

pevnak commented 7 years ago

Thanks for the answer. I am looking forward to it. Best wishes, Tomas