5-fold cross-validation of continuous target variable that has a highly-skewed distribution compared to normal distribution

YyzHarry / imbalanced-regression

[ICML 2021, Long Talk] Delving into Deep Imbalanced Regression

MIT License

806 stars, 128 forks

Hi - The sklearn documentation for StratifiedKFold states: "The folds are made by preserving the percentage of samples for each class." To replicate this in a regression problem, a simple approach is to divide the target range into discrete bins and count the number of samples in each bin (the bin resolution depends on the specific problem). You can then treat the bins as classes and split the folds the same way.
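A minimal sketch of that idea, in case it helps. The helper name `stratified_regression_folds`, the `n_bins` parameter, the choice of quantile (equal-count) bins, and the synthetic log-normal target are all assumptions made for illustration; the reply above does not prescribe a binning rule, and equal-width bins would work too.

```python
import numpy as np
from sklearn.model_selection import StratifiedKFold


def stratified_regression_folds(y, n_splits=5, n_bins=10, seed=0):
    """Yield (train_idx, test_idx) pairs stratified on binned targets.

    Hypothetical helper: the name and defaults are illustrative only.
    """
    y = np.asarray(y, dtype=float)
    # Assumption: quantile edges, so every bin holds roughly the same
    # number of samples even when the target is heavily skewed.
    edges = np.quantile(y, np.linspace(0.0, 1.0, n_bins + 1))
    # Use only the inner edges so the labels fall in 0 .. n_bins - 1.
    bin_labels = np.digitize(y, edges[1:-1])
    skf = StratifiedKFold(n_splits=n_splits, shuffle=True, random_state=seed)
    # The features are not used for the split, so a placeholder array is enough.
    yield from skf.split(np.zeros(len(y)), bin_labels)


# Synthetic, highly skewed target used purely as example data.
rng = np.random.default_rng(0)
y = rng.lognormal(mean=0.0, sigma=1.0, size=1000)
folds = list(stratified_regression_folds(y))
```

Each fold then sees about the same share of every target bin, including the sparse tail. If some bins end up with fewer samples than `n_splits`, sklearn warns or raises an error, so a coarser resolution may be needed for very small datasets.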

Since the question is not quite related to the repo, I'm closing it for now. Feel free to comment if you have any related questions.

5-fold cross-validation of continuous target variable that has a highly-skewed distribution compared to normal distribution #12