Open · EssamWisam opened this issue 1 year ago
I was trying to implement ROSE (Random OverSampling Examples) in Julia, and after reading the paper I decided to look at imbalanced-learn's implementation. There is a `shrinkage` parameter that is not mentioned in the paper. I see that it is multiplied by the smoothing matrix used for the kernel (near line 214 of the implementation) and I understand its effect; however, why the specific name "shrinkage"? Especially when larger values cause the generated points to spread farther apart...
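For concreteness, here is a minimal sketch of the step being described, assuming the noise scale follows Silverman's rule of thumb; the function `smoothed_oversample` and its signature are illustrative, not imbalanced-learn's actual code:

```python
import numpy as np

def smoothed_oversample(X, n_new, shrinkage=1.0, rng=None):
    # Illustrative ROSE-style smoothed bootstrap (hypothetical helper,
    # not imbalanced-learn's exact implementation).
    rng = np.random.default_rng(rng)
    n_samples, n_features = X.shape
    # Silverman's rule-of-thumb constant for a multivariate Gaussian kernel
    smoothing_constant = (4 / ((n_features + 2) * n_samples)) ** (1 / (n_features + 4))
    # Plain bootstrap: pick base points with replacement
    indices = rng.integers(0, n_samples, size=n_new)
    # Kernel noise scaled by `shrinkage`; shrinkage=0 reproduces exact duplicates
    noise = rng.standard_normal((n_new, n_features))
    return X[indices] + shrinkage * smoothing_constant * X.std(axis=0) * noise
```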
I think this parameter behaves the same way as the `learning_rate` in GBDT, which is also known as shrinkage. There is also the concept of shrunk covariance, where, depending on the value of the shrinkage, you "shrink" the empirical covariance towards an identity matrix. Here, the shrinkage is used to "shrink" the smoothed bootstrap towards simple random oversampling: as it approaches zero, the kernel noise vanishes and you get exact duplicates. So it is true that it might be counter-intuitive that a larger shrinkage value has the opposite effect of what you would expect :). And the reason why this parameter is not in the paper is simply that this single parameter can switch from normal random over-sampling to ROSE without the need to create a dedicated class.
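From the user's side, the switch the previous comment describes looks roughly like this (a sketch assuming a recent imbalanced-learn, where `RandomOverSampler` accepts a `shrinkage` argument):

```python
from collections import Counter
from sklearn.datasets import make_classification
from imblearn.over_sampling import RandomOverSampler

X, y = make_classification(n_samples=500, weights=[0.9, 0.1], random_state=0)

# shrinkage=0: the noise term vanishes, so new points are exact duplicates
X_dup, y_dup = RandomOverSampler(shrinkage=0, random_state=0).fit_resample(X, y)

# shrinkage=1: smoothed bootstrap (ROSE), new points are jittered copies
X_rose, y_rose = RandomOverSampler(shrinkage=1, random_state=0).fit_resample(X, y)

print(Counter(y_dup))   # classes balanced via duplication
print(Counter(y_rose))  # classes balanced via perturbed samples
```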