Some doubts about VocalRemoverTrainingSet.do_aug() of dataset.py

Hello tsurumeso (sorry for my bad English),

While debugging and reading the code, I found something I cannot understand in dataset.py -> VocalRemoverTrainingSet.do_aug().

This method seems to do some data augmentation, but one part of it confuses me.

It looks like the X variable is the mixture and the y variable is the instrumental, but on line 42 I found this code: y = spec_utils.aggressively_remove_vocal(X, y, self.reduction_weight)

aggressively_remove_vocal() appears to return an instrumental that still contains the vocal at a low volume. But why is the result assigned to the y (instrumental) variable? Shouldn't it be assigned to the X (mixture) variable? As written, won't y be polluted by the vocal and affect the training results?
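To make the question concrete, here is a minimal sketch of another way the call could be read. It is only my guess, not the actual code in spec_utils: remove_vocal_sketch is a hypothetical stand-in, and the rule it applies (estimate the vocal magnitude as |X| - |y|, then subtract a weighted part of it from |y| only where that estimate is larger than |y|) is an assumption. The toy arrays are made up for illustration.

```python
import numpy as np


def remove_vocal_sketch(X, y, weight):
    """Hypothetical stand-in, NOT the real spec_utils.aggressively_remove_vocal."""
    X_mag = np.abs(X)
    y_mag = np.abs(y)
    # Assumed vocal estimate: what the mixture has beyond the instrumental.
    v_mag = np.clip(X_mag - y_mag, 0, None)
    # Assumption: only act on bins where the vocal estimate dominates.
    v_mag = v_mag * (v_mag > y_mag)
    # Attenuate the instrumental target in those bins, never below zero.
    y_mag = np.clip(y_mag - v_mag * weight, 0, None)
    # Keep the phase of the original instrumental.
    return y_mag * np.exp(1j * np.angle(y))


# Toy spectrograms: 1 channel, 2 frequency bins, 1 frame.
y = np.array([[[1.0 + 0j], [1.0 + 0j]]])  # instrumental
X = np.array([[[1.5 + 0j], [4.0 + 0j]]])  # mixture: weak vocal in bin 0, strong in bin 1

y_aug = remove_vocal_sketch(X, y, weight=0.2)
print(np.abs(y_aug).ravel())
```

If the function behaves roughly like this, the first bin of y is left unchanged and the second is attenuated, so y would not gain any vocal; it would only lose energy where the mixture is much louder. I am not sure which reading is correct, which is why I am asking.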

Can you tell me what aggressively_remove_vocal() does and why its return value is assigned to the y variable?

Thank you very much
