Hello, tsurumeso (sorry for my bad English)
While debugging and reading the code, I came across some code I cannot understand in dataset.py -> VocalRemoverTrainingSet.do_aug().
This method seems to do some data augmentation, but the code confuses me a bit.
It looks like the X variable is the mixture and the y variable is the instrument, but on line 42 I found this code:
y = spec_utils.aggressively_remove_vocal(X, y, self.reduction_weight)
aggressively_remove_vocal() looks like it returns an instrumental with a low-volume vocal left in it. But why is the result assigned to the y (instrument) variable? Shouldn't it be the X (mixture) variable? As written, won't the y variable be polluted by the vocal and affect the training results?
Can you tell me what aggressively_remove_vocal() does and why the return value is assigned to the y variable?
Thank you very much
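To make the question concrete, here is a minimal sketch of how I read the data flow. The function reduce_vocal_sketch is a hypothetical stand-in whose body is only my assumption about the behaviour (estimate the vocal as the magnitude difference between mixture and instrument, then subtract a weighted share of it from the instrument where the vocal dominates). It is not the actual code in spec_utils, and the toy arrays are made up.

```python
import numpy as np


def reduce_vocal_sketch(X, y, weight):
    """Hypothetical stand-in, NOT the real spec_utils.aggressively_remove_vocal."""
    X_mag = np.abs(X)
    y_mag = np.abs(y)
    # Assumed vocal estimate: mixture magnitude minus instrument magnitude.
    v_mag = X_mag - y_mag
    # Keep the estimate only in bins where it exceeds the instrument magnitude.
    v_mag = v_mag * (v_mag > y_mag)
    # Subtract a weighted share of the vocal estimate from the instrument.
    y_mag = np.clip(y_mag - v_mag * weight, 0, None)
    # Reuse the phase of the original instrument spectrogram.
    return y_mag * np.exp(1j * np.angle(y))


# Toy complex spectrograms: 2 frequency bins x 2 frames (made-up values).
y = np.array([[1.0 + 0j, 1.0 + 0j], [2.0 + 0j, 0.5 + 0j]])  # instrument
v = np.array([[0.1 + 0j, 3.0 + 0j], [0.2 + 0j, 2.0 + 0j]])  # vocal
X = y + v                                                   # mixture

# Same call shape as in do_aug(): the result replaces y, X is left untouched.
y_aug = reduce_vocal_sketch(X, y, weight=0.1)
print(np.abs(y_aug))
```

If this reading is roughly right, the call changes the training target y only in the bins where the vocal is louder than the instrument, and the input X is never modified. That is the part I would like to have confirmed or corrected.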