Open ShengleiH opened 7 years ago
@ShengleiH What I mentioned in the README is a summary of the paper written by Po-Sen. In my own work, I didn't apply the circular shifting technique; instead, I crop the waveforms randomly at each training step.
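Roughly, the idea is something like the following simplified sketch (the function name and the numpy-based framing are illustrative, not the repo's actual code):

```python
import numpy as np

def random_crop(wav, sr, seconds):
    # Hypothetical helper: pick a random window of `seconds`
    # from a 1-D waveform at each training step.
    crop_len = int(sr * seconds)
    if len(wav) < crop_len:
        raise ValueError("waveform is shorter than the crop length")
    start = np.random.randint(0, len(wav) - crop_len + 1)
    return wav[start:start + crop_len]
```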
I found that the default TrainConfig.SECONDS is set to 30 s. If I use the 'mir1k' dataset, whose clips range from 4 to 13 seconds, the random cropping technique doesn't seem to work, since the crop length is longer than the audio length. So I am wondering: was the default TrainConfig.SECONDS chosen to fit the 'iKala' dataset? Should I shorten TrainConfig.SECONDS to fit the 'mir1k' dataset? The default configuration on mir1k sounds good, though. Thank you~
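One workaround I'm considering (just my own guess, not something the repo does as far as I know) is to zero-pad clips that are shorter than the crop length so the random crop still yields a fixed-size window:

```python
import numpy as np

def crop_or_pad(wav, sr, seconds):
    # Hypothetical workaround: zero-pad short clips, then crop randomly.
    crop_len = int(sr * seconds)
    if len(wav) < crop_len:
        wav = np.pad(wav, (0, crop_len - len(wav)), mode='constant')
    start = np.random.randint(0, len(wav) - crop_len + 1)
    return wav[start:start + crop_len]
```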
Hello, thank you for sharing the code. I am confused about the data augmentation.
You said "circularly shift the singing voice and mix them with the background music."
I want to ask how long the 'circular shift step' is. I have checked Po-Sen's Matlab code, where he says the best 'circular shift step' is 10,000. And in his code, I found that he actually shifts the music, not the vocal.
So I am quite confused about the data augmentation.
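To make my question concrete, here is a rough Python sketch of what I think Po-Sen's Matlab code is doing (the function name and the mixing step are my guesses; the 10,000-sample step is the value from his code):

```python
import numpy as np

def augment_by_circular_shift(vocal, music, shift_step=10000):
    # My understanding (may be wrong): roll the *music* by multiples of
    # `shift_step` samples and mix each shifted version with the
    # unshifted vocal to create additional training mixtures.
    # Assumes `vocal` and `music` are 1-D arrays of equal length.
    mixtures = []
    for offset in range(0, len(music), shift_step):
        shifted_music = np.roll(music, offset)
        mixtures.append(vocal + shifted_music)
    return mixtures
```

Is this roughly what the README's "circularly shift the singing voice and mix them with the background music" refers to, or is the vocal the signal that should be shifted?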