Closed by HIN0209 5 years ago
Thanks for asking! The program should run on that folder structure. In fact, a melody file and a vocal file do not have to be a pair. The training program picks one audio file from each directory, loads a segment of the desired length from each chosen file, and builds the mixture data on the fly.
I will add an example to the README soon.
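For readers who want to see the idea in code, here is a minimal sketch of the random pairing described above. It is not taken from the repository: the function names (`load_wav`, `random_segment`, `make_example`), the assumption of 16-bit mono PCM files, and the simple sample-wise sum used as the mixture are all illustrative guesses.

```python
import random
import sys
import wave
from array import array
from pathlib import Path


def load_wav(path):
    """Read a 16-bit mono PCM .wav file into a list of ints (assumed format)."""
    with wave.open(str(path), "rb") as wf:
        if wf.getsampwidth() != 2 or wf.getnchannels() != 1:
            raise ValueError("this sketch only handles 16-bit mono files")
        samples = array("h")
        samples.frombytes(wf.readframes(wf.getnframes()))
    if sys.byteorder == "big":
        samples.byteswap()  # WAV data is little-endian
    return list(samples)


def random_segment(samples, length, rng):
    """Return `length` consecutive samples starting at a random offset."""
    if len(samples) < length:
        raise ValueError("file is shorter than the requested segment")
    start = rng.randrange(len(samples) - length + 1)
    return samples[start:start + length]


def make_example(root, length, rng=random):
    """Pick one melody and one vocal file at random and mix them.

    The two files are chosen independently, so vocal1 may be mixed
    with melody5; file names do not need to match.
    """
    root = Path(root)
    melody_path = rng.choice(sorted((root / "melody").glob("*.wav")))
    vocal_path = rng.choice(sorted((root / "vocal").glob("*.wav")))
    melody = random_segment(load_wav(melody_path), length, rng)
    vocal = random_segment(load_wav(vocal_path), length, rng)
    # Hypothetical mixing rule: plain sample-wise sum of the two segments.
    mixture = [m + v for m, v in zip(melody, vocal)]
    return mixture, vocal, melody_path.name, vocal_path.name
```

Calling `make_example("root", 16000)` repeatedly would yield a new random melody/vocal combination each time, which is why the files in the two folders do not need to correspond one to one.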
@arity-r Thanks for commenting. Can you clarify further? What you say means that the wav files in the respective folders are mixed in a random way (e.g., vocal1 & melody5, instead of vocal1-melody1, vocal2-melody2), correct? My dataset (environmental noise, etc) would be fine with the random combination, but I am wondering if mixing voice and melody from different songs may make sense.
@HIN0209
What you say means that the wav files in the respective folders are mixed in a random way (e.g., vocal1 & melody5, instead of vocal1-melody1, vocal2-melody2), correct?
Correct!
I am wondering if mixing voice and melody from different songs may make sense.
My aim was to extract speaking voice rather than singing voice, so I did not consider this question.
Thank you again. Let me proceed with my dataset and see how it goes.
Hello! It is easy to use and seems to work. Before going further with my own data, please guide me on the data placement. I placed the .wav files as shown below. Is this the correct way to have melody1 and vocal1, etc., processed as pairs? More detail in the README would be appreciated. Thanks!
root
|--melody
|   |-melody1.wav
|   |-melody2.wav
|--vocal
    |-vocal1.wav
    |-vocal2.wav