Open sensboston opened 10 months ago
Yes @sensboston, I agree with you too.
Hi @tsurumeso, your model works great. I have tested it, and your v6.0.0b2 model is significantly better than v4. I have already integrated the v6.0.0b2 model into my PiKaraoke (https://github.com/xuancong84/pikaraoke) system.
However, I find that your model's performance on synthetic singing (e.g., Hatsune Miku (Project Diva), Kagamine Rin, Gumi, etc.) is much worse than on human singing. I am not sure whether your training data contains synthetic songs, but I suggest adding more variety to it. Thanks a lot!
First of all, I wanna say a BIG THANKS to the devs: your app works perfectly, and I'm really excited! (Except that torch should be added to requirements.txt so that a separate pip install torch isn't needed, but it's not an issue at all.)
From my googling, I've found a lot of commercial services using your solution (and without referencing it!), but they aren't as good as yours :)
BTW, your app is an "open gate" to creating an ultimate karaoke application; the only thing I can't find is how to convert song_Vocals.wav to text with timestamps (like subtitles).
Could anyone suggest an AI-powered solution for this task? Thanks a lot!