tsurumeso / vocal-remover

Vocal Remover using Deep Neural Networks
MIT License

Great job! Thanks a lot! But I need a suggestion... #171

Open sensboston opened 5 months ago

sensboston commented 5 months ago

First of all, I wanna say a BIG THANKS to the devs: your app works perfectly and I'm really excited! (One exception: torch is missing from requirements.txt, so I had to run pip install torch separately, but it's not an issue at all.)

From my googling, I've found a lot of commercial services using your solution (and without crediting you!), but they aren't as good as yours :)

BTW, your app is an "open gate" to creating an ultimate karaoke application; the only thing I can't find is how to convert song_Vocals.wav to text with timestamps (like subtitles).

Could anyone suggest an AI-powered solution for this task? Thanks a lot!
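To make the goal concrete, here is a minimal sketch of the subtitle half of the task only. It assumes some speech-to-text model has already produced timestamped segments as (start_seconds, end_seconds, text) tuples; that input shape and the helper names `srt_timestamp` and `segments_to_srt` are hypothetical, not part of vocal-remover. The recognition step itself is not shown.

```python
# Sketch: turn timestamped lyric segments into SRT subtitle text.
# Assumption: `segments` comes from a separate speech-to-text step
# (not shown) as (start_seconds, end_seconds, text) tuples.

def srt_timestamp(seconds: float) -> str:
    """Format a time in seconds as an SRT timestamp, HH:MM:SS,mmm."""
    total_ms = int(round(seconds * 1000))
    hours, rem = divmod(total_ms, 3_600_000)
    minutes, rem = divmod(rem, 60_000)
    secs, ms = divmod(rem, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d},{ms:03d}"


def segments_to_srt(segments) -> str:
    """Build SRT text: a counter, a time range, then the lyric line."""
    blocks = []
    for index, (start, end, text) in enumerate(segments, start=1):
        time_range = f"{srt_timestamp(start)} --> {srt_timestamp(end)}"
        blocks.append(f"{index}\n{time_range}\n{text.strip()}\n")
    # A blank line separates consecutive subtitle blocks.
    return "\n".join(blocks)


if __name__ == "__main__":
    # Made-up example segments, for illustration only.
    demo = [(0.0, 2.5, "first lyric line"), (2.5, 4.0, "second lyric line")]
    print(segments_to_srt(demo))
```

The resulting text could be saved next to song_Vocals.wav as a .srt file and loaded by a player that supports subtitles.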

xuancong84 commented 5 months ago

Yes @sensboston, I agree with you.

Hi @tsurumeso, your model works great. I have tested it: your v6.0.0b2 model is significantly better than v4, and I have already integrated it into my PiKaraoke (https://github.com/xuancong84/pikaraoke) system.

However, I find that your model performs much worse on synthetic singing (e.g., Hatsune Miku (Project Diva), Kagamine Rin, Gumi, etc.) than on human singing. I am not sure whether your training data contains synthetic songs, but I suggest adding more variety to it. Thanks a lot!