SociallyIneptWeeb / AICoverGen

A WebUI to create song covers with any RVC v2 trained AI voice from YouTube videos or audio files.
MIT License
1k stars, 233 forks

Inference is not using GPU #108

Open smartinezbragado opened 6 months ago

smartinezbragado commented 6 months ago

Hello,

I deployed the inference pipeline on a GPU provider. However, song generation takes too long (5 minutes for a 4-minute song), which is much longer than what I am reading in other threads. I found that the GPU is barely used during generation, which is probably the cause.
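One quick way to narrow this down is to check whether the installed libraries can see the GPU at all. The sketch below assumes the pipeline relies on PyTorch and ONNX Runtime, which this thread does not confirm; `gpu_report` is a hypothetical helper name, not part of AICoverGen.

```python
def gpu_report():
    """Return a dict describing which GPU backends the installed libraries can see.

    Assumption: the pipeline uses PyTorch and/or ONNX Runtime. A value of None
    means the library is not installed in this environment.
    """
    report = {}

    # PyTorch: is_available() is False for CPU-only builds or missing drivers.
    try:
        import torch
        report["torch_cuda"] = torch.cuda.is_available()
    except ImportError:
        report["torch_cuda"] = None

    # ONNX Runtime: lists the execution providers this build supports.
    # "CUDAExecutionProvider" should appear in the list if a GPU build is installed.
    try:
        import onnxruntime as ort
        report["onnx_providers"] = ort.get_available_providers()
    except ImportError:
        report["onnx_providers"] = None

    return report


if __name__ == "__main__":
    print(gpu_report())
```

If `torch_cuda` is `False`, or `onnx_providers` lacks `CUDAExecutionProvider`, that part of the pipeline may be falling back to the CPU, which would be consistent with the low GPU usage.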

Do you know what I might be missing?

Thanks

JackismyShephard commented 5 months ago

What I have noticed is that most of the time goes to preprocessing the song, i.e. vocal-instrumental separation, separation of main vocals from background vocals, and vocal denoising. Under the hood this is done using MDX-Net models. I tried performing the same conversions manually in the UVR app and it is noticeably faster, so there might be room for improvement here. I am working on it myself (but progress is slow).
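To confirm where the time goes on a given machine, each stage can be wrapped in a small timer. This is a minimal sketch: the stage names and the `time.sleep` calls are placeholders standing in for the real separation, denoising and conversion steps, and `timed` is a hypothetical helper, not a function from the repository.

```python
import time
from contextlib import contextmanager

# Accumulated wall-clock seconds per stage name.
timings = {}


@contextmanager
def timed(stage):
    """Add the wall-clock time spent inside the `with` block to timings[stage]."""
    start = time.perf_counter()
    try:
        yield
    finally:
        elapsed = time.perf_counter() - start
        timings[stage] = timings.get(stage, 0.0) + elapsed


if __name__ == "__main__":
    # Placeholder workloads; replace each sleep with the real pipeline call.
    with timed("vocal_separation"):
        time.sleep(0.02)
    with timed("vocal_denoising"):
        time.sleep(0.01)
    with timed("voice_conversion"):
        time.sleep(0.01)

    # Print stages from slowest to fastest.
    for stage, seconds in sorted(timings.items(), key=lambda kv: -kv[1]):
        print(f"{stage}: {seconds:.3f}s")
```

Comparing the per-stage totals against the same steps run in UVR would show whether the preprocessing really is the bottleneck.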