Open nathanodle opened 1 year ago
Sorry, I have to agree with @nathanodle. When I tried this on a few basic examples, it did not work well at all. I will try it a couple more times. @liuxubo717, please give us some suggestions if you think we are doing something wrong.
"Separate anything" is indeed an overstatement according to my test results. The only acceptable use case I found for this model is separating speech (not singing or vocals) that is already distinct from the noise or background music.
Tested so far with various audio sources: tried separating noise from speech, music from vocals, etc. Results are mediocre at best; you could probably achieve similar things with UVR5.
The results are extremely bad. Please avoid using this and save yourself the time.
A reminder from a disappointed user.
Tried locally and didn't get much, then tried the HF Space and the model didn't really separate anything. Was it overfit to the demo data? There was barely any difference between input and output on a random song from my library with organ, drums, and vocals.