Open HeChengHui opened 8 months ago
I think it's possible to make the models work in real time; they are fast enough. But I haven't tried writing code for this.
@HeChengHui This paper was published a few days ago and demonstrates music source separation with 23 ms latency (but don't expect top-quality results with processing that fast): https://arxiv.org/abs/2402.17701
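To put a 23 ms figure in context, here is a rough sketch of the arithmetic for chunk-based processing. It assumes the model consumes fixed-size chunks, so the buffering latency is at least one chunk length. The 44.1 kHz sample rate and the chunk sizes are illustrative assumptions, not values taken from the paper or from this repository, and both helper names are hypothetical.

```python
def chunk_latency_ms(chunk_samples: int, sample_rate: int) -> float:
    """Buffering latency in milliseconds for one chunk of audio.

    This is only the time needed to collect the chunk; model inference
    time and any look-ahead come on top of it.
    """
    return 1000.0 * chunk_samples / sample_rate


def real_time_factor(processing_seconds: float, chunk_samples: int, sample_rate: int) -> float:
    """Ratio of processing time to chunk duration.

    A value below 1.0 means the chunk is processed faster than it is played,
    which is a necessary condition for running in real time.
    """
    chunk_seconds = chunk_samples / sample_rate
    return processing_seconds / chunk_seconds


if __name__ == "__main__":
    sample_rate = 44100  # assumed sample rate, for illustration only
    for chunk in (512, 1024, 2048):
        print(chunk, "samples:", round(chunk_latency_ms(chunk, sample_rate), 2), "ms")
```

Under these assumptions, a 1024-sample chunk at 44.1 kHz already accounts for about 23 ms of buffering, so the model itself has to run well below that per chunk to keep up.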
Thank you @ZFTurbo for providing the training code. I have tested the UVR MDX-Net models and found them to be very good. However, I am unable to get the models to work in real time. If I were to train the models provided here, would they be capable of real-time vocal separation, or would I have to adjust their architecture?