Open MohammedMehdiTBER opened 2 years ago
- `mdx`: trained only on MusDB HQ, winning model on track A at the MDX challenge.
- `mdx_extra`: trained with extra training data (including MusDB test set), ranked 2nd on the track B of the MDX challenge.
- `mdx_q`, `mdx_extra_q`: quantized version of the previous models. Smaller download and storage but quality can be slightly worse. `mdx_extra_q` is the default model used.
- `SIG`: where `SIG` is a single model from the model zoo.
by https://github.com/facebookresearch/demucs/blob/main/README.md
Forgive me for asking in here, but I find nothing but dead-ends elsewhere: Is there any way to actually install and utilize the competing models from Band-split RNN, which also achieves a 9.0 overall SDR?
I just want to try it and compare it to Demucs, as it appears to be the only worthy competitor. I wonder what would happen if the spectral-only training dataset used for Band-split RNN were added to the training set of HT Demucs?
Looking forward to any possible user-accessible implementation of the Sparse Hybrid Transformer model. Also keenly interested in any new and improved models resulting from the 2023 SDX Challenge!
I came here to prevent myself from asking a duplicate question, and yet...
...the answer here isn't clear.
I still don't know which model is best for me to use.
I can certainly understand that one of the models says it takes 4X longer with better results. But I still don't know if that's the one with the best results.
I want the best results because I'm generating karaoke files for songs I may listen to thousands of times, and I want the vocals to be as clear as possible when separated so they can be properly transcribed.
In theory speed matters, but the job is already going to take over a month to run, so maybe speed doesn't matter that much after all if I'm willing to wait that long.
Does my use case influence which is the best model? Is there a best model at least **FOR ME**?
I'd recommend using htdemucs_ft with shifts no less than 4 if you have enough time. You can switch off shifts if you don't have enough time then.
What do you mean by "shifts"?
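For what it's worth, the Demucs README describes `--shifts` as making several predictions on randomly shifted copies of the input and averaging them (the "shift trick"), which makes separation roughly that many times slower. Below is a minimal sketch of the idea only, not the actual Demucs implementation: `apply_with_shifts`, `toy_model`, and the `max_shift` value are hypothetical names and numbers chosen for illustration, and the "audio" is just a list of floats.

```python
import random


def apply_with_shifts(model, mix, shifts=4, max_shift=8, seed=0):
    """Average `shifts` predictions, each made on a randomly shifted copy of `mix`.

    `model` is any callable that maps a list of samples to a list of the
    same length (a stand-in for a real separation model).
    """
    rng = random.Random(seed)
    length = len(mix)
    # Pad with silence on both sides so every shifted window is full length.
    padded = [0.0] * max_shift + list(mix) + [0.0] * max_shift
    total = [0.0] * length
    for _ in range(shifts):
        offset = rng.randint(0, max_shift)
        window = padded[offset:offset + length + max_shift]
        estimate = model(window)
        # Undo the shift: the original signal starts here inside the window.
        start = max_shift - offset
        for i, value in enumerate(estimate[start:start + length]):
            total[i] += value
    return [value / shifts for value in total]


def toy_model(samples):
    """Hypothetical model whose output depends on sample position."""
    return [v * (0.5 if i % 2 else 1.0) for i, v in enumerate(samples)]


if __name__ == "__main__":
    print(apply_with_shifts(toy_model, [1.0, 1.0, 1.0, 1.0], shifts=4))
```

On the command line this corresponds to something like `demucs -n htdemucs_ft --shifts 4 song.mp3` (the file name is made up). More shifts means more averaging and more compute time, which is why the advice above is to drop shifts if you are short on time.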
I ended up trying both mdx_extra and htdemucs. mdx_extra maybe sounded better, but it was hard to tell.
> I'd recommend using htdemucs_ft with shifts no less than 4 if you have enough time. You can switch off shifts if you don't have enough time then.
What is the difference between 'htdemucs_ft' and 'htdemucs'?
htdemucs_ft is fine-tuned from htdemucs
❓ Questions
I want to know which one is the best: mdx_extra_q or mdx_extra? And what does q mean?
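Going by the README text quoted at the top of this thread, the `_q` suffix marks the quantized version of a model: smaller download and storage, but quality can be slightly worse. So if size is not a concern, `mdx_extra` should be at least as good as `mdx_extra_q`. To illustrate the trade-off in general terms, here is a minimal sketch of plain uniform 8-bit quantization. This is an assumption for illustration only: the thread does not say which quantization scheme Demucs actually uses, and `quantize` and `dequantize` are hypothetical helper names.

```python
def quantize(weights, bits=8):
    """Map floats to integer codes in [0, 2**bits - 1] (uniform quantization)."""
    lo, hi = min(weights), max(weights)
    levels = (1 << bits) - 1
    # Fall back to 1.0 when all weights are equal, to avoid dividing by zero.
    scale = (hi - lo) / levels or 1.0
    codes = [round((w - lo) / scale) for w in weights]
    return codes, lo, scale


def dequantize(codes, lo, scale):
    """Recover approximate floats from the integer codes."""
    return [lo + c * scale for c in codes]


if __name__ == "__main__":
    weights = [-1.0, -0.25, 0.0, 0.5, 1.0]
    codes, lo, scale = quantize(weights)
    restored = dequantize(codes, lo, scale)
    worst = max(abs(a - b) for a, b in zip(weights, restored))
    print("codes:", codes)
    print("worst-case error:", worst)
```

Each weight is stored as a small integer instead of a full float, which is where the size saving comes from, and the rounding error on each weight is where the "slightly worse" quality can come from.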