Closed by Bebra777228 1 month ago
There is no such thing as "best" for all tracks and use cases.
The default settings passed in by the CLI are already designed to provide the "best" compromise between performance and resource usage for most inputs, but of course anyone can choose to play around with the settings and possibly get better results for a specific input track.
The type of model and how it was trained have a much more significant impact than any of these parameters, in my opinion.
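If you do want to play around with the settings for one specific track, a rough sketch of how to enumerate combinations to try might look like this. The parameter names and candidate values below are illustrative assumptions only, not documented minimum/maximum ranges for any architecture, and the separation call itself is left out on purpose:

```python
from itertools import product

# Hypothetical search space: the names and values are illustrative
# assumptions, not documented ranges for any architecture.
SEARCH_SPACE = {
    "segment_size": [128, 256, 512],
    "overlap": [0.25, 0.5],
}


def settings_grid(space):
    """Return every combination of the candidate values as a list of dicts."""
    names = list(space)
    return [
        dict(zip(names, values))
        for values in product(*(space[name] for name in names))
    ]


grid = settings_grid(SEARCH_SPACE)
for settings in grid:
    # Run the separator with these settings on your own track here,
    # then compare the outputs by ear.
    print(settings)
```

Each dict is one candidate configuration. Since quality is subjective, the comparison step is listening to the results rather than any automatic score.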
I am mainly interested in VR and MDXC.
Why? If you want the best separation, these days the RoFormer models (e.g. model_bs_roformer_ep_317_sdr_12.9755.ckpt) will give a much better result.
It's all subjective though!
In the demucs_params, the segment_size is set to 'Default'. I would like to understand what this means and what specific value 'Default' implies.
Additionally, it would be helpful to know at least the approximate minimum and maximum values for each parameter across different architectures.
Could you share which settings should be used to achieve the highest-quality separation of a song?
Perhaps you have already had experience solving similar tasks and know which parameters will provide the best result? I would be very grateful for your help!
I am mainly interested in VR and MDXC.