nomadkaraoke / python-audio-separator

Easy-to-use stem (e.g. instrumental/vocals) separation from the CLI or as a Python package, using a variety of amazing pre-trained models (primarily from UVR)
MIT License

Minimum and maximum values of architectural parameters #138

Closed by Bebra777228 3 weeks ago

Bebra777228 commented 1 month ago

In demucs_params, segment_size is set to 'Default'. I would like to understand what this means and what specific value 'Default' corresponds to.

Additionally, it would be helpful to know at least the approximate minimum and maximum values for each parameter across different architectures.


I have already written this text as part of another issue, but since there has been no response to it for several days, I decided to duplicate it in a new issue and close the old one =)

beveradb commented 4 weeks ago

I'd like to take a moment to encourage you (and anyone else) to read the code and figure out the answer yourself when you have a question like this rather than just raising an issue 😄

Even better, if you want to contribute back to the community, you can raise a PR improving the documentation or write up a post explaining your findings once you've dug into the code to figure it out!

The mental load of maintaining this repository for free is already more than I would like, which is why you get very slow responses from me and why I would prefer to help folks help themselves rather than expecting me to answer.

If you try your best to figure it out but get stuck / struggle, I'm much happier to help teach you how things work / how to help yourself, but it would be helpful if you could do the initial attempt and write up your findings / explain where you get stuck and then ask me for help pointing you in the right direction to unstick you!

Anyway, for this specific case you asked about (demucs), I'll answer the question for you and show you my approach:


I started out by opening the repo in VS Code and searching for the string Default across the whole codebase. That returned only a few results, which gave us some good starting points for where to look:

[Image: screenshot, not preserved]
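If you'd rather not use an editor, the same whole-codebase search can be sketched in a few lines of Python. This is only an illustration of the approach: the helper name find_string and the choice of file suffixes are my own assumptions, not anything from the repo.

```python
from pathlib import Path


def find_string(root, needle, suffixes=(".py", ".json")):
    """Return (path, line_number, line) for each line under root containing needle.

    Hypothetical helper for illustration; not part of audio-separator.
    """
    hits = []
    for path in sorted(Path(root).rglob("*")):
        # Only look at regular files with one of the chosen suffixes.
        if not path.is_file() or path.suffix not in suffixes:
            continue
        try:
            text = path.read_text(encoding="utf-8")
        except (UnicodeDecodeError, OSError):
            continue  # skip binary or unreadable files
        for line_number, line in enumerate(text.splitlines(), start=1):
            if needle in line:
                hits.append((str(path), line_number, line.strip()))
    return hits
```

Calling find_string(".", "Default") from the root of a checkout and printing each hit gives roughly what the editor search shows: a short list of files and line numbers to read through.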

So, in short, 'Default' for segment_size means 40. Hope this process helps you understand how you can answer questions like this yourself in future 😄
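To make the idea concrete, here is a minimal sketch of how a 'Default' placeholder can be turned into a number. It is not copied from the repo: the function name resolve_segment_size and the constant are hypothetical, and the only fact taken from this thread is the value 40.

```python
# Value reported in this thread for 'Default'; the constant name is hypothetical.
DEFAULT_SEGMENT_SIZE = 40


def resolve_segment_size(value):
    """Map the 'Default' placeholder to a number, otherwise parse the given value.

    Illustrative only; the real code may resolve this differently.
    """
    if value == "Default":
        return DEFAULT_SEGMENT_SIZE
    return int(value)
```

With this sketch, resolve_segment_size("Default") gives 40, while an explicit setting such as "10" is passed through as the integer 10.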

Bebra777228 commented 3 weeks ago

Oh yes, I have already resolved this issue. While waiting for a response, I completely forgot about it =)