jarredou / MVSEP-MDX23-Colab_v2

Colab adaptation of MVSep Model for MDX23 music separation contest

Question about future development of the model #17

Closed Bebra777228 closed 2 months ago

Bebra777228 commented 2 months ago

Hi! I really enjoyed your Colab Notebook and I have a couple of questions:

  1. Is there a plan to implement a feature for adding custom separation models?
  2. Will DeEcho-DeReverb and DeNoise models be added for more efficient separation?

Thank you in advance for your answer!

jarredou commented 2 months ago

Most of the upgrades are done after a new good model has been publicly released. I don't have the resources to train good models myself, so evolution will continue this way.

I don't have plans to implement VR arch models. I could probably add the FoxJoy dereverb model from UVR easily, though (it's MDX-Net based, and the code is already implemented for VocFT and InstHQ4), but hopefully better models for that task will be released this year.

deton24 commented 2 months ago

You can use these VR models here https://huggingface.co/spaces/r3gm/Ultimate-Vocal-Remover-WebUI

On Sat, 13 Apr 2024 at 20:12, Jarredou wrote:

Most of the upgrades are done after a new good model has been publicly released. I don't have the resources to train good models myself, so evolution will continue this way.

I don't have plans to implement VR arch models. I could probably add the FoxJoy dereverb model from UVR easily, though, but hopefully better models for that task will be released this year.

Bebra777228 commented 2 months ago

With each new model, the load on the GPU increases and can reach 13-14 GB, which slows down processing. Will any optimization be implemented to reduce the load on the GPU?

jarredou commented 2 months ago

You can lower the BigShifts parameter to reduce the number of passes per model. I've already lowered the new default value (it was 7 in the previous version). The slowdown is mainly because BS-Roformer processing is 3x slower than MDX23C (InstVoc).

I'll look into adding a "low memory" setting. It should be easy and quick to add, but it will only affect vocals/instrumentals separation. For the 4-stem separation, where Demucs is used, it would need a full code rewrite, and I didn't have time to do that for the latest update. It's on my to-do list for the next version, but I don't know when I'll have time to spend on it.
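
For readers wondering why lowering BigShifts reduces processing time, here is a rough sketch of a shift-and-average inference loop. It assumes BigShifts controls how many time-shifted copies of the mix are passed through each model before the results are realigned and averaged; this is an illustration only, not the notebook's actual code. The names `separate_with_shifts`, `model`, `num_shifts` and `shift_step` are hypothetical.

```python
import numpy as np


def separate_with_shifts(mix, model, num_shifts, shift_step):
    """Average a model's output over circularly shifted copies of the input.

    mix:        array of shape (channels, samples)
    model:      hypothetical callable, (channels, samples) -> (channels, samples)
    num_shifts: number of passes (the role assumed here for BigShifts)
    shift_step: offset between passes in samples (assumed, not from the notebook)
    """
    if num_shifts < 1:
        raise ValueError("num_shifts must be at least 1")

    accumulated = np.zeros_like(mix, dtype=np.float64)
    for i in range(num_shifts):
        offset = i * shift_step
        # Shift the input in time, run the model, then undo the shift
        # so that every estimate is aligned with the original mix.
        shifted = np.roll(mix, offset, axis=-1)
        estimate = model(shifted)
        accumulated += np.roll(estimate, -offset, axis=-1)

    # One model call per shift: cost grows linearly with num_shifts.
    return accumulated / num_shifts


if __name__ == "__main__":
    calls = 0

    def dummy_model(x):
        # Stand-in for a real separation model; it only counts calls.
        global calls
        calls += 1
        return x * 0.5

    mix = np.random.default_rng(0).standard_normal((2, 44100))
    out = separate_with_shifts(mix, dummy_model, num_shifts=3, shift_step=4410)
    print(out.shape, calls)
```

Under this assumption, each extra shift adds one full model pass, so dropping from 7 to an illustrative value of 3 would cut the number of passes per model to 3/7 of what it was, at the cost of averaging over fewer estimates.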