sevagh / demucs.cpp

C++17 port of Demucs v3 (hybrid) and v4 (hybrid transformer) models with ggml and Eigen3
https://freemusicdemixer.com/
MIT License

Two stem model #19

Closed · PiootrK closed this issue 4 months ago

PiootrK commented 4 months ago

Is there a way to use a two-stem model, like --two-stems vocals option in demucs?

sevagh commented 4 months ago

It shouldn't be too difficult to add new code to support this - but I need a weights file to work from to incorporate the slightly different tensor shapes. Do you have a link to a weights file?

So far I have only implemented inference for 4- or 6-source pretrained models (htdemucs, htdemucs_ft, htdemucs_6s, and hdemucs_mmi).

PiootrK commented 4 months ago

I tried to find one, and it looks like the two-stem version is in fact the same 4-source model with the non-requested sources mixed together afterwards, so it gives no performance benefit. I am closing the issue.
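For anyone landing here later: since upstream Demucs's two-stem mode just runs the full 4-source model and sums the non-requested sources into one accompaniment track, the same effect can be reproduced as a post-processing step on demucs.cpp's 4-source output. A minimal sketch (the buffer layout and function name are hypothetical, not demucs.cpp's actual API):

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical layout: targets[source][sample], e.g. 4 sources from htdemucs
// (0: drums, 1: bass, 2: other, 3: vocals), mono for simplicity.
// Sums every source except vocals_idx into a single accompaniment track.
std::vector<float> mix_accompaniment(const std::vector<std::vector<float>>& targets,
                                     std::size_t vocals_idx) {
    std::vector<float> accompaniment(targets[0].size(), 0.0f);
    for (std::size_t s = 0; s < targets.size(); ++s) {
        if (s == vocals_idx) continue;  // keep vocals as its own stem
        for (std::size_t i = 0; i < targets[s].size(); ++i)
            accompaniment[i] += targets[s][i];
    }
    return accompaniment;
}
```

The vocals stem is then `targets[3]` as-is, and the accompaniment is the sum above, matching what `--two-stems vocals` produces in the Python version.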

Thank you very much for your response; the project is very nicely written. I will probably end up using the original Demucs in Python, as it runs faster (which is a shock to me: I am compiling the C++ with the latest Visual C++, yet the multithreaded build still takes 50% longer than Python), but I appreciate what you have done here.

sevagh commented 4 months ago

The Python version of Demucs uses PyTorch, which is a Python wrapper around Torch's core C++ tensor library: https://pytorch.org/cppdocs/installing.html

Libtorch (the core parts of PyTorch) is a very sophisticated tensor library written in C++. I'm not surprised it's faster than this code.

In demucs.cpp, I rewrote all of the tensor operations myself. That way, it doesn't depend on a huge library like Libtorch (so it's easier to compile for WebAssembly, Android, etc.). But I don't think I would beat the speed of Libtorch and their professional engineering team. I also aim for low memory usage by avoiding large matrix multiplications (which means slower runtime).
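As an illustration of that memory/speed tradeoff (my framing, not a claim about demucs.cpp's exact internals): a convolution can be computed either by materializing a large im2col buffer and calling one big, cache-friendly GEMM, or directly, with no scratch allocation at all. A sketch of the direct, low-memory form for 1-D "valid" convolution (cross-correlation, as deep learning frameworks define it):

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Direct 1-D valid convolution: no im2col scratch buffer is allocated,
// trading the throughput of one large matrix multiply for minimal
// extra memory (only the output vector itself).
std::vector<float> conv1d_direct(const std::vector<float>& x,
                                 const std::vector<float>& w) {
    std::vector<float> y(x.size() - w.size() + 1, 0.0f);
    for (std::size_t i = 0; i < y.size(); ++i)
        for (std::size_t k = 0; k < w.size(); ++k)
            y[i] += x[i + k] * w[k];  // accumulate one output sample
    return y;
}
```

The im2col + GEMM route is typically faster on large inputs because highly tuned BLAS kernels dominate, but its scratch buffer can be many times larger than the input, which matters when targeting WebAssembly or mobile.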

So yes, "C++ is faster than Python," but most scientific Python libraries (NumPy, PyTorch, SciPy, etc.) use C/C++/Fortran or other compiled languages under the hood, with only a thin Python layer on top.