Is it possible to use transformers.js to implement audio source separation tasks?

asasas234 commented 3 weeks ago

Question

Hello, I have a beginner's question.

I want to remove the human voice from the audio track of a video and keep the background sound, all in the browser. The idea is to load an audio source separation model with transformers.js, separate the human voice from the background sound, and then return only the background sound.

However, I couldn't find any relevant examples in the documentation, so I'm wondering whether this can be implemented. If so, what should I learn or research to get there?

Looking forward to your reply.

xenova commented 3 weeks ago

Hi there 👋 This library serves as a JavaScript port of the Python transformers library, so if you know of a model where you can do this, we can certainly look into it! Is something like https://huggingface.co/speechbrain/sepformer-wham what you're looking for?

asasas234 commented 3 weeks ago

Yes, SpeechBrain looks good, but I think Demucs would be the best fit. However, I'm a beginner in machine learning and Python, so I'm looking for the simplest solution that achieves my goal.
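
For reference, the "return only the background sound" step can be sketched independently of which model ends up doing the separation. This is a minimal sketch, assuming the model has already produced an estimated vocal track, and that both the original mix and the vocal estimate are mono PCM samples in `Float32Array` buffers at the same sample rate and aligned sample for sample. `removeVocals` is a hypothetical helper, not a transformers.js API, and the model call itself is not shown.

```typescript
// Hypothetical helper (not part of transformers.js): estimates the background
// track as the residual left after subtracting the separated vocals from the mix.
function removeVocals(mixture: Float32Array, vocals: Float32Array): Float32Array {
  if (mixture.length !== vocals.length) {
    throw new Error("mixture and vocals must have the same number of samples");
  }

  const background = new Float32Array(mixture.length);
  for (let i = 0; i < mixture.length; i++) {
    const residual = mixture[i] - vocals[i];
    // Clamp to the [-1, 1] range used for floating-point PCM audio.
    background[i] = Math.max(-1, Math.min(1, residual));
  }
  return background;
}
```

How clean the result sounds depends entirely on the quality of the vocal estimate, since any error in it ends up in the background track.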

xenova / transformers.js

Is it possible to use transformers.js to implement audio source separation tasks? #788