new 2nd 2-stems pretrained model (speech-voice|music) . [Feature]

deezer / spleeter

Deezer source separation library including pretrained models.

https://research.deezer.com/projects/spleeter.html

MIT License

25.92k stars 2.84k forks

new 2nd 2-stems pretrained model (speech-voice|music) . [Feature] #455

Open DI555 opened 4 years ago

DI555 commented 4 years ago

Description

I'd like to ask for a new, second 2-stems pretrained model that splits audio into speech voice and music! It's highly wanted for audiobooks, where the stream very often needs to be split. For personal use, for example, it would let you keep only the speech voice of an audiobook, and sites could let users switch speech voice and music on or off separately!

DI555 commented 4 years ago

For those who didn't get what it's for: it's mainly for audiobook fans who hate music in audiobook recordings, or who want to split audiobooks and/or make the speech voice or the music louder or quieter!

androidfan415 commented 4 years ago

what's wrong with using the current 2stems model? use the vocals for the narration and accompaniment for the music, etc...
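A minimal sketch of that suggestion, assuming Spleeter is installed and that its Python API (`Separator` and `separate_to_file`) writes stems using the default `{filename}/{instrument}.wav` layout. The helper names `expected_stem_paths` and `separate_audiobook` are made up for illustration and are not part of Spleeter:

```python
from pathlib import Path


def expected_stem_paths(audio_path, output_dir):
    """Return where the two stems should land, assuming Spleeter's
    default output layout: <output_dir>/<input name>/<stem>.wav"""
    name = Path(audio_path).stem
    return {
        stem: Path(output_dir) / name / f"{stem}.wav"
        for stem in ("vocals", "accompaniment")
    }


def separate_audiobook(audio_path, output_dir):
    """Run the existing 2stems model on an audiobook file.

    'vocals' is treated as the narration and 'accompaniment' as the
    background music. Whether the quality is good enough for speech
    is exactly what this thread is asking, so check the output by ear.
    """
    # Imported lazily so the path helper works without Spleeter installed.
    from spleeter.separator import Separator

    separator = Separator("spleeter:2stems")
    separator.separate_to_file(str(audio_path), str(output_dir))
    return expected_stem_paths(audio_path, output_dir)
```

Calling `separate_audiobook("chapter01.mp3", "out")` should then leave the narration in `out/chapter01/vocals.wav` and the music in `out/chapter01/accompaniment.wav`, if the default layout assumption holds for your Spleeter version.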

DI555 commented 4 years ago

I doubted that the current model works well on the speech voice of an audiobook stream... IMO there's a big difference between a speech voice and a singing voice...

nmstoker commented 4 years ago

@DI555 - your last point implies that you are basing this on your pre-conceived ideas about speech / singing voices rather than any evidence that there's a problem with using the 2stems model.

How are people meant to assess whether there's really a problem worth solving here if you haven't actually quantified the concern (i.e. it's merely your theoretical ponderings) - or am I wrong and you've got something tangible you've tried and merely forgot to mention it?

If there is a genuine problem with output then I'd be all for such a model (as I appreciate the value in your use-case) but when I've tried 2stems with a few podcasts, it works pretty well and so far I've yet to observe a problem with it.