pyannote / pyannote-audio

Neural building blocks for speaker diarization: speech activity detection, speaker change detection, overlapped speech detection, speaker embedding
http://pyannote.github.io
MIT License

Can I download a separate audio file for each speaker? #1570

Status: Open. Opened by girayyagmur 11 months ago

girayyagmur commented 11 months ago

First of all, thank you for providing a good module.

Suppose speaker1 speaks from 00:03 to 00:10 and speaker2 speaks from 00:07 to 00:15. The two voices will then be mixed from 00:07 to 00:10.

But I want to get separate voice files:

file1: speaker1.mp3 (00:03 ~ 00:10), without speaker2's voice
file2: speaker2.mp3 (00:07 ~ 00:15), without speaker1's voice

Is this impossible in pyannote?

girayyagmur commented 11 months ago

This is not the answer I expected

hbredin commented 11 months ago

What you are looking for is speaker separation, not speaker diarization. pyannote does not do that... yet... but we are working on it!

In the meantime, you might want to have a look at asteroid.
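To make the distinction concrete with the timings from the question: diarization only says who speaks when, so cutting the recording at the diarized boundaries still leaves both voices in the overlapping region. The sketch below is plain Python and does not use pyannote; the `overlap` helper and the tuple-based segment representation are hypothetical, chosen only for illustration.

```python
from typing import Optional, Tuple

# A segment is a (start, end) pair in seconds. This is an illustrative
# representation, not pyannote's own segment type.
Segment = Tuple[float, float]


def overlap(a: Segment, b: Segment) -> Optional[Segment]:
    """Return the time span shared by two segments, or None if they are disjoint."""
    start = max(a[0], b[0])
    end = min(a[1], b[1])
    return (start, end) if start < end else None


speaker1 = (3.0, 10.0)   # speaker1 talks from 00:03 to 00:10
speaker2 = (7.0, 15.0)   # speaker2 talks from 00:07 to 00:15

# Any file cut from the original recording along either segment still
# contains both voices inside this span.
mixed = overlap(speaker1, speaker2)
print(mixed)
```

Removing the other voice inside that span requires a model that estimates one waveform per speaker, which is what speaker separation refers to.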

girayyagmur commented 11 months ago

Thank you for your answer. I tried the Asteroid library, but it wasn't that effective. I am looking forward to your work. Good luck.

wasabi9 commented 7 months ago

Has there been any further work on this?

SentinalMax commented 2 months ago

"has there been any further work on this?"

I'm wondering the same.

hbredin commented 2 months ago

We have recently released a separation pipeline: https://huggingface.co/pyannote/speech-separation-ami-1.0

Note that it has been trained on a small dataset (AMI), so it might not generalize well to other conditions.
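For readers who want to try it, here is a minimal sketch of writing one audio file per speaker with that pipeline. It follows the usage pattern shown on the model card as best I recall it, so treat the details as assumptions to check against the card: a pyannote.audio 3.x release recent enough to include the separation pipeline, the `use_auth_token` argument (newer releases may name it differently), a pipeline that returns a `(diarization, sources)` pair, `sources.data` shaped `(samples, speakers)`, and 16 kHz output. The `separate_speakers` wrapper and its parameters are hypothetical names introduced here.

```python
from pathlib import Path


def separate_speakers(audio_path: str, hf_token: str, out_dir: str = "."):
    """Write one WAV file per detected speaker and return the written paths.

    Sketch only: argument names and return shapes are assumptions based on
    the model card, not a verified description of the pipeline's API.
    """
    # Imported lazily so this module can be loaded without the optional
    # dependencies installed.
    import scipy.io.wavfile
    from pyannote.audio import Pipeline

    # The model is gated on Hugging Face, so an access token is needed.
    pipeline = Pipeline.from_pretrained(
        "pyannote/speech-separation-ami-1.0",
        use_auth_token=hf_token,
    )

    # Assumed to return the diarization plus the separated waveforms.
    diarization, sources = pipeline(audio_path)

    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)

    written = []
    for index, speaker in enumerate(diarization.labels()):
        path = out / f"{speaker}.wav"
        # Column `index` is assumed to hold the waveform for `speaker`,
        # sampled at 16 kHz.
        scipy.io.wavfile.write(str(path), 16000, sources.data[:, index])
        written.append(path)
    return written
```

A call such as `separate_speakers("audio.wav", "YOUR_HF_TOKEN", "separated")` would then be expected to produce files named after the diarization labels. WAV is used here rather than the MP3 format mentioned in the question; converting afterwards is a separate step.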