Sharrnah opened this issue 1 year ago
Visit both model pages:

- hf.co/pyannote/speaker-diarization
- hf.co/pyannote/segmentation

accept the user conditions of both models, and pass your user token when downloading the pretrained pipeline:

```python
from pyannote.audio import Pipeline

pipeline = Pipeline.from_pretrained("pyannote/speaker-diarization",
                                    use_auth_token="TOKEN HERE")
```
The whole reason to do offline loading is that you should not need a token for every user of the software. I can't expect every user to create a Hugging Face account, etc.
Edit `your/path/to/pyannote/speaker-diarization/config.yaml`:

```yaml
pipeline:
  name: pyannote.audio.pipelines.SpeakerDiarization
  params:
    clustering: AgglomerativeClustering
    embedding: your/path/to/speechbrain/spkrec-ecapa-voxceleb  # Folder; path must contain the `speechbrain` keyword
    embedding_batch_size: 32
    embedding_exclude_overlap: true
    segmentation: your/path/to/pyannote/segmentation/pytorch_model@2.1.bin  # File
    segmentation_batch_size: 32

params:
  clustering:
    method: centroid
    min_cluster_size: 15
    threshold: 0.7153814381597874
  segmentation:
    min_duration_off: 0.5817029604921046
    threshold: 0.4442333667381752
```
Then edit `pyannote/audio/pipelines/speaker_verification.py` (version 2.1.1):

```python
self.classifier_ = SpeechBrain_EncoderClassifier.from_hparams(
    source=self.embedding,
    savedir=self.embedding if Path(self.embedding).exists() else f"{CACHE_DIR}/speechbrain",
    run_opts={"device": self.device},
    use_auth_token=use_auth_token,
)
```
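The effect of the modified `savedir=` line can be illustrated in isolation. This is a standalone sketch, not pyannote's code: the function name and the cache path are made up, and `CACHE_DIR` here merely stands in for pyannote's own cache constant.

```python
from pathlib import Path

CACHE_DIR = "/tmp/hf-cache"  # hypothetical cache root, stands in for pyannote's CACHE_DIR

def resolve_savedir(embedding: str) -> str:
    # Mirrors the modified line: if `embedding` is an existing local folder,
    # SpeechBrain loads from (and saves to) that folder, so no download or
    # token is needed; otherwise it falls back to the shared cache.
    return embedding if Path(embedding).exists() else f"{CACHE_DIR}/speechbrain"

print(resolve_savedir("/no/such/folder"))  # → /tmp/hf-cache/speechbrain
```

This is why the modification enables offline use: when the config points `embedding` at an existing local folder, SpeechBrain never contacts the Hub.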
```python
from pyannote.audio import Pipeline

pipeline = Pipeline.from_pretrained("your/path/to/pyannote/speaker-diarization/config.yaml")
```
Thanks, but still no luck. I placed the spkrec-ecapa-voxceleb files beside the pipeline config and changed the config accordingly:
```yaml
pipeline:
  name: pyannote.audio.pipelines.SpeakerDiarization
  params:
    clustering: AgglomerativeClustering
    embedding: speechbrain/spkrec-ecapa-voxceleb
    embedding_batch_size: 32
    embedding_exclude_overlap: true
    segmentation: pytorch_model.bin
    segmentation_batch_size: 32

params:
  clustering:
    method: centroid
    min_cluster_size: 15
    threshold: 0.7153814381597874
  segmentation:
    min_duration_off: 0.5817029604921046
    threshold: 0.4442333667381752
```
and changed `speaker_verification.py` as you described.
You need to download the files from the speechbrain/spkrec-ecapa-voxceleb repository into the local speechbrain/spkrec-ecapa-voxceleb directory:

```
classifier.ckpt
embedding_model.ckpt
hyperparams.yaml
label_encoder.ckpt
mean_var_norm_emb.ckpt
```
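An incomplete local copy of those five files silently breaks the offline load, so checking for them up front can save debugging time. A minimal stdlib sketch (the function name is made up for illustration):

```python
from pathlib import Path

# The five checkpoint files the speechbrain/spkrec-ecapa-voxceleb folder must contain.
REQUIRED_FILES = [
    "classifier.ckpt",
    "embedding_model.ckpt",
    "hyperparams.yaml",
    "label_encoder.ckpt",
    "mean_var_norm_emb.ckpt",
]

def missing_files(local_dir: str) -> list[str]:
    # Return the required checkpoint files not present in local_dir.
    root = Path(local_dir)
    return [name for name in REQUIRED_FILES if not (root / name).is_file()]
```

An empty return value means the folder is complete and can be used as the `embedding:` path in the pipeline config.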
Hello @hbredin,
Would a PR containing the following point be accepted for offline model loading?
2. Edit `pyannote/audio/pipelines/speaker_verification.py` (version 2.1.1):

```python
self.classifier_ = SpeechBrain_EncoderClassifier.from_hparams(
    source=self.embedding,
    savedir=self.embedding if Path(self.embedding).exists() else f"{CACHE_DIR}/speechbrain",
    run_opts={"device": self.device},
    use_auth_token=use_auth_token,
)
```

The modified line is `savedir=self.embedding if Path(self.embedding).exists() else f"{CACHE_DIR}/speechbrain",`
at https://github.com/pyannote/pyannote-audio/blob/11b56a137a578db9335efc00298f6ec1932e6317/pyannote/audio/pipelines/speaker_verification.py#L260
I'd gladly have a look at a PR facilitating the offline use of pyannote. Would be nice to also update the related part of the documentation.
I will take a look and submit a PR.
The tutorial doesn't work? I'm getting the same error when running it:
```python
from pyannote.audio import Pipeline

pipeline = Pipeline.from_pretrained('pyannote/speaker-diarization', use_auth_token=True)
```

Error:

```
    model.eval()
    return model
AttributeError: 'NoneType' object has no attribute 'eval'
```
Hi @hbredin,

I am also facing a similar issue with offline usage of a VAD config (pyannote.audio version 2.1.1).

Config:

```yaml
pipeline:
  name: pyannote.audio.pipelines.VoiceActivityDetection
  params:
    segmentation: pytorch_model.bin

params:
  min_duration_off: 0.09791355693027545
  min_duration_on: 0.05537587440407595
  offset: 0.4806866463041527
  onset: 0.8104268538848918
```

Code:

```python
pipeline = Pipeline.from_pretrained("vad_config.yaml")
```

Error:

```
AttributeError: 'NoneType' object has no attribute 'eval'
```
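The `'NoneType' object has no attribute 'eval'` error means the model loader returned `None`, typically because a checkpoint path in the config could not be resolved. For a fully offline config, every `embedding:`/`segmentation:` value should point at a file or folder that actually exists on disk, so a quick pre-flight scan can surface the bad path before loading. A naive stdlib sketch (my own helper, assuming one entry per key and no YAML dependency):

```python
from pathlib import Path

def unresolved_paths(config_text: str) -> list[str]:
    # Scan the embedding:/segmentation: values in a config and report those
    # that do not exist on disk. For an offline config, every value should
    # resolve locally; anything reported here would be forwarded to the Hub.
    bad = []
    for line in config_text.splitlines():
        stripped = line.split("#", 1)[0].strip()  # drop comments
        for key in ("embedding:", "segmentation:"):
            if stripped.startswith(key):
                value = stripped[len(key):].strip()
                if value and not Path(value).exists():
                    bad.append(value)
    return bad
```

Running this on the config text before calling `Pipeline.from_pretrained` turns a cryptic `NoneType` failure into an explicit list of paths to fix.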
Hello. Are there plans to make offline use of the speaker-diarization-3.0 pipeline work? I tried the above suggestions to no avail.
pyannote models and pipelines have always been usable offline. The documentation is just... missing.

Also, feel free to make a PR improving the documentation!
Except that this is exactly what I tried some time ago without success. I would have to try it again to see if it works now, or if I was just missing some detail. So yes, updated documentation would help a lot, if someone gets this to work and updates it.
The issue is still there as of today: the model is not found for some reason and a None value is returned. Can someone look into this, please?
I did not see this issue when proposing the respective PR: https://github.com/pyannote/pyannote-audio/pull/1682
Please check whether the new tutorial addresses these issues: https://github.com/pyannote/pyannote-audio/blob/develop/tutorials/community/offline_usage_speaker_diarization.ipynb
I solved this by accepting both conditions as mentioned in the README.
Following your method, I still get the error `huggingface_hub.utils._validators.HFValidationError: Repo id must be in the form 'repo_name' or 'namespace/repo_name'`.
Here is my config.yaml:
```yaml
version: 3.1.0

pipeline:
  name: pyannote.audio.pipelines.SpeakerDiarization
  params:
    clustering: AgglomerativeClustering
    # embedding: pyannote/wespeaker-voxceleb-resnet34-LM
    embedding: /path/to/models/hbredin/wespeaker-voxceleb-resnet34-LM/speaker-embedding.onnx
    embedding_batch_size: 32
    embedding_exclude_overlap: true
    # segmentation: pyannote/segmentation-3.0
    segmentation: /path/to/models/pyannote/segmentation-3.0/pytorch_model.bin
    segmentation_batch_size: 32

params:
  clustering:
    method: centroid
    min_cluster_size: 12
    threshold: 0.7045654963945799
  segmentation:
    min_duration_off: 0.0
```
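That HFValidationError comes from huggingface_hub: when an `embedding:` or `segmentation:` value does not resolve to an existing local file, it is forwarded to the Hub as a repo id, and an absolute filesystem path fails the `'repo_name'` / `'namespace/repo_name'` check. The rough shape of the failure can be sketched as follows (my own heuristic for illustration, not huggingface_hub's actual validation code):

```python
import re
from pathlib import Path

def fails_hf_validation(value: str) -> bool:
    # A value that exists on disk is loaded locally and never validated as
    # a repo id. Anything else must look like "name" or "namespace/name":
    # no leading slash, at most one "/". An absolute path that is missing
    # on disk therefore triggers the HFValidationError seen above.
    return (not Path(value).exists()
            and re.fullmatch(r"[\w.\-]+(/[\w.\-]+)?", value) is None)
```

So the fix is usually to make sure the local paths in the config really exist on the machine running the code; once they do, they are never treated as repo ids in the first place.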
I am not sure if I am missing something. I followed the documentation on how to load a pipeline for speaker diarization offline.
I followed this description: https://github.com/pyannote/pyannote-audio/blob/develop/tutorials/applying_a_pipeline.ipynb
But I used the config.yaml from https://huggingface.co/pyannote/speaker-diarization instead of the VAD config used in the offline-use section, since I want speaker diarization and not voice activity detection. (That's a bit confusing, since the top of that description uses speaker diarization.)
I try to load it like this:
but I get the following error: