Status: Closed (Winchester37 closed 1 month ago)
I think you are confusing pyannote's "models" (pyannote.audio.models.....) and pyannote's "pipelines" (pyannote.audio.pipelines.....).
The model that you fine-tune/train is the 'segmentation' model. It performs speaker diarization locally, on windows of duration=5.0 seconds.
To obtain the final diarization output on a whole audio file, the pipeline needs to aggregate multiple outputs of this local segmentation model. See the paper "pyannote.audio 2.1 speaker diarization pipeline: principle, benchmark, and recipe" for more details.
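To make the aggregation idea concrete, here is a minimal, library-free sketch of overlap-averaging scores from sliding windows. It is not pyannote's implementation: the function name, the frame counts, and the scores are all made up for illustration, and it only handles a single activity track. The real pipeline additionally has to match speakers across windows (via embeddings and clustering), which this sketch ignores.

```python
def aggregate_windows(window_scores, frames_per_step, num_frames):
    """Average per-frame scores from overlapping windows (hypothetical helper).

    window_scores: list of lists, one list of per-frame scores per window.
    frames_per_step: how many frames each window is shifted from the previous one.
    num_frames: total number of frames in the whole file.
    """
    sums = [0.0] * num_frames
    counts = [0] * num_frames
    for w, scores in enumerate(window_scores):
        offset = w * frames_per_step  # first file-level frame covered by window w
        for i, score in enumerate(scores):
            t = offset + i
            if t < num_frames:
                sums[t] += score
                counts[t] += 1
    # Frames covered by no window get a score of 0.0
    return [s / c if c else 0.0 for s, c in zip(sums, counts)]


# Toy setup (assumed values): 9 frames in the file, windows of 5 frames,
# shifted by 2 frames, so windows start at frames 0, 2 and 4.
local_outputs = [
    [1.0] * 5,  # window 0: active everywhere
    [0.0] * 5,  # window 1: inactive everywhere
    [1.0] * 5,  # window 2: active everywhere
]
aggregated = aggregate_windows(local_outputs, frames_per_step=2, num_frames=9)
```

Frames covered by several windows end up with the mean of the local scores, which is why a fine-tuned segmentation model on its own does not give you a file-level diarization: you need the surrounding pipeline to do this stitching (plus the speaker clustering).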
There may be examples in a pyannote tutorial notebook, but I can't remember which one, so here is a pretty complete notebook about training a model and testing its pipeline (in particular the "Adapted pipeline output" section).
Tested versions
pyannote.audio 3.1.1
System information
Windows 11 - pyannote.audio 3.1.1
Issue description
I have successfully fine-tuned a pyannote.audio model for speaker diarization on a custom dataset, but I am having difficulty testing the fine-tuned model. Despite following the documentation and adjusting the paths to the model checkpoint and configuration file, I get errors when I try to run the model on a new audio file.
Here's the training code snippet I used for fine-tuning: