bschreck opened 3 months ago
Okay, I dug through the code and see that the actual start/end times are created later, in `to_diarization` or `to_annotation`.
However, diarizing the new audio file this way, using existing clusters (with the same speaker, me), produces totally different (and very bad) annotations compared to just running the pretrained pipeline on the file directly. Running the pipeline by itself produces this set of segments:
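For context, here is a minimal sketch of how frame-level segmentation scores turn into start/end segments. It assumes a simple threshold-and-merge rule over synthetic scores; the real `to_diarization`/`to_annotation` logic in pyannote is more involved, so treat this only as an illustration of where start/ends come from:

```python
import numpy as np

def scores_to_segments(scores, frame_duration, threshold=0.5):
    """Binarize per-frame speaker scores into (speaker, start, end) tuples.

    `scores` has shape (num_frames, num_speakers); a speaker is active in a
    frame when its score exceeds `threshold`, and consecutive active frames
    are merged into one segment. This is a rough stand-in, not pyannote's
    actual implementation.
    """
    num_frames, num_speakers = scores.shape
    segments = []
    for spk in range(num_speakers):
        active = scores[:, spk] > threshold
        start = None
        for f in range(num_frames):
            if active[f] and start is None:
                start = f  # segment opens on first active frame
            elif not active[f] and start is not None:
                segments.append(
                    (f"SPEAKER_{spk:02d}", start * frame_duration, f * frame_duration)
                )
                start = None
        if start is not None:  # segment still open at end of audio
            segments.append(
                (f"SPEAKER_{spk:02d}", start * frame_duration, num_frames * frame_duration)
            )
    return sorted(segments, key=lambda s: s[1])

# Synthetic scores: speaker 0 active in frames 0-3, speaker 1 in frames 5-8.
scores = np.zeros((10, 2))
scores[0:4, 0] = 0.9
scores[5:9, 1] = 0.8
print(scores_to_segments(scores, frame_duration=0.5))
# → [('SPEAKER_00', 0.0, 2.0), ('SPEAKER_01', 2.5, 4.5)]
```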
```
DiarizationSegment(
    speaker="SPEAKER_04", start=1.1370997453310672, end=2.461378183361628
),
DiarizationSegment(
    speaker="SPEAKER_00", start=4.193126910016975, end=5.466471561969438
),
DiarizationSegment(
    speaker="SPEAKER_04", start=5.755096349745333, end=6.4172355687606135
),
DiarizationSegment(
    speaker="SPEAKER_00", start=8.182940152801354, end=10.271225382003397
),
DiarizationSegment(
    speaker="SPEAKER_04", start=11.35781281833616, end=12.953738115449912
),
DiarizationSegment(
    speaker="SPEAKER_04", start=13.344230475382002, end=14.51570755517827
),
```
The method I described, using existing clusters, instead gives me:

```
[DiarizationSegment(speaker='SPEAKER_00', start=5.00909375, end=5.75159375),
 DiarizationSegment(speaker='SPEAKER_01', start=5.75159375, end=6.443468750000001),
 DiarizationSegment(speaker='SPEAKER_00', start=6.443468750000001, end=6.59534375)]
```
This is totally different.
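To make "totally different" concrete, here is a self-contained sketch that measures how little the two outputs overlap. The `DiarizationSegment` dataclass below is a plain stand-in for illustration, not pyannote's actual type:

```python
from dataclasses import dataclass

@dataclass
class DiarizationSegment:
    """Plain stand-in for a diarization segment (hypothetical)."""
    speaker: str
    start: float
    end: float

def overlap_duration(a, b):
    """Total time where any segment of `a` overlaps any segment of `b`."""
    total = 0.0
    for x in a:
        for y in b:
            total += max(0.0, min(x.end, y.end) - max(x.start, y.start))
    return total

# Output of the pretrained pipeline run directly on the file.
direct = [
    DiarizationSegment("SPEAKER_04", 1.1370997453310672, 2.461378183361628),
    DiarizationSegment("SPEAKER_00", 4.193126910016975, 5.466471561969438),
    DiarizationSegment("SPEAKER_04", 5.755096349745333, 6.4172355687606135),
    DiarizationSegment("SPEAKER_00", 8.182940152801354, 10.271225382003397),
    DiarizationSegment("SPEAKER_04", 11.35781281833616, 12.953738115449912),
    DiarizationSegment("SPEAKER_04", 13.344230475382002, 14.51570755517827),
]
# Output when reusing the existing clusters.
clustered = [
    DiarizationSegment("SPEAKER_00", 5.00909375, 5.75159375),
    DiarizationSegment("SPEAKER_01", 5.75159375, 6.443468750000001),
    DiarizationSegment("SPEAKER_00", 6.443468750000001, 6.59534375),
]

direct_total = sum(s.end - s.start for s in direct)
clustered_total = sum(s.end - s.start for s in clustered)
print(round(direct_total, 2), round(clustered_total, 2),
      round(overlap_duration(direct, clustered), 2))
# → 8.12 1.59 1.12
```

So the cluster-reusing run covers roughly 1.6 s of speech versus 8.1 s from the direct run, and barely 1.1 s of the two outputs even overlap in time, ignoring speaker labels entirely.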
Tested versions
3.1
System information
macOS 13.6 - pyannote 3.1 - M2 Air
Issue description
I'm running:

```python
self.pipeline = Pipeline.from_pretrained(
    "pyannote/speaker-diarization-3.1",
    use_auth_token=os.environ["HF_API_KEY"],
)
segmentations = self.pipeline.get_segmentations(
    {"waveform": torch.from_numpy(waveform), "sample_rate": sample_rate}
)
splits = [(segment, data) for segment, data in segmentations]
```