pyannote / pyannote-audio

Neural building blocks for speaker diarization: speech activity detection, speaker change detection, overlapped speech detection, speaker embedding
http://pyannote.github.io
MIT License

SpeakerDiarization.get_segmentations() raises for several input type variants #1684

Open bschreck opened 6 months ago

bschreck commented 6 months ago

Tested versions

System information

macOS 13.6, pyannote.audio 3.1, M2 MacBook Air

Issue description

Two input variants that, according to the docstring, should be valid instead raise a ValueError:

import os
import torch
from pyannote.audio import Audio, Pipeline

pipeline = Pipeline.from_pretrained(
    "pyannote/speaker-diarization-3.1", use_auth_token=os.environ["HF_API_KEY"]
)

# waveform is a 2D numpy array
audio = Audio()({"waveform": waveform, "sample_rate": sample_rate})
segmentations = pipeline.get_segmentations(audio)  # raises
segmentations = pipeline.get_segmentations({"waveform": waveform, "sample_rate": sample_rate})  # raises
segmentations = pipeline.get_segmentations({"waveform": torch.from_numpy(waveform), "sample_rate": sample_rate})  # succeeds
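
Since the last call succeeds once the array is wrapped with torch.from_numpy, a possible workaround until numpy inputs are accepted is to convert the waveform before handing it to the pipeline. This is a minimal sketch: `as_tensor_audio` is a hypothetical helper name, not part of pyannote.audio, and the 2D check only mirrors the "2d array" description above rather than any documented requirement.

```python
import numpy as np
import torch


def as_tensor_audio(waveform, sample_rate):
    """Hypothetical helper: build an in-memory audio dict whose waveform is a torch.Tensor."""
    # Wrap numpy arrays as tensors; torch.from_numpy shares memory with the array
    if isinstance(waveform, np.ndarray):
        waveform = torch.from_numpy(waveform)
    if not isinstance(waveform, torch.Tensor):
        raise TypeError(f"unsupported waveform type: {type(waveform).__name__}")
    # Assumption: the waveform is 2D, as in the report above
    if waveform.ndim != 2:
        raise ValueError(f"expected a 2D waveform, got {waveform.ndim} dimension(s)")
    return {"waveform": waveform, "sample_rate": sample_rate}
```

With this in place, the failing call would become `pipeline.get_segmentations(as_tensor_audio(waveform, sample_rate))`.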

Minimal reproduction example (MRE)

see above

stale[bot] commented 15 hours ago

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.