pyannote / pyannote-audio

Neural building blocks for speaker diarization: speech activity detection, speaker change detection, overlapped speech detection, speaker embedding
http://pyannote.github.io
MIT License
6.26k stars 774 forks

speaker-diarization-3.1 high memory usage #1580

Open metalgearsloth opened 11 months ago

metalgearsloth commented 11 months ago

So for context: pyannote/speaker-diarization works as intended. pyannote/speaker-diarization-3.0 had the CPU issue. pyannote/speaker-diarization-3.1 runs on my GPU, but memory usage is more than double what it was before. I've tried different versions of PyTorch from 2.0 onwards (via pip), but it still happens, and onnxruntime is definitely not installed. Going from ~6GB of memory to 14GB means performance tanks, as I only have 12GB of dedicated memory: diarizing the same file goes from a few minutes to an hour.

hbredin commented 11 months ago

You may want to try and reduce pipeline.embedding_batch_size that defaults to 32.
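For anyone landing here, a minimal sketch of what that suggestion could look like. The `embedding_batch_size` attribute is the one named above; the helper names `set_embedding_batch_size` and `load_pipeline_on_gpu` are made up for illustration, and the `use_auth_token` keyword is an assumption about the installed pyannote.audio version (check the signature of `Pipeline.from_pretrained` in your install).

```python
from types import SimpleNamespace


def set_embedding_batch_size(pipeline, batch_size=1):
    """Lower the embedding batch size on an already-loaded pipeline.

    Returns the previous value so it can be restored later.
    """
    if batch_size < 1:
        raise ValueError("batch_size must be a positive integer")
    previous = getattr(pipeline, "embedding_batch_size", None)
    pipeline.embedding_batch_size = batch_size
    return previous


def load_pipeline_on_gpu(auth_token):
    """Hypothetical loader: imports are lazy so the helper above
    can be used (and tested) without pyannote.audio installed."""
    import torch
    from pyannote.audio import Pipeline

    # Assumption: this keyword is accepted by the installed version.
    pipeline = Pipeline.from_pretrained(
        "pyannote/speaker-diarization-3.1", use_auth_token=auth_token
    )
    pipeline.to(torch.device("cuda"))
    return pipeline


# Stand-in object so the snippet runs without downloading a model.
demo = SimpleNamespace(embedding_batch_size=32)
old_value = set_embedding_batch_size(demo, batch_size=1)
print(old_value, demo.embedding_batch_size)
```

With a real pipeline, the call would be `set_embedding_batch_size(load_pipeline_on_gpu(token), 1)` before running diarization. Whether 1 is the best value depends on the GPU and the audio, so it is probably worth trying a few sizes between 1 and 32.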

Majdoddin commented 6 months ago

With pipeline.embedding_batch_size=1, it used ~0.5GB less RAM. Interestingly, runtime also improved significantly!
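To compare settings like this more systematically, here is a rough sketch of a timing helper. The name `measure` is hypothetical, and it assumes PyTorch's own allocator statistics are a good enough proxy: `torch.cuda.max_memory_allocated()` only counts tensors allocated through PyTorch, so the number will not match what `nvidia-smi` reports, and it says nothing about system RAM.

```python
import time


def measure(run, *args, **kwargs):
    """Call run(*args, **kwargs) and return (result, seconds, peak_gpu_bytes).

    peak_gpu_bytes is None when PyTorch or CUDA is unavailable.
    """
    try:
        import torch
        cuda = torch.cuda.is_available()
    except ImportError:
        torch, cuda = None, False

    if cuda:
        # Start from a clean peak so earlier work is not counted.
        torch.cuda.reset_peak_memory_stats()

    start = time.perf_counter()
    result = run(*args, **kwargs)
    if cuda:
        # Wait for queued GPU work before stopping the clock.
        torch.cuda.synchronize()
    seconds = time.perf_counter() - start

    peak = torch.cuda.max_memory_allocated() if cuda else None
    return result, seconds, peak
```

Assuming `pipeline` is an already-loaded diarization pipeline, something like `measure(pipeline, "audio.wav")` after each change to `pipeline.embedding_batch_size` would give comparable numbers for the same file.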