sipercai closed this issue 2 days ago
Low CPU usage / low inference speed seems to be a fairly common issue on the pyannote repository; see https://github.com/pyannote/pyannote-audio/issues/1652, https://github.com/pyannote/pyannote-audio/issues/1566, and https://github.com/pyannote/pyannote-audio/issues/1702 (and probably others). I have not made significant efforts to optimize the pipeline, but the main fix I think you should try is to pass a dict to the pipeline instead of a ProtocolFile.
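For reference, here is a minimal sketch of what passing a dict could look like. It assumes the in-memory input form described in the pyannote.audio README (a dict with "waveform" and "sample_rate" keys). The helper names and the audio path are hypothetical, and I have not benchmarked this.

```python
def to_memory_input(waveform, sample_rate):
    """Wrap an already-loaded waveform in a plain dict.

    Assumption: the pipeline accepts {"waveform": (channel, time) tensor,
    "sample_rate": int}, as described in the pyannote.audio README.
    """
    return {"waveform": waveform, "sample_rate": int(sample_rate)}


def diarize_from_memory(pipeline, path):
    """Load the audio once, then hand the pipeline a dict (hypothetical helper)."""
    # Imported lazily so the helper above stays usable without torchaudio.
    import torchaudio

    waveform, sample_rate = torchaudio.load(path)  # shape: (channel, time)
    return pipeline(to_memory_input(waveform, sample_rate))
```

Usage would be something like diarize_from_memory(pipeline, "audio.wav"), where pipeline is the diarization pipeline you already set up. The idea is that the audio is decoded once up front rather than being re-read from disk inside the pipeline.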
Thank you very much for finding these materials for me. They are very helpful to me. I will carefully refer to the content in these links and hope to find solutions to the problem from them. If there are any new discoveries or questions during the research process, I will communicate with you in a timely manner. Thank you again for your help!
Description:
When running the speaker diarization pipeline, I notice that CPU usage is extremely high, which seems to prevent the GPU utilization from increasing significantly. This results in a bottleneck where the pipeline does not fully leverage the GPU for computation.
I suspect the issue might stem from the use of AgglomerativeClustering in the pipeline, as it operates on the CPU. It appears that the clustering process is consuming significant CPU resources, overshadowing the benefits of using the GPU for segmentation and embedding extraction.
Code Example:
Here is the relevant code snippet where I set up the pipeline:
Observations:
AgglomerativeClustering might be using the CPU for its calculations, which could explain the bottleneck.
Suggestions/Questions:
Is AgglomerativeClustering CPU-bound? If so, are there any GPU-accelerated alternatives for clustering in this context?
Any guidance or recommendations on addressing this issue would be greatly appreciated!
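One way to check whether clustering really dominates CPU time is to profile a single pipeline call and see which functions accumulate the most time. Below is a rough sketch using only the standard library's cProfile. The helper name profile_call is hypothetical, and the profile only covers Python-level calls, so asynchronous GPU work may show up at whatever point it is synchronized rather than where it was launched.

```python
import cProfile
import io
import pstats


def profile_call(fn, *args, top=15, **kwargs):
    """Run fn(*args, **kwargs) under cProfile.

    Returns (result, report), where report lists the `top` entries
    sorted by cumulative time.
    """
    profiler = cProfile.Profile()
    result = profiler.runcall(fn, *args, **kwargs)

    buffer = io.StringIO()
    stats = pstats.Stats(profiler, stream=buffer)
    stats.sort_stats("cumulative").print_stats(top)
    return result, buffer.getvalue()
```

For example, diarization, report = profile_call(pipeline, audio_input) followed by print(report), where pipeline and audio_input are whatever you already pass today. If clustering-related functions sit near the top of the report, that would support the suspicion above. If audio loading or embedding extraction dominates instead, the bottleneck is probably elsewhere.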