MahmoudAshraf97 / whisper-diarization

Automatic Speech Recognition with Speaker Diarization based on OpenAI Whisper
BSD 2-Clause "Simplified" License

Issue in NeMo MSDD diarization model #34

Kirankumar2609 closed 1 year ago

Kirankumar2609 commented 1 year ago

While trying to diarize using the MSDD model, I get the following error:

PicklingError: Can't pickle <class 'nemo.collections.common.parts.preprocessing.collections.SpeechLabelEntity'>: attribute lookup SpeechLabelEntity on nemo.collections.common.parts.preprocessing.collections failed

Please let me know if anyone has encountered this and solved it.
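For context on the error above, here is a minimal sketch of one way this kind of PicklingError can arise, using only the standard library. It assumes (this is not confirmed anywhere in the thread) that the class is a namedtuple whose type name differs from the variable it is bound to. `OUTPUT_TYPE` and `pickles_cleanly` are hypothetical names used only for illustration.

```python
import collections
import pickle

# Assumption for illustration: the namedtuple's type name ("SpeechLabelEntity")
# is not the name it is bound to (OUTPUT_TYPE). Pickle stores classes by name,
# so it looks for an attribute called "SpeechLabelEntity" on the defining
# module and does not find one.
OUTPUT_TYPE = collections.namedtuple(
    "SpeechLabelEntity", "audio_file duration label"
)


def pickles_cleanly(obj) -> bool:
    """Return True if obj can be serialized with pickle, False otherwise."""
    try:
        pickle.dumps(obj)
        return True
    except (pickle.PicklingError, AttributeError):
        # Raised when the class cannot be found by name on its module.
        return False


# Hypothetical sample values.
entity = OUTPUT_TYPE("sample.wav", 1.5, "speaker_0")

print(pickles_cleanly(("sample.wav", 1.5, "speaker_0")))  # a plain tuple
print(pickles_cleanly(entity))  # the mismatched namedtuple instance
```

If something like this is the cause, it would also explain why the number of data loader workers matters: worker processes may need the dataset pickled before it is sent to them (for instance under the spawn start method), whereas with zero workers loading stays in the main process and nothing has to be pickled.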

corneliusgerico commented 1 year ago

I had that issue and fixed it by changing num_workers to 0.

smxsm commented 1 year ago

Yep, changing num_workers to 0 in helpers.py fixed it for me, too (running on M1 CPU)
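A rough sketch of the workaround described above, written against a plain nested dict rather than the project's real configuration object. The helper name `disable_workers` and every key other than `num_workers` are hypothetical, and where exactly the setting lives in helpers.py is not shown in this thread.

```python
from copy import deepcopy


def disable_workers(config: dict) -> dict:
    """Return a copy of config with every 'num_workers' entry set to 0."""
    patched = deepcopy(config)  # leave the caller's config untouched
    stack = [patched]
    while stack:
        node = stack.pop()
        for key, value in node.items():
            if key == "num_workers":
                node[key] = 0  # load data in the main process only
            elif isinstance(value, dict):
                stack.append(value)  # descend into nested sections
    return patched


# Hypothetical config layout, for illustration only.
example_config = {
    "num_workers": 4,
    "diarizer": {"loader": {"num_workers": 2, "batch_size": 16}},
}

patched_config = disable_workers(example_config)
print(patched_config)
```

Setting the value to 0 trades some loading speed for not having to ship dataset objects to worker processes at all.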

jstoone commented 1 year ago

It's a common gotcha that Jupyter notebooks tend to be bad at multi-threaded work, even though the system might be able to handle multiple workers/threads. I think this might be because of the nature of notebooks themselves, since they're running a single kernel. That last sentence is guesswork, but I've personally had problems with multi-threaded work before.

So I use notebooks for prototyping, and then once I'm happy with the results, I'll extract the code to a standalone script. :v: