MahmoudAshraf97 / whisper-diarization

Automatic Speech Recognition with Speaker Diarization based on OpenAI Whisper
BSD 2-Clause "Simplified" License

RuntimeError: false INTERNAL ASSERT FAILED #105

Status: Closed (Toby1091 closed 9 months ago)

Toby1091 commented 9 months ago

I get the following error with several audio files when running this command:

python diarize.py -a audio_fi.mp3 --whisper-model large-v2

Traceback (most recent call last):
  File "/Users/toby1091/git/audio_transcription/whisper-diarization2/diarize.py", line 130, in <module>
    msdd_model.diarize()
  File "/Users/toby1091/git/audio_transcription/whisper-diarization2/.venv_diarization2/lib/python3.10/site-packages/torch/utils/_contextlib.py", line 115, in decorate_context
    return func(*args, **kwargs)
  File "/Users/toby1091/git/audio_transcription/whisper-diarization2/.venv_diarization2/lib/python3.10/site-packages/nemo/collections/asr/models/msdd_models.py", line 1180, in diarize
    self.clustering_embedding.prepare_cluster_embs_infer()
  File "/Users/toby1091/git/audio_transcription/whisper-diarization2/.venv_diarization2/lib/python3.10/site-packages/nemo/collections/asr/models/msdd_models.py", line 699, in prepare_cluster_embs_infer
    self.emb_sess_test_dict, self.emb_seq_test, self.clus_test_label_dict, _ = self.run_clustering_diarizer(
  File "/Users/toby1091/git/audio_transcription/whisper-diarization2/.venv_diarization2/lib/python3.10/site-packages/nemo/collections/asr/models/msdd_models.py", line 866, in run_clustering_diarizer
    scores = self.clus_diar_model.diarize(batch_size=self.cfg_diar_infer.batch_size)
  File "/Users/toby1091/git/audio_transcription/whisper-diarization2/.venv_diarization2/lib/python3.10/site-packages/nemo/collections/asr/models/clustering_diarizer.py", line 456, in diarize
    all_reference, all_hypothesis = perform_clustering(
  File "/Users/toby1091/git/audio_transcription/whisper-diarization2/.venv_diarization2/lib/python3.10/site-packages/nemo/collections/asr/parts/utils/speaker_utils.py", line 486, in perform_clustering
    cluster_labels = speaker_clustering.forward_infer(
  File "/Users/toby1091/git/audio_transcription/whisper-diarization2/.venv_diarization2/lib/python3.10/site-packages/nemo/collections/asr/parts/utils/offline_clustering.py", line 1326, in forward_infer
    Y = spectral_model.forward(affinity_mat)
  File "/Users/toby1091/git/audio_transcription/whisper-diarization2/.venv_diarization2/lib/python3.10/site-packages/nemo/collections/asr/parts/utils/offline_clustering.py", line 788, in forward
    labels = self.clusterSpectralEmbeddings(X, cuda=self.cuda, device=self.device)
  File "/Users/toby1091/git/audio_transcription/whisper-diarization2/.venv_diarization2/lib/python3.10/site-packages/nemo/collections/asr/parts/utils/offline_clustering.py", line 813, in clusterSpectralEmbeddings
    spectral_emb = self.getSpectralEmbeddings(affinity, n_spks=self.n_clusters, cuda=cuda)
  File "/Users/toby1091/git/audio_transcription/whisper-diarization2/.venv_diarization2/lib/python3.10/site-packages/nemo/collections/asr/parts/utils/offline_clustering.py", line 843, in getSpectralEmbeddings
    _, diffusion_map_ = eigDecompose(laplacian, cuda=cuda, device=affinity_mat.device)
  File "/Users/toby1091/git/audio_transcription/whisper-diarization2/.venv_diarization2/lib/python3.10/site-packages/nemo/collections/asr/parts/utils/offline_clustering.py", line 554, in eigDecompose
    lambdas, diffusion_map = eigh(laplacian)
RuntimeError: false INTERNAL ASSERT FAILED at "/Users/runner/work/pytorch/pytorch/pytorch/aten/src/ATen/native/BatchLinearAlgebra.cpp":1540, please report a bug to PyTorch. linalg.eigh: Argument 8 has illegal value

These PyTorch issues may be relevant for fixing this:
https://github.com/pytorch/pytorch/issues/97656
https://github.com/pytorch/pytorch/issues/83818
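For anyone trying to narrow this down: the traceback ends in the `eigh(laplacian)` call, so one way to investigate is to wrap that decomposition with a few sanity checks. The sketch below is only an illustration, not a confirmed fix. `safe_eigh` is a hypothetical helper name, and it rests on assumptions I have not verified: that the failure is related either to non-finite values in the matrix or to the LAPACK backend PyTorch uses on this machine. It rejects NaN/Inf input, symmetrizes the matrix, runs `torch.linalg.eigh` in float64 on CPU, and falls back to `numpy.linalg.eigh` if PyTorch still raises.

```python
import numpy as np
import torch


def safe_eigh(mat: torch.Tensor):
    """Hypothetical helper: eigendecompose a symmetric matrix with sanity checks.

    Returns (eigenvalues, eigenvectors), with eigenvalues in ascending order.
    """
    # eigh cannot handle NaN/Inf entries, so fail early with a clear message.
    if not torch.isfinite(mat).all():
        raise ValueError("matrix contains NaN or Inf values")

    # Enforce exact symmetry and work in float64 on CPU.
    sym = (0.5 * (mat + mat.T)).to(device="cpu", dtype=torch.float64)

    try:
        vals, vecs = torch.linalg.eigh(sym)
    except RuntimeError:
        # Assumption: NumPy's LAPACK binding may succeed where PyTorch's fails.
        np_vals, np_vecs = np.linalg.eigh(sym.numpy())
        vals, vecs = torch.from_numpy(np_vals), torch.from_numpy(np_vecs)
    return vals, vecs


# Small example: a 2x2 symmetric matrix with eigenvalues 1 and 3.
laplacian = torch.tensor([[2.0, -1.0], [-1.0, 2.0]])
lambdas, diffusion_map = safe_eigh(laplacian)
```

If the `ValueError` fires on a real file, the problem is upstream of the eigendecomposition (for example in the affinity matrix) rather than in PyTorch's LAPACK call.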