MahmoudAshraf97 / whisper-diarization

Automatic Speech Recognition with Speaker Diarization based on OpenAI Whisper
BSD 2-Clause "Simplified" License

RuntimeError: false INTERNAL ASSERT FAILED #116

Closed: Toby1091 closed this issue 11 months ago

Toby1091 commented 11 months ago

While the fix of issue #105 has resolved the issue for all my files up to now, the internal assert error popped up again with the most recent file:

python diarize.py -a audio_fi.mp3 --whisper-model large-v2

** On entry to SSYEVD, parameter number  8 had an illegal value
clustering:   0%|                                                                                                                                                                                                     | 0/1 [00:03<?, ?it/s]
Traceback (most recent call last):
  File "/Users/Toby1091/git/audio_transcription/whisper-diarization3/diarize.py", line 132, in <module>
    msdd_model.diarize()
  File "/Users/Toby1091/git/audio_transcription/whisper-diarization3/.venv_diarization3/lib/python3.10/site-packages/torch/utils/_contextlib.py", line 115, in decorate_context
    return func(*args, **kwargs)
  File "/Users/Toby1091/git/audio_transcription/whisper-diarization3/.venv_diarization3/lib/python3.10/site-packages/nemo/collections/asr/models/msdd_models.py", line 1180, in diarize
    self.clustering_embedding.prepare_cluster_embs_infer()
  File "/Users/Toby1091/git/audio_transcription/whisper-diarization3/.venv_diarization3/lib/python3.10/site-packages/nemo/collections/asr/models/msdd_models.py", line 699, in prepare_cluster_embs_infer
    self.emb_sess_test_dict, self.emb_seq_test, self.clus_test_label_dict, _ = self.run_clustering_diarizer(
  File "/Users/Toby1091/git/audio_transcription/whisper-diarization3/.venv_diarization3/lib/python3.10/site-packages/nemo/collections/asr/models/msdd_models.py", line 866, in run_clustering_diarizer
    scores = self.clus_diar_model.diarize(batch_size=self.cfg_diar_infer.batch_size)
  File "/Users/Toby1091/git/audio_transcription/whisper-diarization3/.venv_diarization3/lib/python3.10/site-packages/nemo/collections/asr/models/clustering_diarizer.py", line 456, in diarize
    all_reference, all_hypothesis = perform_clustering(
  File "/Users/Toby1091/git/audio_transcription/whisper-diarization3/.venv_diarization3/lib/python3.10/site-packages/nemo/collections/asr/parts/utils/speaker_utils.py", line 486, in perform_clustering
    cluster_labels = speaker_clustering.forward_infer(
  File "/Users/Toby1091/git/audio_transcription/whisper-diarization3/.venv_diarization3/lib/python3.10/site-packages/nemo/collections/asr/parts/utils/offline_clustering.py", line 1326, in forward_infer
    Y = spectral_model.forward(affinity_mat)
  File "/Users/Toby1091/git/audio_transcription/whisper-diarization3/.venv_diarization3/lib/python3.10/site-packages/nemo/collections/asr/parts/utils/offline_clustering.py", line 788, in forward
    labels = self.clusterSpectralEmbeddings(X, cuda=self.cuda, device=self.device)
  File "/Users/Toby1091/git/audio_transcription/whisper-diarization3/.venv_diarization3/lib/python3.10/site-packages/nemo/collections/asr/parts/utils/offline_clustering.py", line 813, in clusterSpectralEmbeddings
    spectral_emb = self.getSpectralEmbeddings(affinity, n_spks=self.n_clusters, cuda=cuda)
  File "/Users/Toby1091/git/audio_transcription/whisper-diarization3/.venv_diarization3/lib/python3.10/site-packages/nemo/collections/asr/parts/utils/offline_clustering.py", line 843, in getSpectralEmbeddings
    _, diffusion_map_ = eigDecompose(laplacian, cuda=cuda, device=affinity_mat.device)
  File "/Users/Toby1091/git/audio_transcription/whisper-diarization3/.venv_diarization3/lib/python3.10/site-packages/nemo/collections/asr/parts/utils/offline_clustering.py", line 554, in eigDecompose
    lambdas, diffusion_map = eigh(laplacian)
RuntimeError: false INTERNAL ASSERT FAILED at "/Users/runner/work/pytorch/pytorch/pytorch/aten/src/ATen/native/BatchLinearAlgebra.cpp":1540, please report a bug to PyTorch. linalg.eigh: Argument 8 has illegal value. Most certainly there is a bug in the implementation calling the backend library.

ChristianSch commented 3 months ago

I get the exact same error, but this ticket was closed without any resolution or comment. @MahmoudAshraf97, is there anything you could tell me that would help me get to the bottom of this?

clustering:   0%|                                                                                                                       | 0/1 [00:00<?, ?it/s]** On entry to SSYEVD, parameter number  8 had an illegal value
clustering:   0%|                                                                                                                       | 0/1 [00:00<?, ?it/s]
Traceback (most recent call last):
  File "diarize.py", line 176, in <module>
    msdd_model.diarize()
  File "/opt/homebrew/Caskroom/miniconda/base/envs/whisper-dia/lib/python3.8/site-packages/torch/utils/_contextlib.py", line 115, in decorate_context
    return func(*args, **kwargs)
  File "/opt/homebrew/Caskroom/miniconda/base/envs/whisper-dia/lib/python3.8/site-packages/nemo/collections/asr/models/msdd_models.py", line 1180, in diarize
    self.clustering_embedding.prepare_cluster_embs_infer()
  File "/opt/homebrew/Caskroom/miniconda/base/envs/whisper-dia/lib/python3.8/site-packages/nemo/collections/asr/models/msdd_models.py", line 699, in prepare_cluster_embs_infer
    self.emb_sess_test_dict, self.emb_seq_test, self.clus_test_label_dict, _ = self.run_clustering_diarizer(
  File "/opt/homebrew/Caskroom/miniconda/base/envs/whisper-dia/lib/python3.8/site-packages/nemo/collections/asr/models/msdd_models.py", line 866, in run_clustering_diarizer
    scores = self.clus_diar_model.diarize(batch_size=self.cfg_diar_infer.batch_size)
  File "/opt/homebrew/Caskroom/miniconda/base/envs/whisper-dia/lib/python3.8/site-packages/nemo/collections/asr/models/clustering_diarizer.py", line 456, in diarize
    all_reference, all_hypothesis = perform_clustering(
  File "/opt/homebrew/Caskroom/miniconda/base/envs/whisper-dia/lib/python3.8/site-packages/nemo/collections/asr/parts/utils/speaker_utils.py", line 486, in perform_clustering
    cluster_labels = speaker_clustering.forward_infer(
  File "/opt/homebrew/Caskroom/miniconda/base/envs/whisper-dia/lib/python3.8/site-packages/nemo/collections/asr/parts/utils/offline_clustering.py", line 1326, in forward_infer
    Y = spectral_model.forward(affinity_mat)
  File "/opt/homebrew/Caskroom/miniconda/base/envs/whisper-dia/lib/python3.8/site-packages/nemo/collections/asr/parts/utils/offline_clustering.py", line 788, in forward
    labels = self.clusterSpectralEmbeddings(X, cuda=self.cuda, device=self.device)
  File "/opt/homebrew/Caskroom/miniconda/base/envs/whisper-dia/lib/python3.8/site-packages/nemo/collections/asr/parts/utils/offline_clustering.py", line 813, in clusterSpectralEmbeddings
    spectral_emb = self.getSpectralEmbeddings(affinity, n_spks=self.n_clusters, cuda=cuda)
  File "/opt/homebrew/Caskroom/miniconda/base/envs/whisper-dia/lib/python3.8/site-packages/nemo/collections/asr/parts/utils/offline_clustering.py", line 843, in getSpectralEmbeddings
    _, diffusion_map_ = eigDecompose(laplacian, cuda=cuda, device=affinity_mat.device)
  File "/opt/homebrew/Caskroom/miniconda/base/envs/whisper-dia/lib/python3.8/site-packages/nemo/collections/asr/parts/utils/offline_clustering.py", line 554, in eigDecompose
    lambdas, diffusion_map = eigh(laplacian)
RuntimeError: false INTERNAL ASSERT FAILED at "/Users/runner/work/pytorch/pytorch/pytorch/aten/src/ATen/native/BatchLinearAlgebra.cpp":1540, please report a bug to PyTorch. linalg.eigh: Argument 8 has illegal value. Most certainly there is a bug in the implementation calling the backend library.

MahmoudAshraf97 commented 3 months ago

Hey, as the error message says, please report this to NeMo or PyTorch. Unfortunately, that's not something I can fix in my code.

ChristianSch commented 3 months ago

OK, thanks! There seems to be a bug report from 2021 over here: https://github.com/pytorch/pytorch/issues/68291
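
For context on the `SSYEVD` line in the log: in LAPACK's documented signature, `SSYEVD(JOBZ, UPLO, N, A, LDA, W, WORK, LWORK, IWORK, LIWORK, INFO)`, parameter 8 is `LWORK`, the size of the float workspace. When eigenvectors are requested, the documented minimum is `1 + 6*N + 2*N**2`. One possible explanation, not confirmed anywhere in this thread, is that the workspace size is rounded down when it passes through a single-precision float for a large matrix. The sketch below only illustrates that arithmetic. The helper names are made up for illustration, and none of this is PyTorch or NeMo code.

```python
import struct


def min_lwork_ssyevd(n: int) -> int:
    """Documented minimum LWORK for SSYEVD with JOBZ='V' (eigenvectors wanted)."""
    return 1 if n <= 1 else 1 + 6 * n + 2 * n * n


def as_float32(value: int) -> int:
    """Round-trip an integer through IEEE-754 single precision."""
    return int(struct.unpack("f", struct.pack("f", float(value)))[0])


def workspace_survives_float32(n: int) -> bool:
    """True if the minimum LWORK is not rounded below itself in float32.

    Assumption (hypothetical failure mode): the workspace size is carried
    in a float32 value somewhere before being handed back as an integer.
    """
    need = min_lwork_ssyevd(n)
    return as_float32(need) >= need


if __name__ == "__main__":
    for n in (100, 1000, 5000, 10000):
        print(n, min_lwork_ssyevd(n), workspace_survives_float32(n))
```

If something like this is what happens, it would fit with the error showing up only for some files, since the matrix size depends on the input. That remains a guess until someone inspects the matrix passed to `eigh`.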

Toby1091 commented 3 months ago

I wasn't able to fix it, but a reliable workaround has been to split the audio files into 10-minute chunks and then paste the results back together. Hope this helps.
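
A minimal sketch of the bookkeeping behind this workaround, assuming each chunk is processed separately and yields segments with `start`, `end`, and `text` fields measured from the start of that chunk. The function names and the segment layout are hypothetical and not part of diarize.py. Cutting the audio itself is left to whatever tool you prefer.

```python
from typing import Dict, List, Tuple

Segment = Dict[str, object]


def chunk_bounds(total_seconds: float, chunk_seconds: float = 600.0) -> List[Tuple[float, float]]:
    """Return (start, end) times in seconds for consecutive chunks (default 10 min)."""
    if total_seconds <= 0 or chunk_seconds <= 0:
        raise ValueError("durations must be positive")
    bounds = []
    start = 0.0
    while start < total_seconds:
        end = min(start + chunk_seconds, float(total_seconds))
        bounds.append((start, end))
        start = end
    return bounds


def merge_segments(per_chunk: List[List[Segment]], bounds: List[Tuple[float, float]]) -> List[Segment]:
    """Shift each chunk's segment times by the chunk's start offset and concatenate."""
    merged = []
    for (offset, _end), segments in zip(bounds, per_chunk):
        for seg in segments:
            shifted = dict(seg)  # copy so the caller's data is not modified
            shifted["start"] = seg["start"] + offset
            shifted["end"] = seg["end"] + offset
            merged.append(shifted)
    return merged


if __name__ == "__main__":
    bounds = chunk_bounds(1500)  # a 25-minute file
    results = [
        [{"start": 1.0, "end": 2.0, "text": "first chunk"}],
        [{"start": 0.5, "end": 1.5, "text": "second chunk"}],
        [],
    ]
    for seg in merge_segments(results, bounds):
        print(seg)
```

One caveat: when chunks are diarized independently, speaker labels are assigned per run, so "Speaker 0" in one chunk is not guaranteed to be the same person as "Speaker 0" in the next. The merged output may need relabeling by hand.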

ChristianSch commented 2 months ago

FYI: I switched to whisperx and it works perfectly

Hoohm commented 2 months ago

> FYI: I switched to whisperx and it works perfectly

Could you explain in a bit more detail what exactly you switched to whisperx? I have the same issue, running macOS 14.5 on an M1 (Apple silicon) machine.

Thank you

Edit: Or did you mean you just use whisperx instead of this project?

ChristianSch commented 2 months ago

@Hoohm yup, I switched to whisperx completely instead of this project.