tomchang25 / whisper-auto-transcribe

Auto transcribe tool based on whisper
MIT License

Error during processing (ValueError: Expected parameter logits) #49

Open amerodeh opened 1 year ago

amerodeh commented 1 year ago

Hi, I was running a translation and it crashed partway through. The command I ran was:

python C:\Users\san-a\Downloads\tools\whisper-auto-transcribe-0.3.2b2\cli.py "D:\j\X.mp4" --output "D:\j\X.srt" -lang ja --task translate --model-size large --device cuda

The only change I've made to the repo is setting vocal_extracter=False in task.py, because it wouldn't start otherwise.

Stack trace:

43%|██████████████████████████████▋ | 2698.92/6231.83 [04:35<06:00, 9.79sec/s]
Traceback (most recent call last):
  File "C:\Users\san-a\Downloads\tools\whisper-auto-transcribe-0.3.2b2\cli.py", line 139, in <module>
    cli()
  File "C:\Users\san-a\Downloads\tools\whisper-auto-transcribe-0.3.2b2\cli.py", line 121, in cli
    subtitle_path = transcribe(
  File "C:\Users\san-a\Downloads\tools\whisper-auto-transcribe-0.3.2b2\src\utils\task.py", line 156, in transcribe
    result = used_model.transcribe(
  File "C:\Users\san-a\Downloads\tools\whisper-auto-transcribe-0.3.2b2\venv\lib\site-packages\stable_whisper\whisper_word_level.py", line 453, in transcribe_stable
    result: DecodingResult = decode_with_fallback(mel_segment, ts_token_mask=ts_token_mask)
  File "C:\Users\san-a\Downloads\tools\whisper-auto-transcribe-0.3.2b2\venv\lib\site-packages\stable_whisper\whisper_word_level.py", line 337, in decode_with_fallback
    decode_result, audio_features = model.decode(seg,
  File "C:\Users\san-a\Downloads\tools\whisper-auto-transcribe-0.3.2b2\venv\lib\site-packages\torch\autograd\grad_mode.py", line 27, in decorate_context
    return func(*args, **kwargs)
  File "C:\Users\san-a\Downloads\tools\whisper-auto-transcribe-0.3.2b2\venv\lib\site-packages\stable_whisper\decode.py", line 112, in decode_stable
    result = task.run(mel)
  File "C:\Users\san-a\Downloads\tools\whisper-auto-transcribe-0.3.2b2\venv\lib\site-packages\torch\autograd\grad_mode.py", line 27, in decorate_context
    return func(*args, **kwargs)
  File "C:\Users\san-a\Downloads\tools\whisper-auto-transcribe-0.3.2b2\venv\lib\site-packages\whisper\decoding.py", line 729, in run
    tokens, sum_logprobs, no_speech_probs = self._main_loop(audio_features, tokens)
  File "C:\Users\san-a\Downloads\tools\whisper-auto-transcribe-0.3.2b2\venv\lib\site-packages\stable_whisper\decode.py", line 61, in _main_loop
    tokens, completed = self.decoder.update(tokens, logits, sum_logprobs)
  File "C:\Users\san-a\Downloads\tools\whisper-auto-transcribe-0.3.2b2\venv\lib\site-packages\whisper\decoding.py", line 276, in update
    next_tokens = Categorical(logits=logits / self.temperature).sample()
  File "C:\Users\san-a\Downloads\tools\whisper-auto-transcribe-0.3.2b2\venv\lib\site-packages\torch\distributions\categorical.py", line 64, in __init__
    super(Categorical, self).__init__(batch_shape, validate_args=validate_args)
  File "C:\Users\san-a\Downloads\tools\whisper-auto-transcribe-0.3.2b2\venv\lib\site-packages\torch\distributions\distribution.py", line 55, in __init__
    raise ValueError(
ValueError: Expected parameter logits (Tensor of shape (1, 51865)) of distribution Categorical(logits: torch.Size([1, 51865])) to satisfy the constraint IndependentConstraint(Real(), 1), but found invalid values: tensor([[nan, nan, nan, ..., nan, nan, nan]], device='cuda:0')
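For context, the ValueError at the end is just torch's argument validation rejecting NaN logits coming out of the decoder. The same message can be reproduced outside whisper entirely; the shape below matches the traceback, and the temperature value is purely illustrative:

import torch
from torch.distributions import Categorical

# NaN logits with the same shape as in the traceback (51865 is whisper's multilingual vocab size)
logits = torch.full((1, 51865), float("nan"))
temperature = 1.0  # illustrative value only; whisper retries at several temperatures on fallback

try:
    # Default argument validation rejects non-real (NaN) logits, as in the traceback above
    Categorical(logits=logits / temperature).sample()
except ValueError as err:
    print(err)  # "Expected parameter logits ... but found invalid values: tensor([[nan, ...]])"

So the real question is why the decoder starts producing NaNs partway through the file, not the Categorical call itself.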

amerodeh commented 1 year ago

Tried rerunning; this time it crashed at 7% instead of 43%. I'm still using version 0.2.0 day to day, as it's been the most stable release for me so far. I want to move to the newer versions, but random issues like the one above are holding me back.

Edit: The error didn't happen when using whisper_timestamps instead of stable_whisper. The audio I'm converting is about 2 hours long, so it looks like #41 isn't solved yet?
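For comparison, this is roughly what the whisper_timestamps path boils down to when the whisper_timestamped library is called directly. It's only a sketch of that library's documented usage, not the exact call whisper-auto-transcribe makes internally, and it assumes the library forwards the usual language/task options to whisper:

import whisper_timestamped

# Sketch only: paths and model size mirror the CLI command above
model = whisper_timestamped.load_model("large", device="cuda")
audio = whisper_timestamped.load_audio("D:/j/X.mp4")  # ffmpeg extracts the audio track

# Assumes transcribe() accepts the same language/task options as openai-whisper
result = whisper_timestamped.transcribe(model, audio, language="ja", task="translate")
print(result["text"])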

nitesh31mishra commented 1 year ago

Can you try with another video file and check whether you get the same error?