Dadangdut33 / Speech-Translate

A real-time speech transcription and translation application using OpenAI Whisper and free translation APIs. The interface is built with Tkinter, and the code is written entirely in Python.
MIT License
454 stars 56 forks

Speech Translate not producing the four output files. #83

Open Gingseng17 opened 2 months ago

Gingseng17 commented 2 months ago

I ran the transcription process as usual in Speech Translate, but it failed to produce the four output files (.csv, .srt, .tsv, .txt). I tested it multiple times and still no files were created. I looked at the log file, but since I am not a programmer/developer, I am not sure how to rectify the problem. Below is the log/error info. I am hoping someone will be able to help me out. Thanks in advance.

2024-07-09 20:21:24.508 | INFO | main.py:2152 [MainThread] - App Version: 1.3.10 - TIME: 2024-07-09 20:21:24
2024-07-09 20:21:24.508 | INFO | main.py:2153 [MainThread] - OS: Windows 10 10.0.19045 | CPU: Intel64 Family 6 Model 60 Stepping 3, GenuineIntel
2024-07-09 20:21:24.523 | DEBUG | main.py:2154 [MainThread] - Sys args: ['C:\Users\lekwan\AppData\Local\Programs\Speech Translate CPU\SpeechTranslate.exe']
2024-07-09 20:21:24.523 | DEBUG | main.py:2155 [MainThread] - Loading UI...
2024-07-09 20:21:25.212 | INFO | main.py:208 [MainThread] - Tray created successfully
2024-07-09 20:21:27.953 | DEBUG | main.py:281 [MainThread] - Available Theme to use: ['vista', 'sun-valley-light', 'sun-valley-dark', 'winnative', 'clam', 'alt', 'default', 'classic', 'xpnative']
2024-07-09 20:21:27.953 | DEBUG | style.py:32 [MainThread] - Setting theme: sun-valley-light
2024-07-09 20:21:27.984 | DEBUG | style.py:60 [MainThread] - Setting custom light theme style
2024-07-09 20:21:33.532 | DEBUG | main.py:713 [Thread-7 (check_ffmpeg)] - Checking ffmpeg...
2024-07-09 20:21:33.689 | DEBUG | main.py:715 [Thread-7 (check_ffmpeg)] - Checking ffmpeg done
2024-07-09 20:21:55.120 | INFO | main.py:299 [Thread-3 (cuda_check)] - GPU: No GPU detected | CUDA: CUDA is not available! Using CPU instead
2024-07-09 20:22:09.870 | INFO | about.py:123 [MainThread] - Checking for update on start
2024-07-09 20:22:10.370 | INFO | about.py:150 [Thread-18 (req_update_check)] - Checking for update...
2024-07-09 20:22:11.668 | INFO | about.py:167 [Thread-18 (req_update_check)] - No update available
2024-07-09 20:25:54.921 | INFO | helper.py:37 [MainThread] - Checking model name
2024-07-09 20:25:54.921 | DEBUG | helper.py:38 [MainThread] - modelKey: 🌀 Medium [5GB VRAM] (Accurate), src_english: True
2024-07-09 20:25:54.921 | DEBUG | helper.py:43 [MainThread] - modelName: medium.en
2024-07-09 20:25:54.921 | DEBUG | main.py:1539 [MainThread] - Running disabler...
2024-07-09 20:26:02.409 | DEBUG | download.py:110 [MainThread] - Connecting to huggingface server to verify model
2024-07-09 20:26:03.253 | DEBUG | main.py:1617 [MainThread] - Running enabler...
2024-07-09 20:26:12.643 | INFO | file.py:624 [Thread-19 (process_file)] - Start Process (FILE)
2024-07-09 20:26:12.643 | DEBUG | load.py:366 [Thread-19 (process_file)] - Mode load args get: {'device': 'cpu', 'success': True}
2024-07-09 20:26:12.643 | DEBUG | load.py:489 [Thread-19 (process_file)] - Loading model for transcribe using faster-whisper
2024-07-09 20:26:38.187 | DEBUG | load.py:539 [Thread-19 (process_file)] - Model loaded | Is Faster Whisper: True | Load Status:
2024-07-09 20:26:38.296 | DEBUG | load.py:540 [Thread-19 (process_file)] - TC: Set
2024-07-09 20:26:38.296 | DEBUG | load.py:541 [Thread-19 (process_file)] - TL: Not Set
2024-07-09 20:26:38.312 | DEBUG | load.py:542 [Thread-19 (process_file)] - func_tc: Set
2024-07-09 20:26:38.312 | DEBUG | load.py:543 [Thread-19 (process_file)] - func_tl: Not Set
2024-07-09 20:26:38.312 | DEBUG | load.py:439 [Thread-19 (process_file)] - Pass kwarg:
2024-07-09 20:26:38.312 | DEBUG | load.py:440 [Thread-19 (process_file)] - {'temperature': (0.0, 0.2, 0.4, 0.6, 0.8, 1.0), 'best_of': 3, 'beam_size': 3, 'patience': 1.0, 'compression_ratio_threshold': 2.4, 'logprob_threshold': -1.0, 'no_speech_threshold': 0.72, 'suppress_tokens': None, 'suppress_blank': True, 'initial_prompt': None, 'prefix': None, 'condition_on_previous_text': True, 'max_initial_timestamp': 1.0, 'fp16': True}
2024-07-09 20:26:38.801 | DEBUG | load.py:366 [Thread-19 (process_file)] - Mode transcribe args get: {'word_timestamps': True, 'regroup': True, 'suppress_silence': True, 'suppress_word_ts': True, 'q_levels': 20, 'k_size': 5, 'demucs': False, 'demucs_output': None, 'demucs_options': None, 'vad': False, 'vad_threshold': 0.35, 'vad_onnx': False, 'min_word_dur': 0.1, 'only_voice_freq': False, 'beam_size': 3, 'best_of': 3, 'patience': 1.0, 'no_speech_threshold': 0.72, 'compression_ratio_threshold': 2.4, 'condition_on_previous_text': True, 'initial_prompt': None, 'prefix': None, 'suppress_blank': True, 'suppress_tokens': None, 'max_initial_timestamp': 1.0, 'threads': 0, 'success': True}
2024-07-09 20:26:38.817 | DEBUG | language.py:269 [Thread-19 (process_file)] - GETTING WHISPER LANGUAGE FROM SIMILAR LANGUAGE NAME
2024-07-09 20:26:38.817 | DEBUG | language.py:274 [Thread-19 (process_file)] - Found key english while searching for english
2024-07-09 20:26:38.817 | DEBUG | language.py:275 [Thread-19 (process_file)] - FULL KEY GET ['english']
2024-07-09 20:26:39.297 | INFO | file.py:652 [Thread-19 (process_file)] - Model Args: {'device': 'cpu', 'download_root': 'C:\Users\lekwan\.cache\whisper'}
2024-07-09 20:26:39.313 | INFO | file.py:653 [Thread-19 (process_file)] - Process Args: {'word_timestamps': True, 'regroup': True, 'suppress_silence': True, 'suppress_word_ts': True, 'q_levels': 20, 'k_size': 5, 'demucs': False, 'demucs_output': None, 'demucs_options': None, 'vad': False, 'vad_threshold': 0.35, 'vad_onnx': False, 'min_word_dur': 0.1, 'only_voice_freq': False, 'beam_size': 3, 'best_of': 3, 'patience': 1.0, 'no_speech_threshold': 0.72, 'compression_ratio_threshold': 2.4, 'condition_on_previous_text': True, 'initial_prompt': None, 'prefix': None, 'suppress_blank': True, 'suppress_tokens': None, 'max_initial_timestamp': 1.0, 'language': 'en'}
2024-07-09 20:26:41.960 | DEBUG | file.py:802 [Thread-19 (process_file)] - FILE PROCESSING: C:/Users/lekwan/Documents/Lenna/2.0 Canada/QRomana/Transcription&Subtitling/Testing/Test Speech Translate/2024-07-01 Test Why No Output/Fake lethal injections or leave military.m4a
2024-07-09 20:26:41.960 | DEBUG | file.py:810 [Thread-19 (process_file)] - Save_name: 2024-07-09 960752 Fake lethal injections or leave military/{task-lang}
2024-07-09 20:26:42.117 | DEBUG | file.py:862 [Thread-19 (process_file)] - saved metadata
2024-07-09 20:26:42.163 | INFO | file.py:294 [Thread-23 (cancellable_tc)] - --------------------------------------------------
2024-07-09 20:26:42.195 | INFO | file.py:295 [Thread-23 (cancellable_tc)] - Transcribing
2024-07-09 20:26:42.195 | DEBUG | file.py:296 [Thread-23 (cancellable_tc)] - Source Language: english
2024-07-09 20:26:42.273 | INFO | _logging.py:55 [Thread-24 (run_whisper)] - Running Whisper transcribe...
2024-07-09 20:26:52.844 | ERROR | _logging.py:62 [Thread-24 (run_whisper)] - Detected Language: english
2024-07-09 20:26:52.844 | ERROR | _logging.py:62 [Thread-24 (run_whisper)] - Transcribing with faster-whisper (medium.en)...
2024-07-09 20:26:52.897 | INFO | _logging.py:55 [Thread-24 (run_whisper)] - Transcribe: 0% | ## | 0/102.77 [00:00<?, ?sec/s]
2024-07-09 20:27:51.611 | INFO | _logging.py:55 [Thread-24 (run_whisper)] - Transcribe: 11% | ### | 10.96/102.77 [00:58<08:09, 5.33s/sec]
2024-07-09 20:27:51.747 | INFO | _logging.py:55 [Thread-24 (run_whisper)] - Transcribe: 11% | ### | 11.74/102.77 [00:58<07:25, 4.89s/sec]
2024-07-09 20:28:33.634 | INFO | _logging.py:55 [Thread-24 (run_whisper)] - Transcribe: 29% | ### | 29.44/102.77 [01:40<03:39, 3.00s/sec]
2024-07-09 20:29:08.823 | INFO | _logging.py:55 [Thread-24 (run_whisper)] - Transcribe: 56% | ### | 57.62/102.77 [02:15<01:25, 1.90s/sec]
2024-07-09 20:29:38.974 | INFO | _logging.py:55 [Thread-24 (run_whisper)] - Transcribe: 83% | ### | 85.1/102.77 [02:46<00:26, 1.52s/sec]
2024-07-09 20:29:39.068 | INFO | _logging.py:55 [Thread-24 (run_whisper)] - Transcribe: 100% | #### | 102.77/102.77 [02:46<00:00, 1.62s/sec]
2024-07-09 20:29:58.056 | INFO | _logging.py:55 [Thread-24 (run_whisper)] - Whisper transcribe done
2024-07-09 20:29:58.389 | DEBUG | language.py:269 [Thread-23 (cancellable_tc)] - GETTING WHISPER LANGUAGE FROM SIMILAR LANGUAGE NAME
2024-07-09 20:29:58.421 | DEBUG | language.py:274 [Thread-23 (cancellable_tc)] - Found key english while searching for english
2024-07-09 20:29:58.421 | DEBUG | language.py:275 [Thread-23 (cancellable_tc)] - FULL KEY GET ['english']
2024-07-09 20:29:58.483 | ERROR | file.py:391 [Thread-23 (cancellable_tc)] - can only concatenate str (not "int") to str
Traceback (most recent call last):
  File "D:\Codes_Projects\Python\Speech-Translate\speech_translate\utils\audio\file.py", line 335, in cancellable_tc
  File "D:\Codes_Projects\Python\Speech-Translate.venvcpu\Lib\site-packages\stable_whisper\result.py", line 1632, in remove_repetition
TypeError: can only concatenate str (not "int") to str
2024-07-09 20:30:16.841 | INFO | file.py:908 [Thread-19 (process_file)] - End process (FILE) [Total time: 217.54s]
2024-07-09 20:32:02.524 | INFO | main.py:1842 [Thread-19 (process_file)] - Stopping file import processing...
2024-07-09 20:32:02.790 | INFO | main.py:1861 [Thread-19 (process_file)] - Stopped
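
For readers unfamiliar with the message in the traceback: Python raises this TypeError whenever code tries to join a string and an integer with `+`. The traceback does not show the failing line inside stable_whisper's `remove_repetition`, so the sketch below is only a generic illustration of the error class, not the library's code. The function names `join_naive` and `join_safe` and the sample values are hypothetical.

```python
def join_naive(prefix, value):
    # Hypothetical helper: fails when prefix is a str and value is an int,
    # because "+" does not convert between the two types automatically.
    return prefix + value


def join_safe(prefix, value):
    # Hypothetical helper: converting explicitly with str() avoids the
    # mixed-type concatenation.
    return prefix + str(value)


try:
    join_naive("segment ", 3)
except TypeError as exc:
    # Same wording as the error in the log above (Python 3.7 or newer).
    message = str(exc)
    print(message)

print(join_safe("segment ", 3))
```

Since the error is raised inside an installed dependency rather than in anything the user typed, it most likely needs a fix in the application or library rather than a change on the user's side. That is an inference from the traceback alone, not something the log confirms.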