SYSTRAN / faster-whisper

Faster Whisper transcription with CTranslate2
MIT License

How to create a Gradio UI with faster-whisper #957

Open · kustcl opened this issue 1 month ago

kustcl commented 1 month ago

When I try to do this, I get the following traceback:

```
Traceback (most recent call last):
  File "D:\anaconda\Lib\site-packages\gradio\routes.py", line 534, in predict
    output = await route_utils.call_process_api(
  File "D:\anaconda\Lib\site-packages\gradio\route_utils.py", line 226, in call_process_api
    output = await app.get_blocks().process_api(
  File "D:\anaconda\Lib\site-packages\gradio\blocks.py", line 1550, in process_api
    result = await self.call_function(
  File "D:\anaconda\Lib\site-packages\gradio\blocks.py", line 1185, in call_function
    prediction = await anyio.to_thread.run_sync(
  File "D:\anaconda\Lib\site-packages\anyio\to_thread.py", line 56, in run_sync
    return await get_async_backend().run_sync_in_worker_thread(
  File "D:\anaconda\Lib\site-packages\anyio\_backends\_asyncio.py", line 2134, in run_sync_in_worker_thread
    return await future
  File "D:\anaconda\Lib\site-packages\anyio\_backends\_asyncio.py", line 851, in run
    result = context.run(func, *args)
  File "D:\anaconda\Lib\site-packages\gradio\utils.py", line 661, in wrapper
    response = f(*args, **kwargs)
  File "D:\python\python代码\faster-whisper\faster-whisper-master\chatbot (2).py", line 17, in transcribe
    segments, _ = model.transcribe(y, beam_size=5)
  File "D:\python\python代码\faster-whisper\faster-whisper-master\faster_whisper\transcribe.py", line 838, in transcribe
    audio = decode_audio(audio, sampling_rate=sampling_rate)
  File "D:\python\python代码\faster-whisper\faster-whisper-master\faster_whisper\audio.py", line 26, in decode_audio
    waveform, audio_sf = torchaudio.load(input_file)  # waveform: channels X T
  File "D:\anaconda\Lib\site-packages\torchaudio\_backend\utils.py", line 205, in load
    return backend.load(uri, frame_offset, num_frames, normalize, channels_first, format, buffer_size)
  File "D:\anaconda\Lib\site-packages\torchaudio\_backend\soundfile.py", line 27, in load
    return soundfile_backend.load(uri, frame_offset, num_frames, normalize, channels_first, format)
  File "D:\anaconda\Lib\site-packages\torchaudio\_backend\soundfile_backend.py", line 221, in load
    with soundfile.SoundFile(filepath, "r") as file:
  File "D:\anaconda\Lib\site-packages\soundfile.py", line 658, in __init__
    self._file = self._open(file, mode_int, closefd)
  File "D:\anaconda\Lib\site-packages\soundfile.py", line 1212, in _open
    raise TypeError("Invalid file: {0!r}".format(self.name))
TypeError: Invalid file: (48000, array([     0,      0,      0, ..., -43886, -56208, -60574], dtype=int32))
```
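For context, the final TypeError shows that model.transcribe received the (sample_rate, int32 array) tuple that gr.Audio returns with its default type="numpy", rather than a file path or a decoded waveform. The handler itself is not shown in the issue, but the traceback is consistent with a pattern like the following minimal sketch (the model size and Interface wiring are guesses, not the poster's actual code):

```python
import gradio as gr
from faster_whisper import WhisperModel

model = WhisperModel("base")  # model size chosen arbitrarily for illustration

def transcribe(y):
    # With gr.Audio()'s default type="numpy", Gradio passes a
    # (sample_rate, numpy.ndarray) tuple here, e.g. (48000, int32 array).
    # faster-whisper forwards that tuple to its audio decoder, which cannot
    # open it as a file, hence "TypeError: Invalid file: (48000, ...)".
    segments, _ = model.transcribe(y, beam_size=5)
    return " ".join(segment.text for segment in segments)

demo = gr.Interface(fn=transcribe, inputs=gr.Audio(), outputs="text")
demo.launch()
```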

dodysw commented 3 weeks ago

Maybe configure the Gradio audio component to return a file path, like this, and pass that path as the audio argument to model.transcribe?

gr.Audio(type="filepath")
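For what it's worth, a minimal working sketch along those lines (the model size, beam size, and Interface layout are illustrative choices, not taken from the issue):

```python
import gradio as gr
from faster_whisper import WhisperModel

model = WhisperModel("base")  # any model size works here

def transcribe(audio_path):
    # With type="filepath", Gradio writes the uploaded or recorded audio to a
    # temporary file and passes its path, which model.transcribe accepts and
    # decodes itself (including resampling to 16 kHz).
    segments, _ = model.transcribe(audio_path, beam_size=5)
    return " ".join(segment.text for segment in segments)

demo = gr.Interface(
    fn=transcribe,
    inputs=gr.Audio(type="filepath"),
    outputs="text",
)
demo.launch()
```

The alternative is to keep type="numpy" and convert the (sample_rate, int array) tuple yourself (float32, mono, 16 kHz) before calling transcribe, but letting faster-whisper decode the file is the simpler route.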