Closed: kamineko16 closed this issue 1 month ago
Seems like a failure with the Whisper ASR pipeline. Have you tried manually entering the reference text for the prompt audio?
Tried now, errors:
"
Traceback (most recent call last):
  File "C:\Users{username}\anaconda3\envs\f5\lib\site-packages\gradio\queueing.py", line 536, in process_events
    response = await route_utils.call_process_api(
  File "C:\Users{username}\anaconda3\envs\f5\lib\site-packages\gradio\route_utils.py", line 322, in call_process_api
    output = await app.get_blocks().process_api(
  File "C:\Users{username}\anaconda3\envs\f5\lib\site-packages\gradio\blocks.py", line 1935, in process_api
    result = await self.call_function(
  File "C:\Users{username}\anaconda3\envs\f5\lib\site-packages\gradio\blocks.py", line 1520, in call_function
    prediction = await anyio.to_thread.run_sync(  # type: ignore
  File "C:\Users{username}\anaconda3\envs\f5\lib\site-packages\anyio\to_thread.py", line 56, in run_sync
    return await get_async_backend().run_sync_in_worker_thread(
  File "C:\Users{username}\anaconda3\envs\f5\lib\site-packages\anyio\_backends\_asyncio.py", line 2441, in run_sync_in_worker_thread
    return await future
  File "C:\Users{username}\anaconda3\envs\f5\lib\site-packages\anyio\_backends\_asyncio.py", line 943, in run
    result = context.run(func, *args)
  File "C:\Users{username}\anaconda3\envs\f5\lib\site-packages\gradio\utils.py", line 826, in wrapper
    response = f(*args, **kwargs)
  File "E:\F5-TTS\gradio_app.py", line 66, in infer
    final_wave, final_sample_rate, combined_spectrogram = infer_process(
  File "E:\F5-TTS\model\utils_infer.py", line 214, in infer_process
    return infer_batch_process(
  File "E:\F5-TTS\model\utils_infer.py", line 307, in infer_batch_process
    final_wave = generated_waves[0]
IndexError: list index out of range
"
I tried using whisper manually, and it works fine. It gave me the text of "country.flac" with 100% accuracy.
Hi @kamineko16 some possible solutions:
- try a lower gradio version if >=5, e.g. pip install gradio==4.44.1
- check the audio format, it should be pcm_s16le; e.g. it will have visuals like this if successfully uploaded [screenshot]
- maybe force re-pull the repo?
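To verify the pcm_s16le point locally, a quick stdlib check is possible: Python's `wave` module reports the bytes per sample, and 2 bytes means 16-bit little-endian PCM. This is a minimal sketch (the file name `probe.wav` is just an example), not part of the F5-TTS code:

```python
import struct
import wave

def is_pcm_s16le(path):
    """Return True if the WAV file holds 16-bit PCM samples (pcm_s16le)."""
    with wave.open(path, "rb") as wf:
        # sampwidth is bytes per sample; 2 bytes == 16-bit little-endian PCM
        return wf.getsampwidth() == 2

# Write a tiny 16-bit mono WAV so the check has something to probe.
with wave.open("probe.wav", "wb") as wf:
    wf.setnchannels(1)
    wf.setsampwidth(2)      # 2 bytes per sample -> pcm_s16le
    wf.setframerate(24000)  # sample rate is arbitrary for this check
    wf.writeframes(struct.pack("<4h", 0, 1000, -1000, 0))

print(is_pcm_s16le("probe.wav"))  # True
```

Note this only detects the sample format of WAV files; a FLAC input would need to be converted (e.g. with ffmpeg) first.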
Hi @SWivid , answers to your suggestions:
It is already 4.44.1, so I guess it was installed as 4.44.1 from the requirements file.
I used the sample from the F5-TTS sample folder, so I guess it is in the right format? However, I tried what you said anyway. Here are pictures of both. Original FLAC file:
Same audio, but as a WAV file with pcm_s16le:
Errors:
"
C:\Users{username}\anaconda3\envs\f5\lib\site-packages\transformers\models\whisper\generation_whisper.py:496: FutureWarning: The input name `inputs` is deprecated. Please make sure to use `input_features` instead.
  warnings.warn(
You have passed task=transcribe, but also have set `forced_decoder_ids` to [[1, None], [2, 50359]] which creates a conflict. `forced_decoder_ids` will be ignored in favor of task=transcribe.
C:\Users{username}\anaconda3\envs\f5\lib\site-packages\transformers\models\whisper\modeling_whisper.py:599: UserWarning: 1Torch was not compiled with flash attention. (Triggered internally at ..\aten\src\ATen\native\transformers\cuda\sdp_utils.cpp:455.)
  attn_output = torch.nn.functional.scaled_dot_product_attention(
Passing a tuple of `past_key_values` is deprecated and will be removed in Transformers v4.43.0. You should pass an instance of `EncoderDecoderCache` instead, e.g. `past_key_values=EncoderDecoderCache.from_legacy_cache(past_key_values)`.
The attention mask is not set and cannot be inferred from input because pad token is same as eos token. As a consequence, you may observe unexpected behavior. Please pass your input's `attention_mask` to obtain reliable results.
Traceback (most recent call last):
  File "C:\Users{username}\anaconda3\envs\f5\lib\site-packages\gradio\queueing.py", line 536, in process_events
    response = await route_utils.call_process_api(
  File "C:\Users{username}\anaconda3\envs\f5\lib\site-packages\gradio\route_utils.py", line 322, in call_process_api
    output = await app.get_blocks().process_api(
  File "C:\Users{username}\anaconda3\envs\f5\lib\site-packages\gradio\blocks.py", line 1935, in process_api
    result = await self.call_function(
  File "C:\Users{username}\anaconda3\envs\f5\lib\site-packages\gradio\blocks.py", line 1520, in call_function
    prediction = await anyio.to_thread.run_sync(  # type: ignore
  File "C:\Users{username}\anaconda3\envs\f5\lib\site-packages\anyio\to_thread.py", line 56, in run_sync
    return await get_async_backend().run_sync_in_worker_thread(
  File "C:\Users{username}\anaconda3\envs\f5\lib\site-packages\anyio\_backends\_asyncio.py", line 2441, in run_sync_in_worker_thread
    return await future
  File "C:\Users{username}\anaconda3\envs\f5\lib\site-packages\anyio\_backends\_asyncio.py", line 943, in run
    result = context.run(func, *args)
  File "C:\Users{username}\anaconda3\envs\f5\lib\site-packages\gradio\utils.py", line 826, in wrapper
    response = f(*args, **kwargs)
  File "E:\F5-TTS\gradio_app.py", line 66, in infer
    final_wave, final_sample_rate, combined_spectrogram = infer_process(
  File "E:\F5-TTS\model\utils_infer.py", line 214, in infer_process
    return infer_batch_process(
  File "E:\F5-TTS\model\utils_infer.py", line 307, in infer_batch_process
    final_wave = generated_waves[0]
IndexError: list index out of range
"
@kamineko16 it's very weird, we haven't met this before; seems you're the first one encountering this, sadge
Will inference-cli.py work from your side?
Hi @SWivid
@SWivid huh, weird indeed. It is indeed working with inference-cli.py, so now I know everything is basically fine and the issue is with gradio_app.py? Should I close this case, or do you want to check it more to understand why it's not working specifically with gradio_app.py?
Hi @kamineko16 , I have no idea what's going wrong, as I could not reproduce the error. We could just leave it open for some days and see if someone else encounters this and is able to fix it.
No, the problem is with whisper or the transformers pipeline, not gradio_app.py. I have not tested on gradio, but my application produces the same warnings, though it does generate output.
I managed to make it work with the gradio_app interface by asking GPT to merge gradio_app and inference-cli. However, because the code is too long and I didn't pay for GPT, it managed to make only the basic function, without podcast and emotions. For some reason, it still gets slightly better results with just inference-cli.
will close this issue, feel free to reopen if further questions~
Hi, I did everything right and still get errors. I tried reinstalling from zero 3 times; CUDA is 11.8 and I have the latest version of ffmpeg. I updated pip and Anaconda. My GPU driver is updated. I'm using Windows 10. I also tried to purge the cache, without results. I installed Python 3.10 in the environment.
At first, I thought it was a file issue, so I tried a bunch of different allowed formats, but after the second re-installation I just tried the sample audio you provided and still got exactly the same errors. I installed everything using the requirements file. On my last attempt I also tried moving the git folder from the admin drive C to the regular drive E, which didn't help. Running it with administrator rights didn't help either.
This is all the errors:
"
C:\Users{username}\anaconda3\envs\f5\lib\site-packages\transformers\models\whisper\generation_whisper.py:496: FutureWarning: The input name `inputs` is deprecated. Please make sure to use `input_features` instead.
  warnings.warn(
You have passed task=transcribe, but also have set `forced_decoder_ids` to [[1, None], [2, 50359]] which creates a conflict. `forced_decoder_ids` will be ignored in favor of task=transcribe.
C:\Users{username}\anaconda3\envs\f5\lib\site-packages\transformers\models\whisper\modeling_whisper.py:599: UserWarning: 1Torch was not compiled with flash attention. (Triggered internally at ..\aten\src\ATen\native\transformers\cuda\sdp_utils.cpp:455.)
  attn_output = torch.nn.functional.scaled_dot_product_attention(
Passing a tuple of `past_key_values` is deprecated and will be removed in Transformers v4.43.0. You should pass an instance of `EncoderDecoderCache` instead, e.g. `past_key_values=EncoderDecoderCache.from_legacy_cache(past_key_values)`.
The attention mask is not set and cannot be inferred from input because pad token is same as eos token. As a consequence, you may observe unexpected behavior. Please pass your input's `attention_mask` to obtain reliable results.
Traceback (most recent call last):
  File "C:\Users{username}\anaconda3\envs\f5\lib\site-packages\gradio\queueing.py", line 536, in process_events
    response = await route_utils.call_process_api(
  File "C:\Users{username}\anaconda3\envs\f5\lib\site-packages\gradio\route_utils.py", line 322, in call_process_api
    output = await app.get_blocks().process_api(
  File "C:\Users{username}\anaconda3\envs\f5\lib\site-packages\gradio\blocks.py", line 1935, in process_api
    result = await self.call_function(
  File "C:\Users{username}\anaconda3\envs\f5\lib\site-packages\gradio\blocks.py", line 1520, in call_function
    prediction = await anyio.to_thread.run_sync(  # type: ignore
  File "C:\Users{username}\anaconda3\envs\f5\lib\site-packages\anyio\to_thread.py", line 56, in run_sync
    return await get_async_backend().run_sync_in_worker_thread(
  File "C:\Users{username}\anaconda3\envs\f5\lib\site-packages\anyio\_backends\_asyncio.py", line 2441, in run_sync_in_worker_thread
    return await future
  File "C:\Users{username}\anaconda3\envs\f5\lib\site-packages\anyio\_backends\_asyncio.py", line 943, in run
    result = context.run(func, *args)
  File "C:\Users{username}\anaconda3\envs\f5\lib\site-packages\gradio\utils.py", line 826, in wrapper
    response = f(*args, **kwargs)
  File "E:\F5-TTS\gradio_app.py", line 66, in infer
    final_wave, final_sample_rate, combined_spectrogram = infer_process(
  File "E:\F5-TTS\model\utils_infer.py", line 214, in infer_process
    return infer_batch_process(
  File "E:\F5-TTS\model\utils_infer.py", line 307, in infer_batch_process
    final_wave = generated_waves[0]
IndexError: list index out of range
"
Btw, "generated_waves" is empty, if that's important. I know because, on my very first attempt, I was using GPT and it suggested printing "generated_waves" to check whether it's really empty, so I added a print line before the error, and yes, it was empty.
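Since the crash boils down to indexing an empty list, a guard at that spot would at least replace the bare IndexError with a message pointing at the real failure (no audio produced upstream). This is a hypothetical helper for illustration, not the actual utils_infer.py code:

```python
def first_generated_wave(generated_waves):
    """Return the first generated waveform, or fail with a descriptive
    error instead of a bare IndexError when the list is empty."""
    if not generated_waves:
        raise RuntimeError(
            "infer_batch_process produced no audio: generated_waves is "
            "empty. Check that the reference text and prompt audio were "
            "accepted by the pipeline upstream."
        )
    return generated_waves[0]

# With at least one waveform it behaves exactly like generated_waves[0].
print(first_generated_wave([[0.0, 0.1, -0.1]]))  # [0.0, 0.1, -0.1]

# With an empty list it raises RuntimeError instead of IndexError.
try:
    first_generated_wave([])
except RuntimeError as exc:
    print("caught:", exc)
```

A guard like this wouldn't fix the bug, but it would show whether the text chunking or the generation step is silently dropping every batch.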