DakeQQ opened this issue 1 month ago
@DakeQQ Thanks a lot! We also tried Int8 and found it slower. Hoping someone capable can help take a look ~
Thank you!
It's working, and I can even use onnxruntime-directml (package) to run this on my AMD GPU! For that, the provider of ort_session_A and ort_session_C needs to be forced to ['CPUExecutionProvider'], but ort_session_B can use ['DmlExecutionProvider', 'CPUExecutionProvider'], and it's blazing fast vs CPU. Funny that this works, yet I cannot get torch_directml to work with the base .safetensors model (in gradio_app.py) no matter what I tried.
I'm facing a problem though: the outputs are always in Chinese... What do I need to change in 'Export_F5.py' to make this work for English?
Thank you for your testing. However, the setup for the English version may need to be answered by the original author of the F5-TTS project. The code for ONNX export and execution is based on the original work.
According to my tests, ort_session_A and ort_session_C together take up less than 1% of the time cost, while ort_session_B occupies the majority of the time.
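If anyone wants to verify that split on their own machine, here is a minimal, hedged timing sketch; the ort_session_A/B/C names match this thread, while the feeds_* dictionaries are placeholders for whatever input dicts your copy of Export_F5.py builds:

```python
import time
import onnxruntime as ort

def timed_run(session: ort.InferenceSession, feeds: dict, label: str):
    """Run one ONNX Runtime session and print its wall-clock time,
    to see which of the three models dominates inference."""
    start = time.perf_counter()
    outputs = session.run(None, feeds)   # None -> return all outputs
    print(f"{label}: {time.perf_counter() - start:.3f} s")
    return outputs

# Usage (placeholders -- build the feed dicts the same way the
# inference section of Export_F5.py does):
# out_a = timed_run(ort_session_A, feeds_a, "ort_session_A (preprocess)")
# out_b = timed_run(ort_session_B, feeds_b, "ort_session_B (transformer step)")
# out_c = timed_run(ort_session_C, feeds_c, "ort_session_C (decode)")
```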
Yes, and that is why inference speed is pretty much not affected by setting those to CPU. ort_session_B is what matters, and it runs fine on AMD GPUs using onnxruntime-directml!
Anyway, I've tried messing around with the vocab and of course the reference audio and text, but the speaker always tries to speak Chinese, even when the ref text+audio and gen_text are in English. It may be worth noting this has nothing to do with the fact I'm using DirectML, because it also happened before I even tried that.
Looking forward to getting this working in English... @SWivid please check this out when you have time. Thanks once again!
Hello~ The issue with the English voice should have been resolved. Please try again using the latest F5-TTS-ONNX version. @GreenLandisaLie
It's working now in both Chinese and English! Thanks!
@SWivid Maybe it's worth adding an 'ONNX' branch at https://huggingface.co/SWivid/F5-TTS/tree/main.
@GreenLandisaLie Yes, the onnx version is great!
Maybe it's better for @DakeQQ to do that? We will also add a link to that ONNX repo (currently we credit and link to the F5-TTS-ONNX repo).
Can someone share the ONNX export? I would love to try it out! Thanks
If anyone would be willing to run me through how to do this and get it working on my Win10 5700xt, I would be eternally grateful (well, at least until the next TTS upgrade comes out).
@KungFuFurniture see this repo. I haven't tried it in a few days, but it seems there have been some updates.
Yes, I saw that. Cloned the repo, changed some path directories in the export.py... But now I'm lost. I am really new to all this (maybe a year or so), so I am not 100% sure what I'm getting wrong.
Traceback (most recent call last):
File "D:\Games\F5\F5-TTS1\src\f5_tts\export_f5.py", line 316, in <module>
torch.onnx.export(
File "D:\Games\F5\F5-TTS\env\lib\site-packages\torch\onnx\utils.py", line 551, in export
_export(
File "D:\Games\F5\F5-TTS\env\lib\site-packages\torch\onnx\utils.py", line 1648, in _export
graph, params_dict, torch_out = _model_to_graph(
File "D:\Games\F5\F5-TTS\env\lib\site-packages\torch\onnx\utils.py", line 1170, in _model_to_graph
graph, params, torch_out, module = _create_jit_graph(model, args)
File "D:\Games\F5\F5-TTS\env\lib\site-packages\torch\onnx\utils.py", line 1046, in _create_jit_graph
graph, torch_out = _trace_and_get_graph_from_model(model, args)
File "D:\Games\F5\F5-TTS\env\lib\site-packages\torch\onnx\utils.py", line 950, in _trace_and_get_graph_from_model
trace_graph, torch_out, inputs_states = torch.jit._get_trace_graph(
File "D:\Games\F5\F5-TTS\env\lib\site-packages\torch\jit\_trace.py", line 1497, in _get_trace_graph
outs = ONNXTracedModule(
File "D:\Games\F5\F5-TTS\env\lib\site-packages\torch\nn\modules\module.py", line 1553, in _wrapped_call_impl
return self._call_impl(*args, **kwargs)
File "D:\Games\F5\F5-TTS\env\lib\site-packages\torch\nn\modules\module.py", line 1562, in _call_impl
return forward_call(*args, **kwargs)
File "D:\Games\F5\F5-TTS\env\lib\site-packages\torch\jit\_trace.py", line 141, in forward
graph, out = torch._C._create_graph_by_tracing(
File "D:\Games\F5\F5-TTS\env\lib\site-packages\torch\jit\_trace.py", line 132, in wrapper
outs.append(self.inner(*trace_inputs))
File "D:\Games\F5\F5-TTS\env\lib\site-packages\torch\nn\modules\module.py", line 1553, in _wrapped_call_impl
return self._call_impl(*args, **kwargs)
File "D:\Games\F5\F5-TTS\env\lib\site-packages\torch\nn\modules\module.py", line 1562, in _call_impl
return forward_call(*args, **kwargs)
File "D:\Games\F5\F5-TTS\env\lib\site-packages\torch\nn\modules\module.py", line 1543, in _slow_forward
result = self.forward(*input, **kwargs)
File "D:\Games\F5\F5-TTS1\src\f5_tts\export_f5.py", line 154, in forward
pred = self.f5_transformer(x=noise, cond=cat_mel_text, cond_drop=cat_mel_text_drop, time=self.time_expand[:, time_step], rope_cos=rope_cos, rope_sin=rope_sin, qk_rotated_empty=qk_rotated_empty)
File "D:\Games\F5\F5-TTS\env\lib\site-packages\torch\nn\modules\module.py", line 1553, in _wrapped_call_impl
return self._call_impl(*args, **kwargs)
File "D:\Games\F5\F5-TTS\env\lib\site-packages\torch\nn\modules\module.py", line 1562, in _call_impl
return forward_call(*args, **kwargs)
File "D:\Games\F5\F5-TTS\env\lib\site-packages\torch\nn\modules\module.py", line 1543, in _slow_forward
result = self.forward(*input, **kwargs)
TypeError: DiT.forward() got an unexpected keyword argument 'cond_drop'
This is my error message.
The error message "DiT.forward() got an unexpected keyword argument 'cond_drop'" shows that the export process used the original (unmodified) code.
First, we use shutil.copyfile (Export_F5.py, lines 77-82) to replace the original code with the modified version. Ensure that the modified Python scripts are stored in the 'modeling_modified' folder.
shutil.copyfile(modified_path + 'vocos/heads.py', python_package_path + '/vocos/heads.py')
shutil.copyfile(modified_path + 'vocos/models.py', python_package_path + '/vocos/models.py')
shutil.copyfile(modified_path + 'vocos/modules.py', python_package_path + '/vocos/modules.py')
shutil.copyfile(modified_path + 'vocos/pretrained.py', python_package_path + '/vocos/pretrained.py')
shutil.copyfile(modified_path + 'F5/modules.py', F5_project_path + '/model/modules.py')
shutil.copyfile(modified_path + 'F5/dit.py', F5_project_path + '/model/backbones/dit.py')
(We may have accidentally deleted some code. Please fetch the latest code and try again.)
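In case it helps anyone hitting the same 'cond_drop' error: a small sanity-check sketch to confirm the copy actually reached the file being imported. It only assumes the copy destination shown above (F5_project_path + '/model/backbones/dit.py'); the path value is a placeholder, and 'cond_drop' is used as a marker because it only appears in the modified forward().

```python
# Hypothetical sanity check -- set F5_project_path to the same value used in Export_F5.py.
F5_project_path = "/path/to/F5-TTS"

dit_path = F5_project_path + "/model/backbones/dit.py"
with open(dit_path, encoding="utf-8") as f:
    source = f.read()

# The ONNX-modified dit.py accepts a 'cond_drop' argument; the original does not.
print("Modified dit.py in place:", "cond_drop" in source)
```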
So I did a complete start-over. Grabbed a fresh F5, fresh venv, grabbed the link above, changed the file locations from user Dake... It seems my file structure and some names are a bit different, and I believe that is getting me into some trouble. For example:
from src.f5_tts.model import CFM, DiT
from src.f5_tts.infer.utils_infer import load_checkpoint
load_checkpoint is in utils_infer, not model.utils, in my version of the F5 repo. But I believe I have found most of those things. Now I am stuck here:
Traceback (most recent call last):
File "D:\Games\TTS\F5-TTS\export_f5.py", line 14, in <module>
from src.f5_tts.infer.utils_infer import load_checkpoint
File "D:\Games\TTS\F5-TTS\src\f5_tts\infer\utils_infer.py", line 32, in <module>
vocos = Vocos.from_pretrained("charactr/vocos-mel-24khz")
File "D:\Games\TTS\F5-TTS\env\lib\site-packages\vocos\pretrained.py", line 69, in from_pretrained
model = cls.from_hparams(config_path)
File "D:\Games\TTS\F5-TTS\env\lib\site-packages\vocos\pretrained.py", line 54, in from_hparams
with open(config_path, "r") as f:
FileNotFoundError: [Errno 2] No such file or directory: 'charactr/vocos-mel-24khz/config.yaml'
I mean, I have the config and pytorch_model, but I can't figure out where to put 'em. I have tried about 16 different folders, from a cached huggingface folder to the aforementioned infer folder. I dunno. I don't know anything about vocos, and its lil brick road is far from Yellow. I fell outta Kansas quick.
replace
vocos = Vocos.from_pretrained("charactr/vocos-mel-24khz")
with
vocos = Vocos.from_hparams(f"{local_path}/config.yaml")
state_dict = torch.load(f"{local_path}/pytorch_model.bin", map_location=device)
vocos.load_state_dict(state_dict)
vocos.eval()
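For anyone else hitting the FileNotFoundError, here is the same replacement as a self-contained sketch; local_path is a placeholder for wherever you downloaded the charactr/vocos-mel-24khz files:

```python
import torch
from vocos import Vocos

local_path = "/path/to/vocos-mel-24khz"   # folder containing config.yaml and pytorch_model.bin
device = "cpu"

# Build the model from the local config instead of pulling it from the HuggingFace Hub,
# then load the downloaded weights manually.
vocos = Vocos.from_hparams(f"{local_path}/config.yaml")
state_dict = torch.load(f"{local_path}/pytorch_model.bin", map_location=device)
vocos.load_state_dict(state_dict)
vocos.eval()
```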
Alright, making progress. Thank you for the help. After defining the local_path, I got the DiT 'cond_drop' error again. Compared the two dit.py files; they are the same, so it did copy. I ran it again... and got a different error.
Traceback (most recent call last):
File "D:\Games\TTS\F5-TTS\export_f5.py", line 13, in <module>
from src.f5_tts.model import CFM, DiT
File "D:\Games\TTS\F5-TTS\src\f5_tts\model\__init__.py", line 1, in <module>
from f5_tts.model.cfm import CFM
File "D:\Games\TTS\F5-TTS\src\f5_tts\model\__init__.py", line 4, in <module>
from f5_tts.model.backbones.dit import DiT
File "D:\Games\TTS\F5-TTS\src\f5_tts\model\backbones\dit.py", line 16, in <module>
from model.modules import (
ModuleNotFoundError: No module named 'model'
As you can see in the path, 'model' is there, 'modules' is within it, and so are the functions we are after. So I added the following line to dit.py, as I used that once in a different project to resolve a similar "can't find the module" issue.
sys.path.append(os.path.abspath(os.path.join(os.path.dirname(__file__), '../..')))
That did not help...
Traceback (most recent call last):
File "D:\Games\TTS\F5-TTS\export_f5.py", line 13, in <module>
from src.f5_tts.model import CFM, DiT
File "D:\Games\TTS\F5-TTS\src\f5_tts\model\__init__.py", line 1, in <module>
from f5_tts.model.cfm import CFM
File "D:\Games\TTS\F5-TTS\src\f5_tts\model\__init__.py", line 4, in <module>
from f5_tts.model.backbones.dit import DiT
File "D:\Games\TTS\F5-TTS\src\f5_tts\model\backbones\dit.py", line 16, in <module>
from model.modules import (
File "D:\Games\TTS\F5-TTS\src\f5_tts\model\__init__.py", line 4, in <module>
from f5_tts.model.backbones.dit import DiT
ImportError: cannot import name 'DiT' from partially initialized module 'f5_tts.model.backbones.dit' (most likely due to a circular import) (D:\Games\TTS\F5-TTS\src\f5_tts\model\backbones\dit.py)
But hey new errors are progress right?
The error is literally due to a circular import. The fix is not sys.path.append(os.path.abspath(os.path.join(os.path.dirname(__file__), '../..'))), but changing
from model.modules import (
to
from f5_tts.model.modules import (
We have reorganized the repo to make it compatible with package form; check the latest version.
Git pulled, got an update... Same thing
(env) D:\Games\TTS\F5-TTS>python export_f5.py
Traceback (most recent call last):
File "D:\Games\TTS\F5-TTS\export_f5.py", line 13, in <module>
from src.f5_tts.model import CFM, DiT
File "D:\Games\TTS\F5-TTS\src\f5_tts\model\__init__.py", line 1, in <module>
from f5_tts.model.cfm import CFM
File "D:\Games\TTS\F5-TTS\src\f5_tts\model\__init__.py", line 4, in <module>
from f5_tts.model.backbones.dit import DiT
File "D:\Games\TTS\F5-TTS\src\f5_tts\model\backbones\dit.py", line 15, in <module>
from model.modules import (
ModuleNotFoundError: No module named 'model'
The point is: when replacing the modified script for ONNX compatibility, e.g. Export_ONNX/F5_TTS/modeling_modified/F5/dit.py, you need to keep an eye on differences like
https://github.com/DakeQQ/F5-TTS-ONNX/blob/259d6198b6e91d6911bbd1f1e3a5ca96c0d21711/Export_ONNX/F5_TTS/modeling_modified/F5/dit.py#L16
Just put the two repos together and take a while to look into it; you'll get it.
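One low-tech way to "keep an eye on the differences" is simply to diff the two dit.py files; a rough sketch using Python's difflib, where both paths are placeholders for local checkouts of the two repos:

```python
import difflib

# Placeholder paths -- point these at your local checkouts.
original_dit = "F5-TTS/src/f5_tts/model/backbones/dit.py"
modified_dit = "F5-TTS-ONNX/Export_ONNX/F5_TTS/modeling_modified/F5/dit.py"

with open(original_dit, encoding="utf-8") as f:
    original_lines = f.readlines()
with open(modified_dit, encoding="utf-8") as f:
    modified_lines = f.readlines()

# Print a unified diff so the ONNX-specific changes (imports, forward() signature, ...)
# stand out before you start copying files around.
for line in difflib.unified_diff(original_lines, modified_lines,
                                 fromfile=original_dit, tofile=modified_dit):
    print(line, end="")
```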
@KungFuFurniture You just need to replace the existing F5 repo files with the equivalent ones from the ONNX export repo, and do the same for the vocos package installation files as well (...\Lib\site-packages\vocos). Place Export_F5.py directly in the root F5 folder (where gradio_app.py is), activate the F5 environment, then run it. Once converted, replace the files you replaced with their original counterparts (do a new install if you must). I think you have most of this figured out by now.
Just want to add one important thing: if you want to run this on an AMD GPU, you might need to run 'pip uninstall onnxruntime' and then 'pip install onnxruntime-directml', and change the inference code by setting ort_session_B's providers to ['DmlExecutionProvider', 'CPUExecutionProvider'] (a rough sketch of the session setup follows below). The inference code for ONNX is essentially the last part of the Export_F5.py script, and if you want to run it with gradio, just make a copy of the gradio_app.py file, add 'import onnxruntime' and 'import jieba', followed by all of the necessary changes, which are a bit too many for me to list. But in essence, you just need to replace the original PyTorch inference code with the ONNX equivalent, remove spectrogram inputs and outputs from gradio as well as its functions during inference, and force-load your ONNX models while ignoring the other PyTorch ones... that's pretty much it.
PS: this is how I did it a week ago, but the Export_F5.py file has been changed many times since then, so this might no longer work. Additionally, at the time, the Export_F5.py file did not contain the necessary audio transformations that allow for invalid-format .wav reference audio files, so I had to copy-paste those from the original code. You might or might not need to do this as well. Good luck :D Hopefully someone will release the converted .onnx models with a pipeline for them, so it will be easy to use in the future.
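To make the provider split concrete, here is a rough sketch of how the three sessions might be created. This is not the exact inference code from Export_F5.py; the .onnx file names are the ones used later in this thread, and onnxruntime-directml must be installed for DmlExecutionProvider to be available:

```python
import onnxruntime as ort

session_opts = ort.SessionOptions()
session_opts.graph_optimization_level = ort.GraphOptimizationLevel.ORT_ENABLE_ALL

# A and C are cheap preprocess/decode steps -- keep them on CPU.
ort_session_A = ort.InferenceSession("F5_Preprocess.onnx", sess_options=session_opts,
                                     providers=["CPUExecutionProvider"])
# B is the transformer loop and dominates runtime -- run it on the AMD GPU via
# DirectML, with CPU as a fallback.
ort_session_B = ort.InferenceSession("F5_Transformer.onnx", sess_options=session_opts,
                                     providers=["DmlExecutionProvider", "CPUExecutionProvider"])
ort_session_C = ort.InferenceSession("F5_Decode.onnx", sess_options=session_opts,
                                     providers=["CPUExecutionProvider"])

print(ort_session_B.get_providers())   # confirm DmlExecutionProvider is actually active
```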
@KungFuFurniture We are very sorry for your poor experience. Due to the rapid updates of the original work, we were unable to update in time. Now, we have adapted and tested the export for the latest SWivid/F5-TTS. Please download the F5-TTS-ONNX export code again and try it once more.
Please note that we load vocos with a modified method, applied by the following code at line 52:
shutil.copyfile(modified_path + '/vocos/pretrained.py', python_package_path + '/vocos/pretrained.py')
If you can access the HuggingFace repository 'charactr/vocos-mel-24khz' directly, you can disable that line of code and re-install the vocos Python package (since it may have been modified). Then set vocos_model_path = 'charactr/vocos-mel-24khz'.
First let me say to everyone, Thank you for the help.
@DakeQQ Certainly not a poor experience, but a learning experience. I certainly appreciate the work you have done here. An effort is Awesome.
So I made the execution provider change to "B" as suggested. I got the Export.py to run successfully. I swapped back all the files it changed, both vocos and F5 (modules, pretrained, etc.).
@GreenLandisaLie I have onnxruntime-directml (torch too). gradio_app.py is no longer a thing, but there is an alternative. I am not sure that's where the change needs to be made any longer.
So here is where I am: the export seems to have worked, and I can still run the app, and it works. But it works exactly the same, not using the GPU (AMD 5700xt). That is, I am sure, a result of what Green mentioned about adjustments to app.py.
I feel like such a kindergartner in college. I am so far in over my head, gang. I learned Python from YouTube, lol. I know nothing about ONNX or torch, except that they help make the magic work.
So any suggestions on what to do next... ? Again all help is super appreciated. And I get it if you don't have time to educate me.
Cheers to all.
@KungFuFurniture If you're a beginner, it's advisable to start with simpler models like YOLO-v9, which are well-suited for NPUs and GPUs due to their GPU-friendly architecture.
For the export itself, a few tips (see the sketch after this list):
- Export with dynamic_axes=None. This increases the likelihood of successfully building the GPU code.
- Use onnxsim (pip install onnxsim) to simplify the exported model.
- Inspect the result with the Netron tool. If all operator node input/output shapes are numeric, it indicates a high probability of successful GPU execution.
- Additionally, set the ONNX Runtime log level to 0 or 1 with session_opts.log_severity_level = 0. This provides detailed error reports from ONNX, which can be used to seek help from ChatGPT. Following these error reports should help you resolve most issues.
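A minimal sketch of the programmatic tips above (export with static shapes, simplify, verbose logging). The tiny placeholder model, dummy inputs, and opset version are illustrative only, not values from Export_F5.py; Netron is a separate GUI tool and is not shown here:

```python
import torch
import onnx
import onnxruntime as ort
from onnxsim import simplify

# Placeholder model/inputs just to keep the sketch self-contained;
# in practice these would be the F5 sub-models and their example inputs.
model = torch.nn.Linear(16, 16).eval()
dummy_inputs = (torch.randn(1, 16),)

# 1) Export with fixed shapes (dynamic_axes=None): static shapes tend to build
#    more reliably on GPU/NPU backends.
torch.onnx.export(model, dummy_inputs, "model.onnx",
                  opset_version=17, dynamic_axes=None)

# 2) Simplify the exported graph with onnxsim.
simplified, ok = simplify(onnx.load("model.onnx"))
assert ok, "onnxsim could not validate the simplified model"
onnx.save(simplified, "model_sim.onnx")

# 3) Raise the ONNX Runtime log verbosity to get detailed error reports.
session_opts = ort.SessionOptions()
session_opts.log_severity_level = 0   # 0 = verbose, 1 = info
session = ort.InferenceSession("model_sim.onnx", sess_options=session_opts,
                               providers=["CPUExecutionProvider"])
```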
It looks like the repo has changed a lot since the last ONNX export attempt. I'm getting this error when trying to export to ONNX after replacing the modified vocos and F5 files.
RuntimeError: Error(s) in loading state_dict for CFM:
Missing key(s) in state_dict: "mel_spec.mel_stft.spectrogram.window", "mel_spec.mel_stft.mel_scale.fb".
Any ideas?
@amblamps This should have been fixed by @DakeQQ, many thanks! Mainly through the change from 712d52772ef496b6cd191ba6197bac6e112fddd8 to 315230210d6698a6ce01da669c0fe4085accb693 at https://github.com/SWivid/F5-TTS/blob/4a69e6bad29dcb499e5cdec4104325f733eb485c/src/f5_tts/model/modules.py#L30-L143
@amblamps You can directly disable lines 164-166 of src/f5_tts/infer/utils_infer.py, or use the latest export code and try once more:
# for key in ["mel_spec.mel_stft.mel_scale.fb", "mel_spec.mel_stft.spectrogram.window"]:
# if key in checkpoint["model_state_dict"]:
# del checkpoint["model_state_dict"][key]
Thanks! That worked.
Has anyone shared a recent ONNX export and code for inference?
@DakeQQ Do any other modifications need to be made to the script to export the E2 TTS model aside from pointing it to the correct checkpoint?
We have not yet attempted to export the E2-TTS model. If its function call path is the same as that of F5-TTS, theoretically, only modifying the model file path would be necessary to make the corresponding adjustments. However, the actual situation may be more complex, so we currently do not have specific plans to export E2-TTS in ONNX format.
There still seem to be issues with the mel params; has anyone been able to export recently?
@smickovskid What mel parameter issues are you encountering? Could the STFT_Process.py script resolve them?
Getting the same issue that @amblamps encountered
Traceback (most recent call last):
File "F5-TTS-ONNX/Export_ONNX/F5_TTS/Export_F5.py", line 273, in <module>
f5_model = load_model(F5_safetensors_path)
File "F5-TTS-ONNX/Export_ONNX/F5_TTS/Export_F5.py", line 202, in load_model
return load_checkpoint(model, ckpt_path, 'cpu', use_ema=True)
File "F5-TTS/src/f5_tts/infer/utils_infer.py", line 168, in load_checkpoint
model.load_state_dict(checkpoint["model_state_dict"])
File "/miniconda3/envs/f5-tts/lib/python3.10/site-packages/torch/nn/modules/module.py", line 2189, in load_state_dict
raise RuntimeError('Error(s) in loading state_dict for {}:\n\t{}'.format(
RuntimeError: Error(s) in loading state_dict for CFM:
Missing key(s) in state_dict: "mel_spec.mel_stft.spectrogram.window", "mel_spec.mel_stft.mel_scale.fb".
I am using a custom fine-tuned model. I also ran STFT_Process.py but am still getting the same error.
This is my Export_F5.py config:
F5_project_path = "/home/smickovskid/ai/F5-TTS" # The F5-TTS Github project download path. URL: https://github.com/SWivid/F5-TTS
F5_safetensors_path = "/home/smickovskid/ai/F5-TTS/ckpts/ClapTrap/model_last.pt" # The F5-TTS model download path. URL: https://huggingface.co/SWivid/F5-TTS/tree/main/F5TTS_Base
vocos_model_path = "/home/smickovskid/ai/F5-TTS-ONNX/vocos" # The Vocos model download path. URL: https://huggingface.co/charactr/vocos-mel-24khz/tree/main
onnx_model_A = "/home/smickovskid/ai/F5-TTS-ONNX/F5_Preprocess.onnx" # The exported onnx model path.
onnx_model_B = "/home/smickovskid/ai/F5-TTS-ONNX/F5_Transformer.onnx" # The exported onnx model path.
onnx_model_C = "/home/smickovskid/ai/F5-TTS-ONNX/F5_Decode.onnx" # The exported onnx model path.
python_package_path = '/home/smickovskid/miniconda3/envs/f5-tts/lib/python3.10/site-packages' # The Python package path.
modified_path = '/home/smickovskid/ai/F5-TTS-ONNX/Export_ONNX/F5_TTS/modeling_modified'
reference_audio = "/home/smickovskid/ai/F5-TTS/ckpts/ClapTrap/samples/step_20000_ref.wav" # The reference audio path.
generated_audio = "/home/smickovskid/ai/F5-TTS/ckpts/ClapTrap/samples/step_20000_gen.wav" # The generated audio path.
ref_text = "Sanctuary. This Glacier's full of nothing but murderers or jerkbags, like that Hammerlock dude. Minion! I've got my eyesight back, and you're far uglier than I remembered. Anyway, it's time to get to the Resistance in Sanctuary!"
gen_text = "Sanctuary. This Glacier's full of nothing but murderers or jerkbags, like that Hammerlock dude. Minion! I've got my eyesight back, and you're far uglier than I remembered. Anyway, it's time to get to the Resistance in Sanctuary!"
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
I am running python Export_ONNX/F5_TTS/Export_F5.py as the command.
Edit: I've changed it to model.load_state_dict(checkpoint["model_state_dict"], strict=False) and it passes now, but it fails further down the line with:
2024-11-17 03:26:34.026251302 [E:onnxruntime:, sequential_executor.cc:516 ExecuteKernel] Non-zero status code returned while running Mul node. Name:'/f5_transformer/transformer_blocks.0/attn/Mul_15' Status Message: /onnxruntime_src/onnxruntime/core/providers/cpu/math/element_wise_ops.h:560 void onnxruntime::BroadcastIterator::Append(ptrdiff_t, ptrdiff_t) axis == 1 || axis == largest was false. Attempting to broadcast an axis by a dimension other than 1. 2048 by 2814
Traceback (most recent call last):
File "/home/smickovskid/ai/F5-TTS-ONNX/Export_ONNX/F5_TTS/Export_F5.py", line 467, in <module>
noise = ort_session_B.run(
File "/home/smickovskid/miniconda3/envs/f5-tts/lib/python3.10/site-packages/onnxruntime/capi/onnxruntime_inference_collection.py", line 266, in run
return self._sess.run(output_names, input_feed, run_options)
onnxruntime.capi.onnxruntime_pybind11_state.RuntimeException: [ONNXRuntimeError] : 6 : RUNTIME_EXCEPTION : Non-zero status code returned while running Mul node. Name:'/f5_transformer/transformer_blocks.0/attn/Mul_15' Status Message: /onnxruntime_src/onnxruntime/core/providers/cpu/math/element_wise_ops.h:560 void onnxruntime::BroadcastIterator::Append(ptrdiff_t, ptrdiff_t) axis == 1 || axis == largest was false. Attempting to broadcast an axis by a dimension other than 1. 2048 by 2814
@smickovskid Apologies for the delayed response. The main issue is that your audio input exceeds the maximum length defined in the exported ONNX model settings. Specifically, MAX_SIGNAL_LENGTH = 2048 (set at line 68 in Export_F5.py), while your audio, after the STFT process, has a length of 2814. Please re-export all ONNX models with an appropriately larger value for MAX_SIGNAL_LENGTH.
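A rough way to sanity-check the new value before re-exporting: estimate how many mel frames your reference audio occupies and make sure MAX_SIGNAL_LENGTH comfortably exceeds the combined reference-plus-generated length. The 24 kHz sample rate and hop length of 256 are assumptions based on the usual F5-TTS / vocos-mel-24khz settings; adjust if your config differs:

```python
import torchaudio

SAMPLE_RATE = 24000       # assumed F5-TTS mel sample rate
HOP_LENGTH = 256          # assumed STFT hop length
MAX_SIGNAL_LENGTH = 2048  # value at line 68 of Export_F5.py

audio, sr = torchaudio.load("/path/to/reference.wav")   # placeholder path
num_samples = audio.shape[-1] * SAMPLE_RATE // sr        # length after resampling to 24 kHz
ref_frames = num_samples // HOP_LENGTH

# The generated part adds frames on top of the reference (roughly scaling with the
# ratio of gen_text to ref_text length), so leave generous headroom.
print(f"reference frames ~ {ref_frames}, current model limit = {MAX_SIGNAL_LENGTH}")
if 2 * ref_frames > MAX_SIGNAL_LENGTH:
    print("Re-export with a larger MAX_SIGNAL_LENGTH.")
```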
Hey @DakeQQ, sorry for the late response. Yeah that fixed it! Thanks for all the help.