escorciav opened this issue 1 year ago
After trial and error, patching the following file right before it raises the opset error,
I found out that the minimal opset to export T5 without error is 12, not 13 :). I'll post the issue in SNPE & AIMET so that both communities hammer at this issue together :mechanical_arm:. In the past, I usually had to refactor (rewrite) the modules so that they were compatible with older opsets.
# Hack in optimum/exporters/onnx/__main__.py: when opset 9 is requested for T5,
# bump it to 12, the lowest opset that exported without errors in my tests.
if 't5' in model.__class__.__name__.lower() and opset == 9:
    print('opset 9 did not work')
    opset = onnx_config.DEFAULT_ONNX_OPSET = 12
if opset < onnx_config.DEFAULT_ONNX_OPSET:
    raise ValueError(
        f"Opset {opset} is not sufficient to export {model.config.model_type}. "
        f"At least {onnx_config.DEFAULT_ONNX_OPSET} is required."
    )
Hi @michaelbenayoun, do you have any suggestions on how to provide support for older ONNX opset versions?
I believe you can reproduce the error with the following command:
optimum-cli export onnx --task text2text-generation-with-past --model t5-base checkpoints/t5-base_onnx/ --opset 9 --framework pt --optimize O3 --batch_size 1 --sequence_length 512 --atol 0.0001 --cache_dir ~/.cache/huggingface/hub
You should get an error like this:
[HACK @ Victor Escorcia] opset 9 is not supported as of 2023-06-08. By trial & error, opset=12 works.
/apps/bin/miniconda3/envs/on-device-llm/lib/python3.10/site-packages/transformers/models/t5/tokenization_t5_fast.py:155: FutureWarning: This tokenizer was incorrectly instantiated with a model max length of 512 which will be corrected in Transformers v5.
For now, this behavior is kept to avoid breaking backwards compatibility when padding/encoding with `truncation is True`.
- Be aware that you SHOULD NOT rely on t5-base automatically truncating your input to 512 when padding/encoding.
- If you want to encode/pad to sequences longer than 512 you can either instantiate this tokenizer with `model_max_length` or pass `max_length` when encoding/padding.
- To avoid this warning, please instantiate this tokenizer with `model_max_length` set to your preferred value.
warnings.warn(
Using framework PyTorch: 2.0.1+cu117
Overriding 1 configuration item(s)
- use_cache -> False
/apps/bin/miniconda3/envs/on-device-llm/lib/python3.10/site-packages/torch/onnx/utils.py:1636: UserWarning: The exported ONNX model failed ONNX shape inference.The model will not be executable by the ONNX Runtime.If this is unintended and you believe there is a bug,please report an issue at https://github.com/pytorch/pytorch/issues.Error reported by strict ONNX shape inference: [ShapeInferenceError] (op_type:Min, node name: /block.0/layer.0/SelfAttention/Min): data_0 typestr: T, has unsupported type: tensor(int64) (Triggered internally at ../torch/csrc/jit/serialization/export.cpp:1407.)
_C._check_onnx_proto(proto)
============= Diagnostic Run torch.onnx.export version 2.0.1+cu117 =============
verbose: False, log level: Level.ERROR
======================= 0 NONE 0 NOTE 0 WARNING 0 ERROR ========================
Traceback (most recent call last):
File "/apps/bin/miniconda3/envs/on-device-llm/bin/optimum-cli", line 8, in <module>
sys.exit(main())
File "/apps/bin/miniconda3/envs/on-device-llm/lib/python3.10/site-packages/optimum/commands/optimum_cli.py", line 163, in main
service.run()
File "/apps/bin/miniconda3/envs/on-device-llm/lib/python3.10/site-packages/optimum/commands/export/onnx.py", line 208, in run
main_export(
File "/apps/bin/miniconda3/envs/on-device-llm/lib/python3.10/site-packages/optimum/exporters/onnx/__main__.py", line 303, in main_export
_, onnx_outputs = export_models(
File "/apps/bin/miniconda3/envs/on-device-llm/lib/python3.10/site-packages/optimum/exporters/onnx/convert.py", line 609, in export_models
export(
File "/apps/bin/miniconda3/envs/on-device-llm/lib/python3.10/site-packages/optimum/exporters/onnx/convert.py", line 714, in export
config.fix_dynamic_axes(output, device=device, input_shapes=input_shapes, dtype=dtype)
File "/apps/bin/miniconda3/envs/on-device-llm/lib/python3.10/site-packages/optimum/exporters/onnx/base.py", line 255, in fix_dynamic_axes
session = InferenceSession(model_path.as_posix(), providers=providers, sess_options=session_options)
File "/apps/bin/miniconda3/envs/on-device-llm/lib/python3.10/site-packages/onnxruntime/capi/onnxruntime_inference_collection.py", line 383, in __init__
self._create_inference_session(providers, provider_options, disabled_optimizers)
File "/apps/bin/miniconda3/envs/on-device-llm/lib/python3.10/site-packages/onnxruntime/capi/onnxruntime_inference_collection.py", line 424, in _create_inference_session
sess = C.InferenceSession(session_options, self._model_path, True, self._read_config_from_model)
onnxruntime.capi.onnxruntime_pybind11_state.InvalidGraph: [ONNXRuntimeError] : 10 : INVALID_GRAPH : Load model from checkpoints/t5-base_onnx/encoder_model.onnx failed:This is an invalid model. Type Error: Type 'tensor(int64)' of input parameter (/block.0/layer.0/SelfAttention/Add_1_output_0) of operator (Min) in node (/block.0/layer.0/SelfAttention/Min) is invalid.
As I suspected, the traceback isn't that useful. I believe the error goes all the way down to an incorrect ONNX model export, which happens in export_pytorch.
After enabling the verbose export of ONNX, I could inspect the exported graph around the Min node that ONNX Runtime complains about.
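For reference, the verbose export here just means forwarding verbose=True to torch.onnx.export; a minimal, self-contained sketch (toy model, not the optimum code path):

import torch

# verbose=True makes torch.onnx.export print the exported graph, which is
# useful for locating the Min node named in the errors above.
model = torch.nn.Linear(4, 4)  # placeholder model just to show the flag
dummy_input = torch.randn(1, 4)

torch.onnx.export(
    model,
    (dummy_input,),
    "model.onnx",
    opset_version=9,
    verbose=True,
)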
Hi @escorciav,
To make this happen, you would need changes in both Transformers and Optimum. The PR in Transformers is unlikely to be merged if it involves too many changes in the modeling code. One solution for you would be to create your own OnnxConfig, as you suggested. This way you can export whatever you want without needing anyone to validate anything.
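Something along these lines could be a starting point (an untested sketch; the subclass name is made up, and how to plug it into the export pipeline depends on your optimum version):

# Sketch: a custom config that lowers the minimum opset accepted for T5.
# Assumption: the exporter's opset guard compares against DEFAULT_ONNX_OPSET.
from optimum.exporters.onnx.model_configs import T5OnnxConfig


class T5Opset9OnnxConfig(T5OnnxConfig):
    DEFAULT_ONNX_OPSET = 9  # let --opset 9 pass the guard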
Thanks a lot for the pointers!
Everything makes sense. I might keep reporting progress here for my own & everyone's benefit if that's OK.
- OnnxConfig to enable opset 9 for the export of T5

I have been hacking the optimum ONNX exporter & it seems that it is possible to export as ONNX with --opset 9. I managed to export T5-base to ONNX & convert it to DLC (the format used by SNPE). I used snpe-2.10.0.4541.
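For the ONNX → DLC step, the conversion calls were along these lines (I'm writing the -o/--output_path flag from memory, so double-check it against your SNPE version):

snpe-onnx-to-dlc -i encoder_model.onnx -o encoder_model.dlc
snpe-onnx-to-dlc -i decoder_model.onnx -o decoder_model.dlc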
If needed, the tricks are:
- Disable the dynamic axes biz in optimum/exporters/onnx/convert.py:export_pytorch:

# Excerpt from export_pytorch (convert.py); `chain`, `inputs`, `config`, etc.
# are already defined in that scope.
disable_dynamic_axes = True  # HACK: set to False to restore the default behavior
extra_kwargs = dict(
    export_params=True,
    # training=torch.onnx.TrainingMode.EVAL,
    # verbose=True,
)
if not disable_dynamic_axes:
    # Default behavior: forward the dynamic axes declared by the OnnxConfig.
    extra_kwargs['dynamic_axes'] = dict(chain(inputs.items(), config.outputs.items()))
else:
    print(f'[HACK] {disable_dynamic_axes=}')
    print(dict(chain(inputs.items(), config.outputs.items())))
onnx_export(
    model,
    (dummy_inputs,),
    f=output.as_posix(),
    input_names=input_names,
    output_names=output_names,
    do_constant_folding=True,
    opset_version=opset,
    **extra_kwargs,
)

- Drop the post-processing with --no-post-process.

You may face other errors, but perhaps they are irrelevant :clinking_glasses:
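Putting it together, the export command is essentially the reproduction command from above plus the extra flag (a sketch; I have not re-run this exact line):

optimum-cli export onnx --task text2text-generation-with-past --model t5-base checkpoints/t5-base_onnx/ --opset 9 --framework pt --optimize O3 --batch_size 1 --sequence_length 512 --atol 0.0001 --cache_dir ~/.cache/huggingface/hub --no-post-process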
@michaelbenayoun @fxmarty I feel that I could write the dynamic_axes biz as a custom OnnxConfig, but it's unclear to me how to do it. Any hint is highly appreciated :). Unfortunately, my company is blacklisted by the HF server, and we haven't been able to access the documentation or any HF webpage since Monday :sweat:
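My current (untested) attempt looks roughly like this; I'm assuming the inputs/outputs properties of the base OnnxConfig map input names to {axis_index: axis_name}, so returning empty mappings should leave nothing to pass as dynamic_axes:

# Sketch: keep the stock T5 input/output names but declare no symbolic axes,
# so the exporter has nothing to forward as dynamic_axes (static shapes).
from typing import Dict

from optimum.exporters.onnx.model_configs import T5OnnxConfig


class StaticShapeT5OnnxConfig(T5OnnxConfig):
    @property
    def inputs(self) -> Dict[str, Dict[int, str]]:
        return {name: {} for name in super().inputs}

    @property
    def outputs(self) -> Dict[str, Dict[int, str]]:
        return {name: {} for name in super().outputs}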
FWIW, I managed to export a HF transformers LlamaModel to ONNX with opset version 9. The tricky part is the data-dependent control flow in the modeling code (e.g. if input_shape[-1] > 1:); AFAIU ONNX discourages those.

You could rewrite the forward if the transformers package is picky with the PRs :blush:. That kind of control flow could also be handled with torch.jit.script, but currently it is not done in optimum; I would suggest to simply write your custom OnnxConfig for now.
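As a standalone illustration (a toy module, unrelated to the actual modeling code), this is the kind of branch that scripting can preserve and tracing cannot:

import torch


class Gate(torch.nn.Module):
    # Toy stand-in for a data-dependent branch like `if input_shape[-1] > 1:`.
    def forward(self, x: torch.Tensor) -> torch.Tensor:
        if x.shape[-1] > 1:
            return x * 2.0
        return x


# Scripting records the branch as real control flow; tracing would only
# record the path taken by the dummy input.
scripted = torch.jit.script(Gate())
torch.onnx.export(scripted, (torch.randn(1, 4),), "gate.onnx", opset_version=9)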
Hi @escorciav, do you mind sharing how you were able to export the T5/Llama models to opset 9? Even after modifying convert.py, I'm still getting the error when I try exporting with --opset 9.
Hi @escorciav, thanks for the reply! And sorry to bother you on your holiday!
I'm able to export to ONNX; the issue I'm facing is with exporting that to SNPE.
To convert, I'm trying to use snpe-onnx-to-dlc -i decoder_model.onnx, but I get an error:
Node SimplifiedLayerNormalization: 'No translation registered for op type onnx_simplifiedlayernormalization'
Did you encounter this? And if so, how did you work around it? I believe Qualcomm doesn't currently support that layer. Thanks!
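For what it's worth, the distinct op types in the exported graph (and hence the candidates the converter might reject) can be listed with the standard onnx Python API; a small helper, using the decoder file name from above:

# Count the distinct op types in the exported decoder.
from collections import Counter

import onnx

model = onnx.load("decoder_model.onnx")
op_counts = Counter(node.op_type for node in model.graph.node)
for op_type, count in sorted(op_counts.items()):
    print(f"{op_type}: {count}")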
FWIU from my previous messages 😊😅, I didn't. It's possible that I only exported the encoder to DLC. BTW, I stopped using T5 due to issues unrelated to this thread.
Are you using the latest version of SNPE? I suggest using the latest, as well as compatible versions of PyTorch, onnx, onnx-simplifier, etc. I have the hunch that these issues get solved over time :)
@andyxzhu apparently, I did get a DLC for the decoder. Thus, let's assume that I didn't get such an error. Happy to share the DLC for you to reverse-engineer the differences with respect to your model.
ATM, I'm not using SNPE, so I won't be able to guide you further. FWIW, I'm using Qualcomm QNN.
$ tree checkpoints/t5-base_onnx
checkpoints/t5-base_onnx
├── config.json
├── decoder_model.dlc
├── decoder_model.onnx
├── decoder_with_past_model.onnx
├── encoder_model.dlc
├── encoder_model.onnx
├── generation_config.json
├── log.txt
├── ort_config.json
├── special_tokens_map.json
├── spiece.model
├── tokenizer_config.json
└── tokenizer.json
Hi @escorciav, I've gotten T5 successfully exported to ONNX! In case someone else wants to do so as well, here are the additional steps I followed:
- In optimum/exporters/onnx/__main__.py, remove the check for the opset number (see the sketch below).
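For reference, "removing the check" amounts to neutralizing the guard quoted at the top of this thread, roughly like this (the exact surrounding code may differ between optimum versions):

# optimum/exporters/onnx/__main__.py -- relaxed guard (sketch)
if opset < onnx_config.DEFAULT_ONNX_OPSET:
    # Originally raised a ValueError; warn instead so the export proceeds.
    print(
        f"Opset {opset} is lower than the default minimum "
        f"{onnx_config.DEFAULT_ONNX_OPSET}; continuing anyway."
    )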
However, I'm still facing some issues with exporting Llama. You mentioned being able to do this here; could you share some more details?
Thanks!
Feature request
Export to ONNX fails for opset 9 with T5
Motivation
ONNX opset 9 is required by SNPE, Qualcomm's accelerator SDK. By supporting ONNX opset 9, we will unleash ML on the edge & on mobile phones.
Your contribution
I'm willing to give a hand, but help is needed as I'm not familiar with all the abstractions.
Details
Environment details
Some of the relevant dependencies: