Open · Ijustakid opened this issue 2 months ago
Which container are you using to run the demo? The error occurs during the ONNX export step, and it may be caused by the package versions in your environment: I'm seeing warnings for packages that I have not seen before with this demo (e.g. the TensorFlow warning about enabling oneDNN custom operations).
Could you try running with the latest TRT version in the container suggested in the demo README, and let us know if you still run into the issue? If there is some customization you are applying, please share additional information so we can reproduce it.
Maybe you could also check the versions of tokenizers and transformers against the README; a quick sketch for printing them is below.
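For example (a generic sketch, not specific to the demo):

```python
# Print the versions of the packages most relevant to the ONNX export path,
# so they can be compared against the versions pinned in the demo README.
import tokenizers
import torch
import transformers

print("torch       :", torch.__version__)
print("transformers:", transformers.__version__)
print("tokenizers  :", tokenizers.__version__)
```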
Description
When I use your demo/Diffusion/demo_txt2img_xl.py for INT8 inference, it reports the following error:

```
Invoked with: %338 : Tensor = onnx::Constant(), scope: transformers.models.clip.modeling_clip.CLIPTextModel::/transformers.models.clip.modeling_clip.CLIPTextTransformer::text_model/transformers.models.clip.modeling_clip.CLIPEncoder::encoder/transformers.models.clip.modeling_clip.CLIPEncoderLayer::layers.0/transformers.models.clip.modeling_clip.CLIPSdpaAttention::self_attn , 'value', 0.125 (Occurred when translating scaled_dot_product_attention).
```
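For reference, I believe the same failure can be reproduced outside the demo by exporting any module that passes an explicit float scale to scaled_dot_product_attention (a minimal sketch against torch 2.3.x as in my environment; this is not part of the demo code):

```python
import torch
import torch.nn.functional as F

class Attn(torch.nn.Module):
    def forward(self, q, k, v):
        # CLIPSdpaAttention passes an explicit Python-float scale
        # (0.125 == 1/sqrt(head_dim) for head_dim == 64).
        return F.scaled_dot_product_attention(q, k, v, scale=0.125)

q = k = v = torch.randn(1, 8, 77, 64)
# The opset-14 symbolic wraps the float scale into an onnx::Constant via a
# tensor-only attribute setter (z_), which raises the TypeError shown above.
torch.onnx.export(Attn(), (q, k, v), "attn.onnx", opset_version=17)
```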
Environment
TensorRT Version: 10.2
NVIDIA GPU: A100
NVIDIA Driver Version: 535.161.08
CUDA Version: 12.2
CUDNN Version: --
Operating System: ubuntu 20.04
Python Version (if applicable): 3.8.19
Tensorflow Version (if applicable): 2.12.0
PyTorch Version (if applicable): 2.3.1
Baremetal or Container (if so, version):
Relevant Files
cmd:

```
python3 demo_txt2img_xl.py "a photo of an astronaut riding a horse on mars" --version xl-1.0 --onnx-dir onnx-sdxl --engine-dir engine-sdxl --int8
```
Output:

```
2024-08-19 17:19:34.445825: I tensorflow/core/util/port.cc:110] oneDNN custom operations are on. You may see slightly different numerical results due to floating-point round-off errors from different computation orders. To turn them off, set the environment variable TF_ENABLE_ONEDNN_OPTS=0.
2024-08-19 17:19:35.351517: I tensorflow/core/platform/cpu_feature_guard.cc:182] This TensorFlow binary is optimized to use available CPU instructions in performance-critical operations. To enable the following instructions: AVX2 AVX512F AVX512_VNNI FMA, in other operations, rebuild TensorFlow with the appropriate compiler flags.
2024-08-19 17:19:40.233890: W tensorflow/compiler/tf2tensorrt/utils/py_utils.cc:38] TF-TRT Warning: Could not find TensorRT
[I] Initializing TensorRT accelerated StableDiffusionXL txt2img pipeline
[I] Autoselected scheduler: Euler
[I] Load CLIPTokenizer model from: pytorch_model/xl-1.0/XL_BASE/tokenizer
[I] Load CLIPTokenizer model from: pytorch_model/xl-1.0/XL_BASE/tokenizer_2
[I] Exporting ONNX model: onnx-sdxl/clip/model.onnx
[I] Load CLIPTextModel model from: pytorch_model/xl-1.0/XL_BASE/text_encoder
/sfs_cv/yhy3/conda/envs/trt_sd/lib/python3.8/site-packages/torch/onnx/utils.py:1547: OnnxExporterWarning: Exporting to ONNX opset version 19 is not supported. by 'torch.onnx.export()'. The highest opset version supported is 17. To use a newer opset version, consider 'torch.onnx.dynamo_export()'. Note that dynamo_export() is in preview. Please report errors with dynamo_export() as Github issues to https://github.com/pytorch/pytorch/issues.
  warnings.warn(
/sfs_cv/yhy3/conda/envs/trt_sd/lib/python3.8/site-packages/transformers/modeling_attn_mask_utils.py:86: TracerWarning: Converting a tensor to a Python boolean might cause the trace to be incorrect. We can't record the data flow of Python values, so this value will be treated as a constant in the future. This means that the trace might not generalize to other inputs!
  if input_shape[-1] > 1 or self.sliding_window is not None:
/sfs_cv/yhy3/conda/envs/trt_sd/lib/python3.8/site-packages/transformers/modeling_attn_mask_utils.py:162: TracerWarning: Converting a tensor to a Python boolean might cause the trace to be incorrect. We can't record the data flow of Python values, so this value will be treated as a constant in the future. This means that the trace might not generalize to other inputs!
  if past_key_values_length > 0:
Traceback (most recent call last):
  File "demo_txt2img_xl.py", line 135, in <module>
    demo.loadEngines(
  File "demo_txt2img_xl.py", line 59, in loadEngines
    self.base.loadEngines(engine_dir, framework_model_dir, onnx_dir, **kwargs)
  File "/sfs_cv/yhy3/project/tensorrt/TensorRT/demo/Diffusion/stable_diffusion_pipeline.py", line 457, in loadEngines
    obj.export_onnx(onnx_path[model_name], onnx_opt_path[model_name], onnx_opset, opt_image_height, opt_image_width, enable_lora_merge=do_lora_merge[model_name], static_shape=static_shape)
  File "/sfs_cv/yhy3/project/tensorrt/TensorRT/demo/Diffusion/models.py", line 413, in export_onnx
    export_onnx(self.get_model())
  File "/sfs_cv/yhy3/project/tensorrt/TensorRT/demo/Diffusion/models.py", line 398, in export_onnx
    torch.onnx.export(model,
  File "/sfs_cv/yhy3/conda/envs/trt_sd/lib/python3.8/site-packages/torch/onnx/utils.py", line 516, in export
    _export(
  File "/sfs_cv/yhy3/conda/envs/trt_sd/lib/python3.8/site-packages/torch/onnx/utils.py", line 1612, in _export
    graph, params_dict, torch_out = _model_to_graph(
  File "/sfs_cv/yhy3/conda/envs/trt_sd/lib/python3.8/site-packages/torch/onnx/utils.py", line 1138, in _model_to_graph
    graph = _optimize_graph(
  File "/sfs_cv/yhy3/conda/envs/trt_sd/lib/python3.8/site-packages/torch/onnx/utils.py", line 677, in _optimize_graph
    graph = _C._jit_pass_onnx(graph, operator_export_type)
  File "/sfs_cv/yhy3/conda/envs/trt_sd/lib/python3.8/site-packages/torch/onnx/utils.py", line 1956, in _run_symbolic_function
    return symbolic_fn(graph_context, *inputs, **attrs)
  File "/sfs_cv/yhy3/conda/envs/trt_sd/lib/python3.8/site-packages/torch/onnx/symbolic_helper.py", line 306, in wrapper
    return fn(g, *args, **kwargs)
  File "/sfs_cv/yhy3/conda/envs/trt_sd/lib/python3.8/site-packages/torch/onnx/symbolic_opset14.py", line 176, in scaled_dot_product_attention
    query_scaled = g.op("Mul", query, g.op("Sqrt", scale))
  File "/sfs_cv/yhy3/conda/envs/trt_sd/lib/python3.8/site-packages/torch/onnx/_internal/jit_utils.py", line 87, in op
    return _add_op(self, opname, *raw_args, outputs=outputs, **kwargs)
  File "/sfs_cv/yhy3/conda/envs/trt_sd/lib/python3.8/site-packages/torch/onnx/_internal/jit_utils.py", line 238, in _add_op
    inputs = [_const_if_tensor(graph_context, arg) for arg in args]
  File "/sfs_cv/yhy3/conda/envs/trt_sd/lib/python3.8/site-packages/torch/onnx/_internal/jit_utils.py", line 238, in <listcomp>
    inputs = [_const_if_tensor(graph_context, arg) for arg in args]
  File "/sfs_cv/yhy3/conda/envs/trt_sd/lib/python3.8/site-packages/torch/onnx/_internal/jit_utils.py", line 269, in _const_if_tensor
    return _add_op(graph_context, "onnx::Constant", value_z=arg)
  File "/sfs_cv/yhy3/conda/envs/trt_sd/lib/python3.8/site-packages/torch/onnx/_internal/jit_utils.py", line 246, in _add_op
    node = _create_node(
  File "/sfs_cv/yhy3/conda/envs/trt_sd/lib/python3.8/site-packages/torch/onnx/_internal/jit_utils.py", line 305, in _create_node
    _add_attribute(node, key, value, aten=aten)
  File "/sfs_cv/yhy3/conda/envs/trt_sd/lib/python3.8/site-packages/torch/onnx/_internal/jit_utils.py", line 356, in _add_attribute
    return getattr(node, f"{kind}_")(name, value)
TypeError: z_(): incompatible function arguments. The following argument types are supported:
    [...]

Invoked with: %338 : Tensor = onnx::Constant(), scope: transformers.models.clip.modeling_clip.CLIPTextModel::/transformers.models.clip.modeling_clip.CLIPTextTransformer::text_model/transformers.models.clip.modeling_clip.CLIPEncoder::encoder/transformers.models.clip.modeling_clip.CLIPEncoderLayer::layers.0/transformers.models.clip.modeling_clip.CLIPSdpaAttention::self_attn , 'value', 0.125 (Occurred when translating scaled_dot_product_attention).
```
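If it helps with the diagnosis: the trace points at torch/onnx/symbolic_opset14.py, where the float scale (0.125) that CLIPSdpaAttention passes to scaled_dot_product_attention is wrapped into an onnx::Constant whose value_z attribute only accepts a torch.Tensor. One workaround I have seen suggested elsewhere, though I have not verified it is the right fix for this demo, is to load the text encoder with eager attention so that scaled_dot_product_attention is never traced:

```python
from transformers import CLIPTextModel

# Workaround sketch (untested here): transformers >= 4.36 accepts
# attn_implementation="eager" in from_pretrained(). With eager attention the
# CLIP text encoder does not call scaled_dot_product_attention, so the
# problematic opset-14 symbolic is never reached during torch.onnx.export.
text_encoder = CLIPTextModel.from_pretrained(
    "pytorch_model/xl-1.0/XL_BASE/text_encoder",  # path from the log above
    attn_implementation="eager",
)
```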
Please help me check it out. Thanks.