Open scuizhibin opened 2 days ago
Hi @scuizhibin, yes, there are currently some compatibility issues between HuggingFace and torch.onnx due to recent HF 4.45 changes. A workaround is to manually switch the attention implementation to eager; see https://github.com/huggingface/transformers/blob/v4.45.1/src/transformers/modeling_utils.py#L3105-L3106
We're investigating whether there is a workaround we can apply on the TRT-LLM side.
System info: GPU: 3090, GPU driver: 550.107.02, Ubuntu 22.04
Using the multimodal examples: python build_visual_engine.py --model_type phi-3-vision --model_path tmp/hf_models/${MODEL_NAME}
Error: UnsupportedOperatorError: ONNX export failed on an operator with unrecognized namespace flash_attn::_flash_attn_forward. If you are trying to export a custom operator, make sure you registered it with the right domain and version.