System Info
optimum==1.19.0.dev0
torch==2.1.2
onnx==1.16.0
onnxruntime==1.18.0
cuda==11.8
optimum installed from the mht-sharma:add_llava branch
Who can help?
@mht-sharma @xenova
Information
[X] The official example scripts
[ ] My own modified scripts
Tasks
[ ] An officially supported task in the examples folder (such as GLUE/SQuAD, ...)
[X] My own task or dataset (give details below)
Reproduction (minimal, reproducible, runnable)
code: optimum-cli export onnx --model llava-hf/llava-1.5-7b-hf llava_onnx/ --task image-to-text-with-past --trust-remote-code
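(For reference, the same export can presumably also be triggered from Python; the sketch below is an assumption based on the usual optimum.exporters.onnx.main_export API, while the traceback that follows comes from the CLI command above.)

```python
# Hedged sketch: programmatic equivalent of the CLI command above,
# assuming the usual optimum.exporters.onnx.main_export signature.
from optimum.exporters.onnx import main_export

main_export(
    model_name_or_path="llava-hf/llava-1.5-7b-hf",
    output="llava_onnx/",
    task="image-to-text-with-past",
    trust_remote_code=True,
)
```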
error:
Traceback (most recent call last):
File "/home/user/anaconda3/envs/llava/lib/python3.10/site-packages/optimum/exporters/onnx/convert.py", line 577, in export_pytorch
onnx_export(
File "/home/user/anaconda3/envs/llava/lib/python3.10/site-packages/torch/onnx/utils.py", line 516, in export
_export(
File "/home/user/anaconda3/envs/llava/lib/python3.10/site-packages/torch/onnx/utils.py", line 1596, in _export
graph, params_dict, torch_out = _model_to_graph(
File "/home/user/anaconda3/envs/llava/lib/python3.10/site-packages/torch/onnx/utils.py", line 1135, in _model_to_graph
graph, params, torch_out, module = _create_jit_graph(model, args)
File "/home/user/anaconda3/envs/llava/lib/python3.10/site-packages/torch/onnx/utils.py", line 1011, in _create_jit_graph
graph, torch_out = _trace_and_get_graph_from_model(model, args)
File "/home/user/anaconda3/envs/llava/lib/python3.10/site-packages/torch/onnx/utils.py", line 915, in _trace_and_get_graph_from_model
trace_graph, torch_out, inputs_states = torch.jit._get_trace_graph(
File "/home/user/anaconda3/envs/llava/lib/python3.10/site-packages/torch/jit/_trace.py", line 1285, in _get_trace_graph
outs = ONNXTracedModule(
File "/home/user/anaconda3/envs/llava/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1518, in _wrapped_call_impl
return self._call_impl(*args, **kwargs)
File "/home/user/anaconda3/envs/llava/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1527, in _call_impl
return forward_call(*args, **kwargs)
File "/home/user/anaconda3/envs/llava/lib/python3.10/site-packages/torch/jit/_trace.py", line 133, in forward
graph, out = torch._C._create_graph_by_tracing(
File "/home/user/anaconda3/envs/llava/lib/python3.10/site-packages/torch/jit/_trace.py", line 124, in wrapper
outs.append(self.inner(*trace_inputs))
File "/home/user/anaconda3/envs/llava/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1518, in _wrapped_call_impl
return self._call_impl(*args, **kwargs)
File "/home/user/anaconda3/envs/llava/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1527, in _call_impl
return forward_call(*args, **kwargs)
File "/home/user/anaconda3/envs/llava/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1508, in _slow_forward
result = self.forward(*input, **kwargs)
File "/home/user/anaconda3/envs/llava/lib/python3.10/site-packages/optimum/exporters/onnx/model_patcher.py", line 589, in patched_forward
outputs = model.language_model(
File "/home/user/anaconda3/envs/llava/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1518, in _wrapped_call_impl
return self._call_impl(*args, **kwargs)
File "/home/user/anaconda3/envs/llava/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1527, in _call_impl
return forward_call(*args, **kwargs)
File "/home/user/anaconda3/envs/llava/lib/python3.10/site-packages/transformers/models/llama/modeling_llama.py", line 1183, in forward
outputs = self.model(
File "/home/user/anaconda3/envs/llava/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1518, in _wrapped_call_impl
return self._call_impl(*args, **kwargs)
File "/home/user/anaconda3/envs/llava/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1527, in _call_impl
return forward_call(*args, **kwargs)
File "/home/user/anaconda3/envs/llava/lib/python3.10/site-packages/transformers/models/llama/modeling_llama.py", line 1035, in forward
attention_mask = _prepare_4d_causal_attention_mask_for_sdpa(
File "/home/user/anaconda3/envs/llava/lib/python3.10/site-packages/transformers/modeling_attn_mask_utils.py", line 398, in _prepare_4d_causal_attention_mask_for_sdpa
expanded_4d_mask = attn_mask_converter.to_4d(
File "/home/user/anaconda3/envs/llava/lib/python3.10/site-packages/transformers/modeling_attn_mask_utils.py", line 137, in to_4d
expanded_attn_mask = causal_4d_mask.masked_fill(expanded_attn_mask.bool(), torch.finfo(dtype).min)
RuntimeError: The size of tensor a (4112) must match the size of tensor b (32) at non-singleton dimension 3
python-BaseException
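For context on the failing call: the exception is raised by masked_fill in to_4d, where the 4D causal mask and the expanded attention mask disagree on their last dimension (4112 vs 32), so the mask cannot be broadcast. A minimal, self-contained sketch of that mechanic, using the sizes from the error message (the actual batch/sequence layout inside the exporter may differ):

```python
import torch

# Illustrative shapes only: the last (key-length) dimension of the causal
# mask is 4112 while the expanded attention mask has 32, mirroring the
# mismatch reported in the traceback above.
causal_4d_mask = torch.zeros(1, 1, 32, 4112)
expanded_attn_mask = torch.zeros(1, 1, 32, 32)

try:
    causal_4d_mask.masked_fill(expanded_attn_mask.bool(), torch.finfo(torch.float32).min)
except RuntimeError as e:
    # Broadcasting fails because 4112 != 32 at a non-singleton dimension.
    print(e)
```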
Expected behavior
We would like this error to be resolved so that LLaVA can be successfully exported to ONNX.
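Once the export works, a sanity check along the following lines would be the expected outcome: every ONNX file written to llava_onnx/ loads in onnxruntime and reports its inputs (a sketch only; the exact set of component files the exporter produces for LLaVA is not assumed here).

```python
import os
import onnxruntime as ort

# Load every ONNX file in the export directory and print its declared
# inputs, as a basic "the exported model is usable" check.
for fname in sorted(os.listdir("llava_onnx")):
    if fname.endswith(".onnx"):
        session = ort.InferenceSession(
            os.path.join("llava_onnx", fname), providers=["CPUExecutionProvider"]
        )
        print(fname, [inp.name for inp in session.get_inputs()])
```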