justinchuby opened this issue 5 months ago
pyinstrument v4.6.2 | Recorded: 07:19:45 | Samples: 11044 | Duration: 11.784 | CPU time: 13.683
Program: /Users/justinc/Documents/GitHub/torch-onnx/venv/bin/optimum-cli export onnx --model openai/whisper-large-v3 whisper/ --no-post-process
11.784 export_pytorch optimum/exporters/onnx/convert.py:485
└─ 11.784 export optimum/exporters/onnx/convert.py:584
└─ 11.783 _torch_onnx_export torch_onnx/_patch.py:102
└─ 11.704 export torch_onnx/_core.py:793
├─ 7.431 export torch/export/__init__.py:73
│ [253 frames hidden] torch, contextlib, copy, dis, importl...
└─ 4.272 exported_program_to_ir torch_onnx/_core.py:618
├─ 3.060 wrapper torch/export/exported_program.py:80
│ [78 frames hidden] torch, <string>
├─ 0.604 _add_nodes torch_onnx/_core.py:486
│ └─ 0.594 _handle_call_function_node_with_lowering torch_onnx/_core.py:356
│ └─ 0.401 TracedOnnxFunction.__call__ ../../onnxscript/onnxscript/values.py:581
│ └─ 0.239 SymbolicTensor.aten_view ../../onnxscript/onnxscript/function_libs/torch_lib/ops/core.py:8740
│ └─ 0.144 Opset18.Cast ../../onnxscript/onnxscript/onnx_opset/_impl/opset13.py:241
│ └─ 0.141 Op.__call__ ../../onnxscript/onnxscript/values.py:291
│ └─ 0.140 OpRecorder.eval torch_onnx/_building.py:390
├─ 0.308 OnnxRegistry.from_torchlib torch_onnx/_registration.py:114
│ └─ 0.145 _get_overload torch_onnx/_registration.py:57
│ └─ 0.140 <module> torchvision/__init__.py:1
└─ 0.279 insert_type_promotion_nodes torch_onnx/_fx_passes.py:13
└─ 0.257 wrapper torch/onnx/_internal/diagnostics/infra/decorator.py:71
[13 frames hidden] torch
pyinstrument v4.6.2 | Recorded: 07:19:57 | Samples: 17575 | Duration: 18.621 | CPU time: 21.274
Program: /Users/justinc/Documents/GitHub/torch-onnx/venv/bin/optimum-cli export onnx --model openai/whisper-large-v3 whisper/ --no-post-process
18.621 export_pytorch optimum/exporters/onnx/convert.py:485
└─ 18.621 export optimum/exporters/onnx/convert.py:584
└─ 18.621 _torch_onnx_export torch_onnx/_patch.py:102
├─ 18.432 export torch_onnx/_core.py:793
│ ├─ 11.837 export torch/export/__init__.py:73
│ │ [272 frames hidden] torch, contextlib, copy, dis, optimum...
│ └─ 6.593 exported_program_to_ir torch_onnx/_core.py:618
│ ├─ 4.588 wrapper torch/export/exported_program.py:80
│ │ [76 frames hidden] torch, <string>
│ ├─ 1.147 _add_nodes torch_onnx/_core.py:486
│ │ └─ 1.129 _handle_call_function_node_with_lowering torch_onnx/_core.py:356
│ │ └─ 0.747 TracedOnnxFunction.__call__ ../../onnxscript/onnxscript/values.py:581
│ │ ├─ 0.472 SymbolicTensor.aten_view ../../onnxscript/onnxscript/function_libs/torch_lib/ops/core.py:8740
│ │ │ ├─ 0.267 Opset18.Cast ../../onnxscript/onnxscript/onnx_opset/_impl/opset13.py:241
│ │ │ │ └─ 0.260 Op.__call__ ../../onnxscript/onnxscript/values.py:291
│ │ │ │ └─ 0.254 OpRecorder.eval torch_onnx/_building.py:390
│ │ │ └─ 0.194 Opset18.Reshape ../../onnxscript/onnxscript/onnx_opset/_impl/opset14.py:876
│ │ │ └─ 0.189 Op.__call__ ../../onnxscript/onnxscript/values.py:291
│ │ └─ 0.192 SymbolicTensor.aten_clone ../../onnxscript/onnxscript/function_libs/torch_lib/ops/core.py:1687
│ │ └─ 0.189 Opset18.Identity ../../onnxscript/onnxscript/onnx_opset/_impl/opset16.py:240
│ └─ 0.674 insert_type_promotion_nodes torch_onnx/_fx_passes.py:13
│ └─ 0.636 wrapper torch/onnx/_internal/diagnostics/infra/decorator.py:71
│ [16 frames hidden] torch
└─ 0.188 ONNXProgram.save torch_onnx/_onnx_program.py:25
pyinstrument v4.6.2 | Recorded: 07:20:18 | Samples: 16959 | Duration: 17.888 | CPU time: 20.886
Program: /Users/justinc/Documents/GitHub/torch-onnx/venv/bin/optimum-cli export onnx --model openai/whisper-large-v3 whisper/ --no-post-process
17.887 export_pytorch optimum/exporters/onnx/convert.py:485
└─ 17.887 export optimum/exporters/onnx/convert.py:584
└─ 17.887 _torch_onnx_export torch_onnx/_patch.py:102
└─ 17.759 export torch_onnx/_core.py:793
├─ 11.605 export torch/export/__init__.py:73
│ [310 frames hidden] torch, contextlib, copy, dis, ast, op...
└─ 6.150 exported_program_to_ir torch_onnx/_core.py:618
├─ 4.584 wrapper torch/export/exported_program.py:80
│ [79 frames hidden] torch, <string>
├─ 0.912 _add_nodes torch_onnx/_core.py:486
│ └─ 0.894 _handle_call_function_node_with_lowering torch_onnx/_core.py:356
│ └─ 0.542 TracedOnnxFunction.__call__ ../../onnxscript/onnxscript/values.py:581
│ └─ 0.369 SymbolicTensor.aten_view ../../onnxscript/onnxscript/function_libs/torch_lib/ops/core.py:8740
│ └─ 0.234 Opset18.Cast ../../onnxscript/onnxscript/onnx_opset/_impl/opset13.py:241
│ └─ 0.230 Op.__call__ ../../onnxscript/onnxscript/values.py:291
│ └─ 0.227 OpRecorder.eval torch_onnx/_building.py:390
└─ 0.469 insert_type_promotion_nodes torch_onnx/_fx_passes.py:13
└─ 0.436 wrapper torch/onnx/_internal/diagnostics/infra/decorator.py:71
[13 frames hidden] torch
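For reference, the call trees above appear to come from pyinstrument (the v4.6.2 banner). Below is a minimal sketch of capturing such a profile with pyinstrument's Python API; `do_export()` is a hypothetical stand-in for the export call being measured:

```python
# Minimal sketch, assuming pyinstrument's Profiler API.
from pyinstrument import Profiler

def do_export():
    # Hypothetical stand-in for the export call being measured.
    sum(i * i for i in range(10**6))

profiler = Profiler()
profiler.start()
do_export()
profiler.stop()
profiler.print()  # renders a call tree like the ones above
```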
mprof run optimum-cli export onnx --model openai/whisper-large-v3 whisper/ --no-post-process
mprof: Sampling memory every 0.1s
running new process
Framework not specified. Using pt to export the model.
/Users/justinc/Documents/GitHub/torch-onnx/venv/lib/python3.11/site-packages/huggingface_hub/file_download.py:1132: FutureWarning: `resume_download` is deprecated and will be removed in version 1.0.0. Downloads always resume when possible. If you want to force a new download, use `force_download=True`.
warnings.warn(
Automatic task detection to automatic-speech-recognition-with-past (possible synonyms are: speech2seq-lm-with-past).
Special tokens have been added in the vocabulary, make sure the associated word embeddings are fine-tuned or trained.
Special tokens have been added in the vocabulary, make sure the associated word embeddings are fine-tuned or trained.
Using the export variant default. Available variants are:
- default: The default ONNX variant.
Some non-default generation parameters are set in the model config. These should go into a GenerationConfig file (https://huggingface.co/docs/transformers/generation_strategies#save-a-custom-decoding-strategy-with-your-model) instead. This warning will be raised to an exception in v4.41.
Non-default generation parameters: {'max_length': 448, 'begin_suppress_tokens': [220, 50257]}
Special tokens have been added in the vocabulary, make sure the associated word embeddings are fine-tuned or trained.
Special tokens have been added in the vocabulary, make sure the associated word embeddings are fine-tuned or trained.
***** Exporting submodel 1/3: WhisperEncoder *****
Using framework PyTorch: 2.3.1
Overriding 1 configuration item(s)
- use_cache -> False
Obtain model graph for `WhisperEncoder([...]` with `torch.export.export`...
Obtain model graph for `WhisperEncoder([...]` with `torch.export.export`... ✅
Translate the graph into ONNX...
aten::getitem is not found in this version of PyTorch.
/Users/justinc/Documents/GitHub/torch-onnx/src/torch_onnx/_registration.py:134: UserWarning: aten::getitem does not have a default overload or is not found. Ignoring.
warnings.warn(
Translate the graph into ONNX... ✅
The initializers have been removed from the model. This is destructive. Developers: Please implement ir.Model copy() and remove initializers on the copied model.
Filename: /Users/justinc/Documents/GitHub/torch-onnx/venv/lib/python3.11/site-packages/optimum/exporters/onnx/convert.py
Line # Mem usage Increment Occurrences Line Contents
=============================================================
583 6302.5 MiB 6302.5 MiB 1 @memory_profiler.profile
584 def export():
585 # export_options = torch.onnx.ExportOptions(dynamic_shapes=False)
586 # onnx_program = torch.onnx.dynamo_export(
587 # model,
588 # export_options = export_options,
589 # **dummy_inputs,
590 # )
591 # onnx_program.save(output.as_posix())
592 6430.7 MiB 128.2 MiB 2 onnx_export(
593 6302.5 MiB 0.0 MiB 1 model,
594 6302.5 MiB 0.0 MiB 1 (dummy_inputs,),
595 6302.5 MiB 0.0 MiB 1 f=output.as_posix(),
596 6302.5 MiB 0.0 MiB 1 input_names=input_names,
597 6302.5 MiB 0.0 MiB 1 output_names=output_names,
598 # dynamic_axes=dynamix_axes,
599 6302.5 MiB 0.0 MiB 1 do_constant_folding=do_constant_folding,
600 6302.5 MiB 0.0 MiB 1 opset_version=opset,
601 6302.5 MiB 0.0 MiB 1 export_params=False, # MARK
602 )
***** Exporting submodel 2/3: WhisperForConditionalGeneration *****
Using framework PyTorch: 2.3.1
Overriding 1 configuration item(s)
- use_cache -> True
Obtain model graph for `WhisperForConditionalGeneration([...]` with `torch.export.export`...
Obtain model graph for `WhisperForConditionalGeneration([...]` with `torch.export.export`... ✅
Translate the graph into ONNX...
aten::getitem is not found in this version of PyTorch.
/Users/justinc/Documents/GitHub/torch-onnx/src/torch_onnx/_registration.py:134: UserWarning: aten::getitem does not have a default overload or is not found. Ignoring.
warnings.warn(
Translate the graph into ONNX... ✅
The initializers have been removed from the model. This is destructive. Developers: Please implement ir.Model copy() and remove initializers on the copied model.
Filename: /Users/justinc/Documents/GitHub/torch-onnx/venv/lib/python3.11/site-packages/optimum/exporters/onnx/convert.py
Line # Mem usage Increment Occurrences Line Contents
=============================================================
583 6430.8 MiB 6430.8 MiB 1 @memory_profiler.profile
584 def export():
585 # export_options = torch.onnx.ExportOptions(dynamic_shapes=False)
586 # onnx_program = torch.onnx.dynamo_export(
587 # model,
588 # export_options = export_options,
589 # **dummy_inputs,
590 # )
591 # onnx_program.save(output.as_posix())
592 6535.2 MiB 104.4 MiB 2 onnx_export(
593 6430.8 MiB 0.0 MiB 1 model,
594 6430.8 MiB 0.0 MiB 1 (dummy_inputs,),
595 6430.8 MiB 0.0 MiB 1 f=output.as_posix(),
596 6430.8 MiB 0.0 MiB 1 input_names=input_names,
597 6430.8 MiB 0.0 MiB 1 output_names=output_names,
598 # dynamic_axes=dynamix_axes,
599 6430.8 MiB 0.0 MiB 1 do_constant_folding=do_constant_folding,
600 6430.8 MiB 0.0 MiB 1 opset_version=opset,
601 6430.8 MiB 0.0 MiB 1 export_params=False, # MARK
602 )
***** Exporting submodel 3/3: WhisperForConditionalGeneration *****
Using framework PyTorch: 2.3.1
Overriding 1 configuration item(s)
- use_cache -> True
Obtain model graph for `WhisperForConditionalGeneration([...]` with `torch.export.export`...
Obtain model graph for `WhisperForConditionalGeneration([...]` with `torch.export.export`... ✅
Translate the graph into ONNX...
aten::getitem is not found in this version of PyTorch.
/Users/justinc/Documents/GitHub/torch-onnx/src/torch_onnx/_registration.py:134: UserWarning: aten::getitem does not have a default overload or is not found. Ignoring.
warnings.warn(
Translate the graph into ONNX... ✅
The initializers have been removed from the model. This is destructive. Developers: Please implement ir.Model copy() and remove initializers on the copied model.
Filename: /Users/justinc/Documents/GitHub/torch-onnx/venv/lib/python3.11/site-packages/optimum/exporters/onnx/convert.py
Line # Mem usage Increment Occurrences Line Contents
=============================================================
583 6547.8 MiB 6547.8 MiB 1 @memory_profiler.profile
584 def export():
585 # export_options = torch.onnx.ExportOptions(dynamic_shapes=False)
586 # onnx_program = torch.onnx.dynamo_export(
587 # model,
588 # export_options = export_options,
589 # **dummy_inputs,
590 # )
591 # onnx_program.save(output.as_posix())
592 6585.6 MiB 37.7 MiB 2 onnx_export(
593 6547.8 MiB 0.0 MiB 1 model,
594 6547.8 MiB 0.0 MiB 1 (dummy_inputs,),
595 6547.8 MiB 0.0 MiB 1 f=output.as_posix(),
596 6547.8 MiB 0.0 MiB 1 input_names=input_names,
597 6547.8 MiB 0.0 MiB 1 output_names=output_names,
598 # dynamic_axes=dynamix_axes,
599 6547.8 MiB 0.0 MiB 1 do_constant_folding=do_constant_folding,
600 6547.8 MiB 0.0 MiB 1 opset_version=opset,
601 6547.8 MiB 0.0 MiB 1 export_params=False, # MARK
602 )
The ONNX export succeeded and the exported model was saved at: whisper
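The line-by-line memory tables come from memory_profiler (`mprof run` plus an `@memory_profiler.profile` decorator on a locally defined `export()`, as the listing shows). A minimal sketch of that instrumentation pattern; the allocation is a hypothetical stand-in for the real `onnx_export(...)` call:

```python
# Minimal sketch of the @memory_profiler.profile pattern in the tables above.
# Run under `mprof run this_script.py` to also get the sampled memory curve.
import memory_profiler

def run_export():
    @memory_profiler.profile
    def export():
        # Hypothetical stand-in for onnx_export(...): allocate ~100 MiB.
        buf = bytearray(100 * 1024 * 1024)
        return len(buf)

    return export()

if __name__ == "__main__":
    run_export()
```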
pyinstrument v4.6.2 | Recorded: 07:22:04 | Samples: 2717 | Duration: 17.402 | CPU time: 71.175
Program: /Users/justinc/Documents/GitHub/torch-onnx/venv/bin/optimum-cli export onnx --model openai/whisper-large-v3 whisper_onnx_export/ --no-post-process
17.402 export_pytorch optimum/exporters/onnx/convert.py:485
└─ 17.402 export optimum/exporters/onnx/convert.py:584
[50 frames hidden] optimum, torch, transformers, <built-in>
4.404 PyCapsule._jit_pass_onnx_graph_shape_type_inference <built-in>
3.594 PyCapsule._jit_pass_onnx_graph_shape_type_inference <built-in>
pyinstrument v4.6.2 | Recorded: 07:22:22 | Samples: 7318 | Duration: 31.116 | CPU time: 43.408
Program: /Users/justinc/Documents/GitHub/torch-onnx/venv/bin/optimum-cli export onnx --model openai/whisper-large-v3 whisper_onnx_export/ --no-post-process
31.116 export_pytorch optimum/exporters/onnx/convert.py:485
└─ 31.116 export optimum/exporters/onnx/convert.py:584
[51 frames hidden] optimum, torch, <built-in>, transformers
23.168 _optimize_graph torch/onnx/utils.py:574
├─ 13.022 PyCapsule._jit_pass_onnx_graph_shape_type_inference <built-in>
├─ 7.271 [self] torch/onnx/utils.py
pyinstrument v4.6.2 | Recorded: 07:22:53 | Samples: 6457 | Duration: 26.578 | CPU time: 36.811
Program: /Users/justinc/Documents/GitHub/torch-onnx/venv/bin/optimum-cli export onnx --model openai/whisper-large-v3 whisper_onnx_export/ --no-post-process
26.578 export_pytorch optimum/exporters/onnx/convert.py:485
└─ 26.578 export optimum/exporters/onnx/convert.py:584
[51 frames hidden] optimum, torch, <built-in>, transformers
19.628 _optimize_graph torch/onnx/utils.py:574
├─ 10.655 PyCapsule._jit_pass_onnx_graph_shape_type_inference <built-in>
├─ 6.286 [self] torch/onnx/utils.py
5.489 PyCapsule._jit_pass_onnx_graph_shape_type_inference <built-in>
mprof run optimum-cli export onnx --model openai/whisper-large-v3 whisper_onnx_export/ --no-post-process
mprof: Sampling memory every 0.1s
running new process
Framework not specified. Using pt to export the model.
/Users/justinc/Documents/GitHub/torch-onnx/venv/lib/python3.11/site-packages/huggingface_hub/file_download.py:1132: FutureWarning: `resume_download` is deprecated and will be removed in version 1.0.0. Downloads always resume when possible. If you want to force a new download, use `force_download=True`.
warnings.warn(
Automatic task detection to automatic-speech-recognition-with-past (possible synonyms are: speech2seq-lm-with-past).
Special tokens have been added in the vocabulary, make sure the associated word embeddings are fine-tuned or trained.
Special tokens have been added in the vocabulary, make sure the associated word embeddings are fine-tuned or trained.
Using the export variant default. Available variants are:
- default: The default ONNX variant.
Some non-default generation parameters are set in the model config. These should go into a GenerationConfig file (https://huggingface.co/docs/transformers/generation_strategies#save-a-custom-decoding-strategy-with-your-model) instead. This warning will be raised to an exception in v4.41.
Non-default generation parameters: {'max_length': 448, 'begin_suppress_tokens': [220, 50257]}
Special tokens have been added in the vocabulary, make sure the associated word embeddings are fine-tuned or trained.
Special tokens have been added in the vocabulary, make sure the associated word embeddings are fine-tuned or trained.
***** Exporting submodel 1/3: WhisperEncoder *****
Using framework PyTorch: 2.3.1
Overriding 1 configuration item(s)
- use_cache -> False
/Users/justinc/Documents/GitHub/torch-onnx/venv/lib/python3.11/site-packages/transformers/models/whisper/modeling_whisper.py:1159: TracerWarning: Converting a tensor to a Python boolean might cause the trace to be incorrect. We can't record the data flow of Python values, so this value will be treated as a constant in the future. This means that the trace might not generalize to other inputs!
if input_features.shape[-1] != expected_seq_length:
/Users/justinc/Documents/GitHub/torch-onnx/venv/lib/python3.11/site-packages/transformers/models/whisper/modeling_whisper.py:338: TracerWarning: Converting a tensor to a Python boolean might cause the trace to be incorrect. We can't record the data flow of Python values, so this value will be treated as a constant in the future. This means that the trace might not generalize to other inputs!
if attn_weights.size() != (bsz * self.num_heads, tgt_len, src_len):
/Users/justinc/Documents/GitHub/torch-onnx/venv/lib/python3.11/site-packages/transformers/models/whisper/modeling_whisper.py:377: TracerWarning: Converting a tensor to a Python boolean might cause the trace to be incorrect. We can't record the data flow of Python values, so this value will be treated as a constant in the future. This means that the trace might not generalize to other inputs!
if attn_output.size() != (bsz * self.num_heads, tgt_len, self.head_dim):
Filename: /Users/justinc/Documents/GitHub/torch-onnx/venv/lib/python3.11/site-packages/optimum/exporters/onnx/convert.py
Line # Mem usage Increment Occurrences Line Contents
=============================================================
583 6304.2 MiB 6304.2 MiB 1 @memory_profiler.profile
584 def export():
585 # export_options = torch.onnx.ExportOptions(dynamic_shapes=False)
586 # onnx_program = torch.onnx.dynamo_export(
587 # model,
588 # export_options = export_options,
589 # **dummy_inputs,
590 # )
591 # onnx_program.save(output.as_posix())
592 7520.5 MiB 1216.2 MiB 2 onnx_export(
593 6304.2 MiB 0.0 MiB 1 model,
594 6304.2 MiB 0.0 MiB 1 (dummy_inputs,),
595 6304.2 MiB 0.0 MiB 1 f=output.as_posix(),
596 6304.2 MiB 0.0 MiB 1 input_names=input_names,
597 6304.2 MiB 0.0 MiB 1 output_names=output_names,
598 # dynamic_axes=dynamix_axes,
599 6304.2 MiB 0.0 MiB 1 do_constant_folding=do_constant_folding,
600 6304.2 MiB 0.0 MiB 1 opset_version=opset,
601 6304.2 MiB 0.0 MiB 1 export_params=False, # MARK
602 )
***** Exporting submodel 2/3: WhisperForConditionalGeneration *****
Using framework PyTorch: 2.3.1
Overriding 1 configuration item(s)
- use_cache -> True
/Users/justinc/Documents/GitHub/torch-onnx/venv/lib/python3.11/site-packages/transformers/modeling_attn_mask_utils.py:86: TracerWarning: Converting a tensor to a Python boolean might cause the trace to be incorrect. We can't record the data flow of Python values, so this value will be treated as a constant in the future. This means that the trace might not generalize to other inputs!
if input_shape[-1] > 1 or self.sliding_window is not None:
/Users/justinc/Documents/GitHub/torch-onnx/venv/lib/python3.11/site-packages/transformers/modeling_attn_mask_utils.py:162: TracerWarning: Converting a tensor to a Python boolean might cause the trace to be incorrect. We can't record the data flow of Python values, so this value will be treated as a constant in the future. This means that the trace might not generalize to other inputs!
if past_key_values_length > 0:
/Users/justinc/Documents/GitHub/torch-onnx/venv/lib/python3.11/site-packages/transformers/models/whisper/modeling_whisper.py:345: TracerWarning: Converting a tensor to a Python boolean might cause the trace to be incorrect. We can't record the data flow of Python values, so this value will be treated as a constant in the future. This means that the trace might not generalize to other inputs!
if attention_mask.size() != (bsz, 1, tgt_len, src_len):
Filename: /Users/justinc/Documents/GitHub/torch-onnx/venv/lib/python3.11/site-packages/optimum/exporters/onnx/convert.py
Line # Mem usage Increment Occurrences Line Contents
=============================================================
583 7520.6 MiB 7520.6 MiB 1 @memory_profiler.profile
584 def export():
585 # export_options = torch.onnx.ExportOptions(dynamic_shapes=False)
586 # onnx_program = torch.onnx.dynamo_export(
587 # model,
588 # export_options = export_options,
589 # **dummy_inputs,
590 # )
591 # onnx_program.save(output.as_posix())
592 7840.1 MiB 319.5 MiB 2 onnx_export(
593 7520.6 MiB 0.0 MiB 1 model,
594 7520.6 MiB 0.0 MiB 1 (dummy_inputs,),
595 7520.6 MiB 0.0 MiB 1 f=output.as_posix(),
596 7520.6 MiB 0.0 MiB 1 input_names=input_names,
597 7520.6 MiB 0.0 MiB 1 output_names=output_names,
598 # dynamic_axes=dynamix_axes,
599 7520.6 MiB 0.0 MiB 1 do_constant_folding=do_constant_folding,
600 7520.6 MiB 0.0 MiB 1 opset_version=opset,
601 7520.6 MiB 0.0 MiB 1 export_params=False, # MARK
602 )
***** Exporting submodel 3/3: WhisperForConditionalGeneration *****
Using framework PyTorch: 2.3.1
Overriding 1 configuration item(s)
- use_cache -> True
/Users/justinc/Documents/GitHub/torch-onnx/venv/lib/python3.11/site-packages/transformers/models/whisper/modeling_whisper.py:300: TracerWarning: Converting a tensor to a Python boolean might cause the trace to be incorrect. We can't record the data flow of Python values, so this value will be treated as a constant in the future. This means that the trace might not generalize to other inputs!
and past_key_value[0].shape[2] == key_value_states.shape[1]
Filename: /Users/justinc/Documents/GitHub/torch-onnx/venv/lib/python3.11/site-packages/optimum/exporters/onnx/convert.py
Line # Mem usage Increment Occurrences Line Contents
=============================================================
583 7840.1 MiB 7840.1 MiB 1 @memory_profiler.profile
584 def export():
585 # export_options = torch.onnx.ExportOptions(dynamic_shapes=False)
586 # onnx_program = torch.onnx.dynamo_export(
587 # model,
588 # export_options = export_options,
589 # **dummy_inputs,
590 # )
591 # onnx_program.save(output.as_posix())
592 7863.9 MiB 23.8 MiB 2 onnx_export(
593 7840.1 MiB 0.0 MiB 1 model,
594 7840.1 MiB 0.0 MiB 1 (dummy_inputs,),
595 7840.1 MiB 0.0 MiB 1 f=output.as_posix(),
596 7840.1 MiB 0.0 MiB 1 input_names=input_names,
597 7840.1 MiB 0.0 MiB 1 output_names=output_names,
598 # dynamic_axes=dynamix_axes,
599 7840.1 MiB 0.0 MiB 1 do_constant_folding=do_constant_folding,
600 7840.1 MiB 0.0 MiB 1 opset_version=opset,
601 7840.1 MiB 0.0 MiB 1 export_params=False, # MARK
602 )
The ONNX export succeeded and the exported model was saved at: whisper_onnx_export
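For comparison, the call profiled in this section is optimum's `onnx_export`, which appears to wrap the TorchScript-based `torch.onnx.export`, invoked with `export_params=False` (the `# MARK` line) so parameters are not serialized into the output file. A minimal self-contained sketch of that call shape; the toy module, shapes, and file name are hypothetical:

```python
import torch

class Toy(torch.nn.Module):
    def forward(self, x):
        return torch.relu(x)

torch.onnx.export(
    Toy(),
    (torch.randn(2, 3),),
    "toy.onnx",
    input_names=["x"],
    output_names=["y"],
    do_constant_folding=True,
    opset_version=17,
    export_params=False,  # as in the profiled run: parameters are omitted
)
```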
mprof run optimum-cli export onnx --model openai/whisper-large-v3 whisper_fake/ --no-post-process
mprof: Sampling memory every 0.1s
running new process
Framework not specified. Using pt to export the model.
/Users/justinc/Documents/GitHub/torch-onnx/venv/lib/python3.11/site-packages/huggingface_hub/file_download.py:1132: FutureWarning: `resume_download` is deprecated and will be removed in version 1.0.0. Downloads always resume when possible. If you want to force a new download, use `force_download=True`.
warnings.warn(
Automatic task detection to automatic-speech-recognition-with-past (possible synonyms are: speech2seq-lm-with-past).
Special tokens have been added in the vocabulary, make sure the associated word embeddings are fine-tuned or trained.
Special tokens have been added in the vocabulary, make sure the associated word embeddings are fine-tuned or trained.
Using the export variant default. Available variants are:
- default: The default ONNX variant.
Some non-default generation parameters are set in the model config. These should go into a GenerationConfig file (https://huggingface.co/docs/transformers/generation_strategies#save-a-custom-decoding-strategy-with-your-model) instead. This warning will be raised to an exception in v4.41.
Non-default generation parameters: {'max_length': 448, 'begin_suppress_tokens': [220, 50257]}
Special tokens have been added in the vocabulary, make sure the associated word embeddings are fine-tuned or trained.
Special tokens have been added in the vocabulary, make sure the associated word embeddings are fine-tuned or trained.
***** Exporting submodel 1/3: WhisperEncoder *****
Using framework PyTorch: 2.3.1
Overriding 1 configuration item(s)
- use_cache -> False
Obtain model graph for `WhisperEncoder([...]` with `torch.export.export`...
Obtain model graph for `WhisperEncoder([...]` with `torch.export.export`... ✅
Translate the graph into ONNX...
aten::getitem is not found in this version of PyTorch.
/Users/justinc/Documents/GitHub/torch-onnx/src/torch_onnx/_registration.py:134: UserWarning: aten::getitem does not have a default overload or is not found. Ignoring.
warnings.warn(
Translate the graph into ONNX... ✅
The initializers have been removed from the model. This is destructive. Developers: Please implement ir.Model copy() and remove initializers on the copied model.
Filename: /Users/justinc/Documents/GitHub/torch-onnx/venv/lib/python3.11/site-packages/optimum/exporters/onnx/convert.py
Line # Mem usage Increment Occurrences Line Contents
=============================================================
583 429.2 MiB 429.2 MiB 1 @memory_profiler.profile
584 def export():
585 # export_options = torch.onnx.ExportOptions(dynamic_shapes=False)
586 # onnx_program = torch.onnx.dynamo_export(
587 # model,
588 # export_options = export_options,
589 # **dummy_inputs,
590 # )
591 # onnx_program.save(output.as_posix())
592 541.5 MiB 112.4 MiB 2 onnx_export(
593 429.2 MiB 0.0 MiB 1 model,
594 429.2 MiB 0.0 MiB 1 (dummy_inputs,),
595 429.2 MiB 0.0 MiB 1 f=output.as_posix(),
596 429.2 MiB 0.0 MiB 1 input_names=input_names,
597 429.2 MiB 0.0 MiB 1 output_names=output_names,
598 # dynamic_axes=dynamix_axes,
599 429.2 MiB 0.0 MiB 1 do_constant_folding=do_constant_folding,
600 429.2 MiB 0.0 MiB 1 opset_version=opset,
601 429.2 MiB 0.0 MiB 1 export_params=False, # MARK
602 )
***** Exporting submodel 2/3: WhisperForConditionalGeneration *****
Using framework PyTorch: 2.3.1
Overriding 1 configuration item(s)
- use_cache -> True
Obtain model graph for `WhisperForConditionalGeneration([...]` with `torch.export.export`...
Obtain model graph for `WhisperForConditionalGeneration([...]` with `torch.export.export`... ✅
Translate the graph into ONNX...
aten::getitem is not found in this version of PyTorch.
/Users/justinc/Documents/GitHub/torch-onnx/src/torch_onnx/_registration.py:134: UserWarning: aten::getitem does not have a default overload or is not found. Ignoring.
warnings.warn(
Translate the graph into ONNX... ✅
The initializers have been removed from the model. This is destructive. Developers: Please implement ir.Model copy() and remove initializers on the copied model.
Filename: /Users/justinc/Documents/GitHub/torch-onnx/venv/lib/python3.11/site-packages/optimum/exporters/onnx/convert.py
Line # Mem usage Increment Occurrences Line Contents
=============================================================
583 541.6 MiB 541.6 MiB 1 @memory_profiler.profile
584 def export():
585 # export_options = torch.onnx.ExportOptions(dynamic_shapes=False)
586 # onnx_program = torch.onnx.dynamo_export(
587 # model,
588 # export_options = export_options,
589 # **dummy_inputs,
590 # )
591 # onnx_program.save(output.as_posix())
592 635.7 MiB 94.1 MiB 2 onnx_export(
593 541.6 MiB 0.0 MiB 1 model,
594 541.6 MiB 0.0 MiB 1 (dummy_inputs,),
595 541.6 MiB 0.0 MiB 1 f=output.as_posix(),
596 541.6 MiB 0.0 MiB 1 input_names=input_names,
597 541.6 MiB 0.0 MiB 1 output_names=output_names,
598 # dynamic_axes=dynamix_axes,
599 541.6 MiB 0.0 MiB 1 do_constant_folding=do_constant_folding,
600 541.6 MiB 0.0 MiB 1 opset_version=opset,
601 541.6 MiB 0.0 MiB 1 export_params=False, # MARK
602 )
***** Exporting submodel 3/3: WhisperForConditionalGeneration *****
Using framework PyTorch: 2.3.1
Overriding 1 configuration item(s)
- use_cache -> True
Obtain model graph for `WhisperForConditionalGeneration([...]` with `torch.export.export`...
Obtain model graph for `WhisperForConditionalGeneration([...]` with `torch.export.export`... ✅
Translate the graph into ONNX...
aten::getitem is not found in this version of PyTorch.
/Users/justinc/Documents/GitHub/torch-onnx/src/torch_onnx/_registration.py:134: UserWarning: aten::getitem does not have a default overload or is not found. Ignoring.
warnings.warn(
Translate the graph into ONNX... ✅
The initializers have been removed from the model. This is destructive. Developers: Please implement ir.Model copy() and remove initializers on the copied model.
Filename: /Users/justinc/Documents/GitHub/torch-onnx/venv/lib/python3.11/site-packages/optimum/exporters/onnx/convert.py
Line # Mem usage Increment Occurrences Line Contents
=============================================================
583 635.7 MiB 635.7 MiB 1 @memory_profiler.profile
584 def export():
585 # export_options = torch.onnx.ExportOptions(dynamic_shapes=False)
586 # onnx_program = torch.onnx.dynamo_export(
587 # model,
588 # export_options = export_options,
589 # **dummy_inputs,
590 # )
591 # onnx_program.save(output.as_posix())
592 670.1 MiB 34.4 MiB 2 onnx_export(
593 635.7 MiB 0.0 MiB 1 model,
594 635.7 MiB 0.0 MiB 1 (dummy_inputs,),
595 635.7 MiB 0.0 MiB 1 f=output.as_posix(),
596 635.7 MiB 0.0 MiB 1 input_names=input_names,
597 635.7 MiB 0.0 MiB 1 output_names=output_names,
598 # dynamic_axes=dynamix_axes,
599 635.7 MiB 0.0 MiB 1 do_constant_folding=do_constant_folding,
600 635.7 MiB 0.0 MiB 1 opset_version=opset,
601 635.7 MiB 0.0 MiB 1 export_params=False, # MARK
602 )
The ONNX export succeeded and the exported model was saved at: whisper_fake
Attachment: mprofile_20240625071007.txt