microsoft / Olive

Olive: Simplify ML Model Finetuning, Conversion, Quantization, and Optimization for CPUs, GPUs and NPUs.
https://microsoft.github.io/Olive/
MIT License

Whisper with DirectML EP not working: [ONNXRuntimeError] : 9 : NOT_IMPLEMENTED : Could not find an implementation for DecoderMaskedMultiHeadAttention(1) node with name 'Attention_0' #1213

Open WA225 opened 5 months ago

WA225 commented 5 months ago

Describe the bug
I am trying to run Whisper on an AMD Radeon 780M Graphics using the DirectML EP, but it fails with the NOT_IMPLEMENTED error below.

To Reproduce
python -m pip install onnxruntime-directml
python -m olive.workflows.run --config whisper_dml_fp32.json --setup
python -m pip install onnxruntime_extensions
python -m olive.workflows.run --config whisper_dml_fp32.json

Olive config
whisper_dml_fp32.json:
{
    "input_model": {
        "type": "PyTorchModel",
        "config": {
            "model_script": "code/user_script.py",
            "script_dir": "code",
            "hf_config": {
                "model_class": "WhisperForConditionalGeneration",
                "model_name": "openai/whisper-tiny.en",
                "components": [
                    {
                        "name": "encoder_decoder_init",
                        "io_config": "get_encdec_io_config",
                        "component_func": "get_encoder_decoder_init",
                        "dummy_inputs_func": "encoder_decoder_init_dummy_inputs"
                    },
                    {
                        "name": "decoder",
                        "io_config": "get_dec_io_config",
                        "component_func": "get_decoder",
                        "dummy_inputs_func": "decoder_dummy_inputs"
                    }
                ],
                "from_pretrained_args": { "attn_implementation": "eager" }
            }
        }
    },
    "systems": {
        "local_system": {
            "type": "LocalSystem",
            "config": {
                "accelerators": [
                    { "device": "gpu", "execution_providers": [ "DmlExecutionProvider" ] }
                ]
            }
        }
    },
    "evaluators": {
        "common_evaluator": {
            "metrics": [
                {
                    "name": "latency",
                    "type": "latency",
                    "sub_types": [ { "name": "avg", "priority": 1 } ],
                    "user_config": {
                        "user_script": "code/user_script.py",
                        "script_dir": "code",
                        "data_dir": "data",
                        "dataloader_func": "whisper_dataloader",
                        "func_kwargs": {
                            "dataloader_func": {
                                "model_name": "openai/whisper-tiny.en",
                                "use_audio_decoder": true
                            }
                        }
                    }
                }
            ]
        }
    },
    "passes": {
        "conversion": { "type": "OnnxConversion", "config": { "target_opset": 17 } },
        "transformers_optimization": {
            "type": "OrtTransformersOptimization",
            "config": {
                "optimization_options": { "use_multi_head_attention": true },
                "use_gpu": true
            }
        },
        "insert_beam_search": {
            "type": "InsertBeamSearch",
            "config": {
                "use_forced_decoder_ids": false,
                "use_logits_processor": false,
                "fp16": false
            }
        },
        "prepost": {
            "type": "AppendPrePostProcessingOps",
            "config": {
                "tool_command": "whisper",
                "tool_command_args": {
                    "model_name": "openai/whisper-tiny.en",
                    "use_audio_decoder": true
                },
                "target_opset": 17
            }
        }
    },
    "engine": {
        "log_severity_level": 0,
        "host": "local_system",
        "target": "local_system",
        "evaluator": "common_evaluator",
        "evaluate_input_model": false,
        "clean_cache": false,
        "cache_dir": "cache",
        "output_dir": "models",
        "output_name": "whisper_dml_fp32"
    }
}

Olive logs
After reaching the step where it runs Olive on gpu-dml, the workflow fails with this error:
onnxruntime.capi.onnxruntime_pybind11_state.NotImplemented: [ONNXRuntimeError] : 9 : NOT_IMPLEMENTED : Could not find an implementation for DecoderMaskedMultiHeadAttention(1) node with name 'Attention_0'

Other information

jambayk commented 5 months ago

Hi,

Could you try the workflow again after adding "use_gpu": false under insert_beam_search.config? The whisper example has not been tested with the DML EP, so the insert beam search pass assumes it is only run with the CPU or CUDA EP.

This part adds DecoderMaskedMultiHeadAttention to the model, but the operator doesn't appear to be implemented for the DML EP: https://github.com/microsoft/Olive/blob/7fa2c4138f085c339b9c5da5d4a57fadd2de104a/olive/passes/onnx/insert_beam_search.py#L272
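The suggested config change can be sketched in Python as below. This is only an illustration: it assumes the config file is the `whisper_dml_fp32.json` posted in the issue, and the helper names `disable_beam_search_gpu` and `patch_config_file` are hypothetical. It edits the JSON only and does not call Olive.

```python
import json
from pathlib import Path


def disable_beam_search_gpu(config: dict) -> dict:
    """Return a copy of an Olive workflow config with
    passes.insert_beam_search.config.use_gpu set to False."""
    patched = json.loads(json.dumps(config))  # simple deep copy for JSON data
    pass_config = patched["passes"]["insert_beam_search"].setdefault("config", {})
    pass_config["use_gpu"] = False
    return patched


def patch_config_file(path: str) -> None:
    """Rewrite the config file in place with the patched pass settings."""
    config_path = Path(path)
    config = json.loads(config_path.read_text(encoding="utf-8"))
    patched = disable_beam_search_gpu(config)
    config_path.write_text(json.dumps(patched, indent=4), encoding="utf-8")
```

Calling `patch_config_file("whisper_dml_fp32.json")` from the example directory and then re-running the workflow command would apply the change. Editing the file by hand is equivalent.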

WA225 commented 5 months ago

Thank you for your reply. This fixed the NOT_IMPLEMENTED error. However, I still cannot get the correct behavior, even though I no longer get any error messages. After it starts evaluating the model, it fails silently. The last logger output I get is the following:

[2024-06-27 10:24:45,683] [DEBUG] [ort_inference.py:72:get_ort_inference_session] inference_settings: {'execution_provider': ['DmlExecutionProvider'], 'provider_options': None}
[2024-06-27 10:24:45,683] [DEBUG] [ort_inference.py:111:get_ort_inference_session] Normalized providers: ['DmlExecutionProvider'], provider_options: [{}]

Any idea why this happens? Could it be failing when trying to create the InferenceSession? https://github.com/microsoft/Olive/blob/b59ef7d6d1384b822c6e8175177cf6a9b1aacdf0/olive/common/ort_inference.py#L118
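One way to check whether session creation is the failing step is to build the session outside Olive with verbose ONNX Runtime logging. The sketch below is a standalone probe, not part of Olive. The helper names `unavailable_providers` and `probe_session` are hypothetical, and the model path is assumed to point at the `model_with_beam_search.onnx` written to the cache directory. It also assumes `onnxruntime-directml` and `onnxruntime_extensions` are installed, since the pre/post-processing pass adds `ai.onnx.contrib` custom ops.

```python
def unavailable_providers(requested, available):
    """Return the requested execution providers that ORT does not report as available."""
    return [ep for ep in requested if ep not in available]


def probe_session(model_path, providers=("DmlExecutionProvider",)):
    """Try to create an InferenceSession with verbose logging and report its providers."""
    # Imported lazily so the helper above can be used without ORT installed.
    import onnxruntime as ort
    from onnxruntime_extensions import get_library_path

    missing = unavailable_providers(providers, ort.get_available_providers())
    if missing:
        raise RuntimeError(f"Execution providers not available: {missing}")

    options = ort.SessionOptions()
    options.log_severity_level = 0  # 0 = VERBOSE
    # Register the extensions library so the custom pre/post-processing ops resolve.
    options.register_custom_ops_library(get_library_path())

    session = ort.InferenceSession(
        model_path, sess_options=options, providers=list(providers)
    )
    # get_providers() lists the providers the session was actually created with.
    return session.get_providers()
```

If the Python process exits before `probe_session` returns, that would suggest the crash happens inside session creation rather than in Olive's evaluator code.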

jambayk commented 5 months ago

Can you paste a dump of the full log?

WA225 commented 5 months ago

Sure. Here is the full log:

[2024-06-27 14:46:32,154] [INFO] [run.py:138:run_engine] Running workflow default_workflow
[2024-06-27 14:46:32,169] [INFO] [engine.py:986:save_olive_config] Saved Olive config to cache\default_workflow\olive_config.json
[2024-06-27 14:46:32,169] [DEBUG] [run.py:179:run_engine] Registering pass OnnxConversion
[2024-06-27 14:46:32,182] [DEBUG] [run.py:179:run_engine] Registering pass OrtTransformersOptimization
[2024-06-27 14:46:32,185] [DEBUG] [run.py:179:run_engine] Registering pass InsertBeamSearch
[2024-06-27 14:46:32,186] [DEBUG] [run.py:179:run_engine] Registering pass AppendPrePostProcessingOps
[2024-06-27 14:46:32,188] [DEBUG] [accelerator_creator.py:130:_fill_accelerators] The accelerator device and execution providers are specified, skipping deduce.
[2024-06-27 14:46:32,188] [DEBUG] [accelerator_creator.py:169:_check_execution_providers] Supported execution providers for device gpu: ['DmlExecutionProvider', 'CPUExecutionProvider']
[2024-06-27 14:46:32,188] [DEBUG] [accelerator_creator.py:199:create_accelerators] Initial accelerators and execution providers: {'gpu': ['DmlExecutionProvider']}
[2024-06-27 14:46:32,188] [INFO] [accelerator_creator.py:224:create_accelerators] Running workflow on accelerator specs: gpu-dml
[2024-06-27 14:46:32,188] [DEBUG] [run.py:235:run_engine] Pass OnnxConversion already registered
[2024-06-27 14:46:32,188] [DEBUG] [run.py:235:run_engine] Pass OrtTransformersOptimization already registered
[2024-06-27 14:46:32,188] [DEBUG] [run.py:235:run_engine] Pass InsertBeamSearch already registered
[2024-06-27 14:46:32,188] [DEBUG] [run.py:235:run_engine] Pass AppendPrePostProcessingOps already registered
[2024-06-27 14:46:32,188] [INFO] [engine.py:109:initialize] Using cache directory: cache\default_workflow
[2024-06-27 14:46:32,197] [INFO] [engine.py:265:run] Running Olive on accelerator: gpu-dml
[2024-06-27 14:46:32,198] [INFO] [engine.py:1085:_create_system] Creating target system ...
[2024-06-27 14:46:32,198] [DEBUG] [engine.py:1081:create_system] create native OliveSystem SystemType.Local
[2024-06-27 14:46:32,198] [INFO] [engine.py:1088:_create_system] Target system created in 0.000000 seconds
[2024-06-27 14:46:32,199] [INFO] [engine.py:1097:_create_system] Creating host system ...
[2024-06-27 14:46:32,199] [DEBUG] [engine.py:1081:create_system] create native OliveSystem SystemType.Local
[2024-06-27 14:46:32,199] [INFO] [engine.py:1100:_create_system] Host system created in 0.000000 seconds
[2024-06-27 14:46:32,237] [DEBUG] [engine.py:711:_cache_model] Cached model df880b77 to cache\default_workflow\models\df880b77.json
[2024-06-27 14:46:32,237] [DEBUG] [engine.py:338:run_accelerator] Running Olive in no-search mode ...
[2024-06-27 14:46:32,237] [DEBUG] [engine.py:430:run_no_search] Running ['conversion', 'transformers_optimization', 'insert_beam_search', 'prepost'] with no search ...
[2024-06-27 14:46:32,237] [INFO] [engine.py:867:_run_pass] Running pass conversion:OnnxConversion
[2024-06-27 14:46:32,237] [DEBUG] [resource_path.py:156:create_resource_path] Resource path code/user_script.py is inferred to be of type file.
[2024-06-27 14:46:32,237] [DEBUG] [resource_path.py:156:create_resource_path] Resource path code is inferred to be of type folder.
[2024-06-27 14:46:32,237] [DEBUG] [resource_path.py:156:create_resource_path] Resource path code is inferred to be of type folder.
[2024-06-27 14:46:32,247] [DEBUG] [resource_path.py:156:create_resource_path] Resource path code/user_script.py is inferred to be of type file.
C:\anaconda3\envs\olv-whisper\Lib\site-packages\huggingface_hub\file_download.py:1132: FutureWarning: resume_download is deprecated and will be removed in version 1.0.0. Downloads always resume when possible. If you want to force a new download, use force_download=True.
  warnings.warn(
[2024-06-27 14:46:32,460] [DEBUG] [resource_path.py:156:create_resource_path] Resource path C:\Olive-main\examples\whisper\code is inferred to be of type folder.
[2024-06-27 14:46:32,462] [DEBUG] [resource_path.py:156:create_resource_path] Resource path C:\Olive-main\examples\whisper\code\user_script.py is inferred to be of type file.
[2024-06-27 14:46:32,494] [INFO] [hf_config.py:112:load_hf_model] Loading Huggingface model from openai/whisper-tiny.en
[2024-06-27 14:46:34,231] [DEBUG] [resource_path.py:156:create_resource_path] Resource path C:\Olive-main\examples\whisper\code is inferred to be of type folder.
[2024-06-27 14:46:34,231] [DEBUG] [resource_path.py:156:create_resource_path] Resource path C:\Olive-main\examples\whisper\code\user_script.py is inferred to be of type file.
C:\anaconda3\envs\olv-whisper\Lib\site-packages\huggingface_hub\file_download.py:1132: FutureWarning: resume_download is deprecated and will be removed in version 1.0.0. Downloads always resume when possible. If you want to force a new download, use force_download=True.
  warnings.warn(
[2024-06-27 14:46:34,362] [DEBUG] [resource_path.py:156:create_resource_path] Resource path C:\Olive-main\examples\whisper\code is inferred to be of type folder.
[2024-06-27 14:46:34,365] [DEBUG] [resource_path.py:156:create_resource_path] Resource path C:\Olive-main\examples\whisper\code\user_script.py is inferred to be of type file.
[2024-06-27 14:46:34,368] [DEBUG] [dummy_inputs.py:45:get_dummy_inputs] Using dummy_inputs_func to get dummy inputs
[2024-06-27 14:46:34,482] [DEBUG] [pytorch.py:262:get_user_io_config] Calling get_encdec_io_config to get io_config
[2024-06-27 14:46:35,607] [DEBUG] [conversion.py:234:_export_pytorch_model] Converting model on device cpu with dtype None.
C:\anaconda3\envs\olv-whisper\Lib\site-packages\transformers\models\whisper\modeling_whisper.py:1159: TracerWarning: Converting a tensor to a Python boolean might cause the trace to be incorrect. We can't record the data flow of Python values, so this value will be treated as a constant in the future. This means that the trace might not generalize to other inputs!
  if input_features.shape[-1] != expected_seq_length:
C:\anaconda3\envs\olv-whisper\Lib\site-packages\transformers\models\whisper\modeling_whisper.py:338: TracerWarning: Converting a tensor to a Python boolean might cause the trace to be incorrect. We can't record the data flow of Python values, so this value will be treated as a constant in the future. This means that the trace might not generalize to other inputs!
  if attn_weights.size() != (bsz * self.num_heads, tgt_len, src_len):
C:\anaconda3\envs\olv-whisper\Lib\site-packages\transformers\models\whisper\modeling_whisper.py:377: TracerWarning: Converting a tensor to a Python boolean might cause the trace to be incorrect. We can't record the data flow of Python values, so this value will be treated as a constant in the future. This means that the trace might not generalize to other inputs!
  if attn_output.size() != (bsz * self.num_heads, tgt_len, self.head_dim):
C:\anaconda3\envs\olv-whisper\Lib\site-packages\transformers\modeling_attn_mask_utils.py:86: TracerWarning: Converting a tensor to a Python boolean might cause the trace to be incorrect. We can't record the data flow of Python values, so this value will be treated as a constant in the future. This means that the trace might not generalize to other inputs!
  if input_shape[-1] > 1 or self.sliding_window is not None:
C:\anaconda3\envs\olv-whisper\Lib\site-packages\transformers\modeling_attn_mask_utils.py:162: TracerWarning: Converting a tensor to a Python boolean might cause the trace to be incorrect. We can't record the data flow of Python values, so this value will be treated as a constant in the future. This means that the trace might not generalize to other inputs!
  if past_key_values_length > 0:
C:\anaconda3\envs\olv-whisper\Lib\site-packages\transformers\models\whisper\modeling_whisper.py:345: TracerWarning: Converting a tensor to a Python boolean might cause the trace to be incorrect. We can't record the data flow of Python values, so this value will be treated as a constant in the future. This means that the trace might not generalize to other inputs!
  if attention_mask.size() != (bsz, 1, tgt_len, src_len):
[2024-06-27 14:46:39,115] [DEBUG] [resource_path.py:156:create_resource_path] Resource path C:\Olive-main\examples\whisper\code is inferred to be of type folder.
[2024-06-27 14:46:39,115] [DEBUG] [resource_path.py:156:create_resource_path] Resource path C:\Olive-main\examples\whisper\code\user_script.py is inferred to be of type file.
[2024-06-27 14:46:39,254] [DEBUG] [resource_path.py:156:create_resource_path] Resource path C:\Olive-main\examples\whisper\code is inferred to be of type folder.
[2024-06-27 14:46:39,262] [DEBUG] [resource_path.py:156:create_resource_path] Resource path C:\Olive-main\examples\whisper\code\user_script.py is inferred to be of type file.
[2024-06-27 14:46:39,264] [DEBUG] [dummy_inputs.py:45:get_dummy_inputs] Using dummy_inputs_func to get dummy inputs
[2024-06-27 14:46:39,435] [DEBUG] [pytorch.py:262:get_user_io_config] Calling get_dec_io_config to get io_config
[2024-06-27 14:46:39,559] [DEBUG] [conversion.py:234:_export_pytorch_model] Converting model on device cpu with dtype None.
C:\anaconda3\envs\olv-whisper\Lib\site-packages\transformers\models\whisper\modeling_whisper.py:300: TracerWarning: Converting a tensor to a Python boolean might cause the trace to be incorrect. We can't record the data flow of Python values, so this value will be treated as a constant in the future. This means that the trace might not generalize to other inputs!
  and past_key_value[0].shape[2] == key_value_states.shape[1]
[2024-06-27 14:46:41,895] [INFO] [engine.py:954:_run_pass] Pass conversion:OnnxConversion finished in 9.657734 seconds
[2024-06-27 14:46:41,895] [DEBUG] [engine.py:711:_cache_model] Cached model 0_OnnxConversion-df880b77-ca0712dc to cache\default_workflow\models\0_OnnxConversion-df880b77-ca0712dc.json
[2024-06-27 14:46:41,907] [DEBUG] [engine.py:794:_cache_run] Cached run for df880b77->0_OnnxConversion-df880b77-ca0712dc into cache\default_workflow\runs\OnnxConversion-df880b77-ca0712dc.json
[2024-06-27 14:46:41,909] [INFO] [engine.py:867:_run_pass] Running pass transformers_optimization:OrtTransformersOptimization
[2024-06-27 14:46:41,911] [DEBUG] [resource_path.py:156:create_resource_path] Resource path C:\Olive-main\examples\whisper\cache\default_workflow\models\0_OnnxConversion-df880b77-ca0712dc\output_model\encoder_decoder_init\model.onnx is inferred to be of type file.
[2024-06-27 14:46:41,911] [DEBUG] [resource_path.py:156:create_resource_path] Resource path C:\Olive-main\examples\whisper\cache\default_workflow\models\0_OnnxConversion-df880b77-ca0712dc\output_model\decoder\model.onnx is inferred to be of type file.
[2024-06-27 14:46:42,022] [DEBUG] [transformer_optimization.py:253:_run_for_config] model_type is set to bart from model attributes
[2024-06-27 14:46:42,022] [DEBUG] [transformer_optimization.py:259:_run_for_config] num_heads is set to 6 from model attributes
[2024-06-27 14:46:42,022] [DEBUG] [transformer_optimization.py:265:_run_for_config] hidden_size is set to 384 from model attributes
[2024-06-27 14:46:49,417] [DEBUG] [transformer_optimization.py:253:_run_for_config] model_type is set to bart from model attributes
[2024-06-27 14:46:49,417] [DEBUG] [transformer_optimization.py:259:_run_for_config] num_heads is set to 6 from model attributes
[2024-06-27 14:46:49,417] [DEBUG] [transformer_optimization.py:265:_run_for_config] hidden_size is set to 384 from model attributes
[2024-06-27 14:46:52,519] [INFO] [engine.py:954:_run_pass] Pass transformers_optimization:OrtTransformersOptimization finished in 10.608429 seconds
[2024-06-27 14:46:52,535] [DEBUG] [engine.py:711:_cache_model] Cached model 1_OrtTransformersOptimization-0-5c93fa9e-gpu-dml to cache\default_workflow\models\1_OrtTransformersOptimization-0-5c93fa9e-gpu-dml.json
[2024-06-27 14:46:52,540] [DEBUG] [engine.py:794:_cache_run] Cached run for 0_OnnxConversion-df880b77-ca0712dc->1_OrtTransformersOptimization-0-5c93fa9e-gpu-dml into cache\default_workflow\runs\OrtTransformersOptimization-0-5c93fa9e-gpu-dml.json
[2024-06-27 14:46:52,542] [INFO] [engine.py:867:_run_pass] Running pass insert_beam_search:InsertBeamSearch
[2024-06-27 14:46:52,545] [DEBUG] [resource_path.py:156:create_resource_path] Resource path C:\Olive-main\examples\whisper\cache\default_workflow\models\1_OrtTransformersOptimization-0-5c93fa9e-gpu-dml\output_model\encoder_decoder_init\model.onnx is inferred to be of type file.
[2024-06-27 14:46:52,550] [DEBUG] [resource_path.py:156:create_resource_path] Resource path C:\Olive-main\examples\whisper\cache\default_workflow\models\1_OrtTransformersOptimization-0-5c93fa9e-gpu-dml\output_model\decoder\model.onnx is inferred to be of type file.
Removed 18 initializers with duplicated value
Removed 18 initializers with duplicated value
[2024-06-27 14:46:53,795] [DEBUG] [insert_beam_search.py:302:chain_model] Using IR version 8 for chained model
[2024-06-27 14:46:55,804] [INFO] [engine.py:954:_run_pass] Pass insert_beam_search:InsertBeamSearch finished in 3.258918 seconds
[2024-06-27 14:46:55,820] [DEBUG] [engine.py:711:_cache_model] Cached model 2_InsertBeamSearch-1-82bf64f8 to cache\default_workflow\models\2_InsertBeamSearch-1-82bf64f8.json
[2024-06-27 14:46:55,830] [DEBUG] [engine.py:794:_cache_run] Cached run for 1_OrtTransformersOptimization-0-5c93fa9e-gpu-dml->2_InsertBeamSearch-1-82bf64f8 into cache\default_workflow\runs\InsertBeamSearch-1-82bf64f8.json
[2024-06-27 14:46:55,830] [INFO] [engine.py:867:_run_pass] Running pass prepost:AppendPrePostProcessingOps
[2024-06-27 14:46:55,834] [DEBUG] [resource_path.py:156:create_resource_path] Resource path C:\Olive-main\examples\whisper\cache\default_workflow\models\2_InsertBeamSearch-1-82bf64f8\output_model\model_with_beam_search.onnx is inferred to be of type file.
[2024-06-27 14:46:55,837] [DEBUG] [resource_path.py:156:create_resource_path] Resource path C:\Olive-main\examples\whisper\cache\default_workflow\models\2_InsertBeamSearch-1-82bf64f8\output_model\model_with_beam_search.onnx is inferred to be of type file.
[W shape_type_inference.cpp:1972] Warning: The shape inference of ai.onnx.contrib::StftNorm type is missing, so it may result in wrong shape inference for the exported graph. Please consider adding it in symbolic function. (function UpdateReliable)
[2024-06-27 14:46:57,206] [INFO] [engine.py:954:_run_pass] Pass prepost:AppendPrePostProcessingOps finished in 1.369488 seconds
[2024-06-27 14:46:57,206] [DEBUG] [engine.py:711:_cache_model] Cached model 3_AppendPrePostProcessingOps-2-9e247843 to cache\default_workflow\models\3_AppendPrePostProcessingOps-2-9e247843.json
[2024-06-27 14:46:57,206] [DEBUG] [engine.py:794:_cache_run] Cached run for 2_InsertBeamSearch-1-82bf64f8->3_AppendPrePostProcessingOps-2-9e247843 into cache\default_workflow\runs\AppendPrePostProcessingOps-2-9e247843.json
[2024-06-27 14:46:57,206] [INFO] [engine.py:845:_run_passes] Run model evaluation for the final model...
[2024-06-27 14:46:57,206] [DEBUG] [engine.py:1026:_evaluate_model] Evaluating model ...
[2024-06-27 14:46:57,206] [DEBUG] [resource_path.py:156:create_resource_path] Resource path C:\Olive-main\examples\whisper\cache\default_workflow\models\3_AppendPrePostProcessingOps-2-9e247843\output_model\model_with_beam_search.onnx is inferred to be of type file.
\Olive-main\examples\whisper\data is inferred to be of type folder.
[2024-06-27 14:46:58,327] [DEBUG] [ort_inference.py:72:get_ort_inference_session] inference_settings: {'execution_provider': ['DmlExecutionProvider'], 'provider_options': None}
[2024-06-27 14:46:58,327] [DEBUG] [ort_inference.py:111:get_ort_inference_session] Normalized providers: ['DmlExecutionProvider'], provider_options: [{}]

jambayk commented 5 months ago

That's weird that it just fails silently. Can you add "ort_log_severity_level": 0 under "engine" in the config JSON to see if it prints out any information about the failure?

WA225 commented 5 months ago

The output of the ort log is added below. I noticed the line "ORT optimization- Force fallback to CPU execution for node: /whisper_decoder_init/model/decoder/Gather_1 because the CPU execution path is deemed faster than overhead involved with execution on other EPs capable of executing this node" for several operators, so I installed onnxruntime in addition to onnxruntime-directml. Now it generates an output called "whisper_dml_fp32_gpu-cpu_model.onnx", which fails when I try to test it with the command "python test_transcription.py --config whisper_dml_fp32.json", as it seems to support the CPU EP and not the DML EP.
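To see how many nodes the "Force fallback to CPU execution" message covers, a saved ORT log can be scanned with a short script. The sketch below is only an illustration: `forced_cpu_nodes` and `count_by_name` are hypothetical helpers, and the regular expression assumes the message wording quoted above. Grouping by the last path segment of the node name is a naming heuristic and may not match the real operator type.

```python
import re
from collections import Counter

# Matches the message text quoted in the comment above.
FALLBACK_RE = re.compile(r"Force fallback to CPU execution for node: (\S+)")


def forced_cpu_nodes(log_text: str) -> list[str]:
    """Return the node names ORT reported as forced onto the CPU EP, in log order."""
    return FALLBACK_RE.findall(log_text)


def count_by_name(node_names: list[str]) -> Counter:
    """Group node names by their last path segment with a numeric suffix removed,
    e.g. '/decoder/Gather_1' is counted under 'Gather'."""
    return Counter(
        re.sub(r"_\d+$", "", name.rsplit("/", 1)[-1]) for name in node_names
    )
```

These messages are logged at info level (the `I:` prefix), so on their own they do not necessarily indicate the failure.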

The ort log output:
2024-06-28 10:26:22.5237536 [I:onnxruntime:, inference_session.cc:533 onnxruntime::InferenceSession::TraceSessionOptions] Session Options { execution_mode:0 execution_order:DEFAULT enable_profiling:0 optimized_model_filepath: enable_mem_pattern:0 enable_mem_reuse:1 enable_cpu_mem_arena:1 profile_file_prefix:onnxruntime_profile_ session_logid: session_log_severity_level:-1 session_log_verbosity_level:0 max_num_graph_transformation_steps:10 graph_optimization_level:3 intra_op_param:OrtThreadPoolParams { thread_pool_size: 0 auto_set_affinity: 0 allow_spinning: 1 dynamic_block_base_: 0 stack_size: 0 affinity_str: set_denormal_as_zero: 0 } inter_op_param:OrtThreadPoolParams { thread_pool_size: 0 auto_set_affinity: 0 allow_spinning: 1 dynamic_block_base_: 0 stack_size: 0 affinity_str: set_denormal_as_zero: 0 } use_per_session_threads:1 thread_pool_allow_spinning:1 use_deterministic_compute:0 config_options: { } }
2024-06-28 10:26:22.5267229 [I:onnxruntime:, inference_session.cc:433 onnxruntime::InferenceSession::ConstructorCommon::::operator ()] Flush-to-zero and denormal-as-zero are off
2024-06-28 10:26:22.5274654 [I:onnxruntime:, inference_session.cc:441 onnxruntime::InferenceSession::ConstructorCommon] Creating and using per session threadpools since use_per_session_threads_ is true
2024-06-28 10:26:22.5281724 [I:onnxruntime:, inference_session.cc:459 onnxruntime::InferenceSession::ConstructorCommon] Dynamic block base set to 0
2024-06-28 10:26:22.8706539 [I:onnxruntime:, inference_session.cc:1602 onnxruntime::InferenceSession::Initialize] Initializing session.
2024-06-28 10:26:22.8711560 [I:onnxruntime:, inference_session.cc:1639 onnxruntime::InferenceSession::Initialize] Adding default CPU execution provider.
2024-06-28 10:26:22.8716922 [I:onnxruntime:Default, bfc_arena.cc:29 onnxruntime::BFCArena::BFCArena] Creating BFCArena for Cpu with following configs: initial_chunk_size_bytes: 1048576 max_dead_bytes_per_chunk: 134217728 initial_growth_chunk_size_bytes: 2097152 max_power_of_two_extend_bytes: 1073741824 memory limit: 18446744073709551615 arena_extend_strategy: 0
2024-06-28 10:26:22.8727597 [V:onnxruntime:Default, bfc_arena.cc:66 onnxruntime::BFCArena::BFCArena] Creating 21 bins of max chunk size 256 to 268435456
2024-06-28 10:26:22.8750114 [I:onnxruntime:, graph_partitioner.cc:900 onnxruntime::GraphPartitioner::InlineFunctionsAOT] This model does not have any local functions defined. AOT Inlining is not performed
2024-06-28 10:26:22.8758565 [I:onnxruntime:, graph_transformer.cc:15 onnxruntime::GraphTransformer::Apply] GraphTransformer EnsureUniqueDQForNodeUnit modified: 0 with status: OK
2024-06-28 10:26:22.8766265 [I:onnxruntime:, graph_transformer.cc:15 onnxruntime::GraphTransformer::Apply] GraphTransformer Level1_RuleBasedTransformer modified: 0 with status: OK
2024-06-28 10:26:22.8771543 [I:onnxruntime:, graph_transformer.cc:15 onnxruntime::GraphTransformer::Apply] GraphTransformer DoubleQDQPairsRemover modified: 0 with status: OK
2024-06-28 10:26:22.8778361 [I:onnxruntime:, constant_sharing.cc:248 onnxruntime::ConstantSharing::ApplyImpl] Total shared scalar initializer count: 10
2024-06-28 10:26:22.8782676 [I:onnxruntime:, graph_transformer.cc:15 onnxruntime::GraphTransformer::Apply] GraphTransformer ConstantSharing modified: 1 with status: OK
2024-06-28 10:26:22.8815978 [I:onnxruntime:, graph_transformer.cc:15 onnxruntime::GraphTransformer::Apply] GraphTransformer ShapeInputMerge modified: 0 with status: OK
2024-06-28 10:26:22.8830022 [I:onnxruntime:, graph_transformer.cc:15 onnxruntime::GraphTransformer::Apply] GraphTransformer CommonSubexpressionElimination modified: 1 with status: OK
2024-06-28 10:26:22.8856318 [I:onnxruntime:, graph_transformer.cc:15 onnxruntime::GraphTransformer::Apply] GraphTransformer ConstantFolding modified: 0 with status: OK
2024-06-28 10:26:22.8862964 [I:onnxruntime:, graph_transformer.cc:15 onnxruntime::GraphTransformer::Apply] GraphTransformer MatMulAddFusion modified: 0 with status: OK
2024-06-28 10:26:22.8869357 [I:onnxruntime:, graph_transformer.cc:15 onnxruntime::GraphTransformer::Apply] GraphTransformer ReshapeFusion modified: 0 with status: OK
2024-06-28 10:26:22.8874567 [I:onnxruntime:, graph_transformer.cc:15 onnxruntime::GraphTransformer::Apply] GraphTransformer FreeDimensionOverrideTransformer modified: 0 with status: OK
2024-06-28 10:26:22.8881172 [I:onnxruntime:, graph_transformer.cc:15 onnxruntime::GraphTransformer::Apply] GraphTransformer QDQPropagationTransformer modified: 0 with status: OK
2024-06-28 10:26:22.8887025 [I:onnxruntime:, graph_transformer.cc:15 onnxruntime::GraphTransformer::Apply] GraphTransformer EnsureUniqueDQForNodeUnit modified: 0 with status: OK
2024-06-28 10:26:22.8892458 [I:onnxruntime:, graph_transformer.cc:15 onnxruntime::GraphTransformer::Apply] GraphTransformer RocmBlasAltImpl modified: 0 with status: OK
2024-06-28 10:26:22.8900674 [I:onnxruntime:, graph_transformer.cc:15 onnxruntime::GraphTransformer::Apply] GraphTransformer TransposeOptimizer modified: 0 with status: OK
2024-06-28 10:26:22.8916210 [I:onnxruntime:, graph_transformer.cc:15 onnxruntime::GraphTransformer::Apply] GraphTransformer Level1_RuleBasedTransformer modified: 0 with status: OK
2024-06-28 10:26:22.8921454 [I:onnxruntime:, graph_transformer.cc:15 onnxruntime::GraphTransformer::Apply] GraphTransformer DoubleQDQPairsRemover modified: 0 with status: OK
2024-06-28 10:26:22.8928428 [I:onnxruntime:, graph_transformer.cc:15 onnxruntime::GraphTransformer::Apply] GraphTransformer ShapeInputMerge modified: 0 with status: OK
2024-06-28 10:26:22.8942160 [I:onnxruntime:, graph_transformer.cc:15 onnxruntime::GraphTransformer::Apply] GraphTransformer CommonSubexpressionElimination modified: 0 with status: OK
2024-06-28 10:26:22.8948504 [I:onnxruntime:, graph_transformer.cc:15 onnxruntime::GraphTransformer::Apply] GraphTransformer ConstantFolding modified: 0 with status: OK
2024-06-28 10:26:22.8954079 [I:onnxruntime:, graph_transformer.cc:15 onnxruntime::GraphTransformer::Apply] GraphTransformer MatMulAddFusion modified: 0 with status: OK
2024-06-28 10:26:22.8959476 [I:onnxruntime:, graph_transformer.cc:15 onnxruntime::GraphTransformer::Apply] GraphTransformer ReshapeFusion modified: 0 with status: OK
2024-06-28 10:26:22.8964161 [I:onnxruntime:, graph_transformer.cc:15 onnxruntime::GraphTransformer::Apply] GraphTransformer FreeDimensionOverrideTransformer modified: 0 with status: OK
2024-06-28 10:26:22.8970709 [I:onnxruntime:, graph_transformer.cc:15 onnxruntime::GraphTransformer::Apply] GraphTransformer QDQPropagationTransformer modified: 0 with status: OK
2024-06-28 10:26:22.8977246 [I:onnxruntime:, graph_transformer.cc:15 onnxruntime::GraphTransformer::Apply] GraphTransformer EnsureUniqueDQForNodeUnit modified: 0 with status: OK
2024-06-28 10:26:22.8982432 [I:onnxruntime:, graph_transformer.cc:15 onnxruntime::GraphTransformer::Apply] GraphTransformer RocmBlasAltImpl modified: 0 with status: OK
2024-06-28 10:26:22.8991822 [I:onnxruntime:Default, fallback_cpu_capability.cc:90 onnxruntime::GetCpuPreferredNodes::::operator ()] Candidate for fallback CPU execution: /whisper_decoder_init/model/decoder/Gather_1
2024-06-28 10:26:22.8998506 [I:onnxruntime:Default, fallback_cpu_capability.cc:90 onnxruntime::GetCpuPreferredNodes::::operator ()] Candidate for fallback CPU execution: /whisper_decoder_init/model/decoder/Gather
2024-06-28 10:26:22.9005349 [I:onnxruntime:Default, fallback_cpu_capability.cc:90 onnxruntime::GetCpuPreferredNodes::::operator ()] Candidate for fallback CPU execution: /whisper_decoder_init/model/decoder/Slice
2024-06-28 10:26:22.9014060 [I:onnxruntime:Default, fallback_cpu_capability.cc:162 onnxruntime::GetCpuPreferredNodes] ORT optimization- Force fallback to CPU execution for node: /whisper_decoder_init/model/decoder/Gather_1 because the CPU execution path is deemed faster than overhead involved with execution on other EPs capable of executing this node
2024-06-28 10:26:22.9023590 [I:onnxruntime:Default, fallback_cpu_capability.cc:162 onnxruntime::GetCpuPreferredNodes] ORT optimization- Force fallback to CPU execution for node: /whisper_decoder_init/model/decoder/Unsqueeze_1 because the CPU execution path is deemed faster than overhead involved with execution on other EPs capable of executing this node
2024-06-28 10:26:22.9032870 [I:onnxruntime:Default, fallback_cpu_capability.cc:162 onnxruntime::GetCpuPreferredNodes] ORT optimization- Force fallback to CPU execution for node: /whisper_decoder_init/model/decoder/Concat_1 because the CPU execution path is deemed faster than overhead involved with execution on other EPs capable of executing this node
2024-06-28 10:26:22.9142461 [I:onnxruntime:Default, fallback_cpu_capability.cc:162 onnxruntime::GetCpuPreferredNodes] ORT optimization- Force fallback to CPU execution for node: /whisper_decoder_init/model/decoder/Slice because the CPU execution path is deemed faster than overhead involved with execution on other EPs capable of executing this node
2024-06-28 10:26:22.9152593 [I:onnxruntime:Default, fallback_cpu_capability.cc:162 onnxruntime::GetCpuPreferredNodes] ORT optimization- Force fallback to CPU execution for node: /whisper_decoder_init/model/decoder/Squeeze because the CPU execution path is deemed faster than overhead involved with execution on other EPs capable of executing this node
2024-06-28 10:26:22.9162683 [I:onnxruntime:Default, fallback_cpu_capability.cc:162 onnxruntime::GetCpuPreferredNodes] ORT optimization- Force fallback to CPU execution for node: /whisper_decoder_init/model/decoder/Gather because the CPU execution path is deemed faster than overhead involved with execution on other EPs capable of executing this node
2024-06-28 10:26:22.9172550 [I:onnxruntime:Default, fallback_cpu_capability.cc:162 onnxruntime::GetCpuPreferredNodes] ORT optimization- Force fallback to CPU execution for node: /whisper_decoder_init/model/decoder/Unsqueeze_6 because the CPU execution path is deemed faster than overhead involved with execution on other EPs capable of executing this node 2024-06-28 10:26:22.9182713 [I:onnxruntime:Default, fallback_cpu_capability.cc:162 onnxruntime::GetCpuPreferredNodes] ORT optimization- Force fallback to CPU execution for node: /whisper_decoder_init/model/decoder/Sub because the CPU execution path is deemed faster than overhead involved with execution on other EPs capable of executing this node 2024-06-28 10:26:22.9192413 [I:onnxruntime:Default, fallback_cpu_capability.cc:162 onnxruntime::GetCpuPreferredNodes] ORT optimization- Force fallback to CPU execution for node: /whisper_decoder_init/model/decoder/Add_1 because the CPU execution path is deemed faster than overhead involved with execution on other EPs capable of executing this node 2024-06-28 10:26:22.9202246 [I:onnxruntime:Default, fallback_cpu_capability.cc:162 onnxruntime::GetCpuPreferredNodes] ORT optimization- Force fallback to CPU execution for node: /whisper_decoder_init/model/decoder/Unsqueeze_8 because the CPU execution path is deemed faster than overhead involved with execution on other EPs capable of executing this node 2024-06-28 10:26:22.9212625 [I:onnxruntime:Default, fallback_cpu_capability.cc:162 onnxruntime::GetCpuPreferredNodes] ORT optimization- Force fallback to CPU execution for node: /whisper_decoder_init/model/decoder/Concat_3 because the CPU execution path is deemed faster than overhead involved with execution on other EPs capable of executing this node 2024-06-28 10:26:22.9222818 [I:onnxruntime:Default, fallback_cpu_capability.cc:162 onnxruntime::GetCpuPreferredNodes] ORT optimization- Force fallback to CPU execution for node: /whisper_decoder_init/model/decoder/Equal because the CPU 
execution path is deemed faster than overhead involved with execution on other EPs capable of executing this node 2024-06-28 10:26:22.9240286 [I:onnxruntime:Default, fallback_cpu_capability.cc:162 onnxruntime::GetCpuPreferredNodes] ORT optimization- Force fallback to CPU execution for node: /whisper_decoder_init/model/decoder/Where_1 because the CPU execution path is deemed faster than overhead involved with execution on other EPs capable of executing this node 2024-06-28 10:26:22.9259094 [I:onnxruntime:Default, fallback_cpu_capability.cc:90 onnxruntime::GetCpuPreferredNodes::::operator ()] Candidate for fallback CPU execution: /decoder/model/decoder/Gather_1 2024-06-28 10:26:22.9266245 [I:onnxruntime:Default, fallback_cpu_capability.cc:90 onnxruntime::GetCpuPreferredNodes::::operator ()] Candidate for fallback CPU execution: /decoder/model/decoder/embed_positions/Gather 2024-06-28 10:26:22.9275610 [I:onnxruntime:Default, fallback_cpu_capability.cc:162 onnxruntime::GetCpuPreferredNodes] ORT optimization- Force fallback to CPU execution for node: /decoder/model/decoder/Gather_1 because the CPU execution path is deemed faster than overhead involved with execution on other EPs capable of executing this node 2024-06-28 10:26:22.9286380 [I:onnxruntime:Default, fallback_cpu_capability.cc:162 onnxruntime::GetCpuPreferredNodes] ORT optimization- Force fallback to CPU execution for node: /decoder/model/decoder/embed_positions/Unsqueeze because the CPU execution path is deemed faster than overhead involved with execution on other EPs capable of executing this node 2024-06-28 10:26:22.9301457 [I:onnxruntime:Default, fallback_cpu_capability.cc:162 onnxruntime::GetCpuPreferredNodes] ORT optimization- Force fallback to CPU execution for node: /decoder/model/decoder/embed_positions/Gather because the CPU execution path is deemed faster than overhead involved with execution on other EPs capable of executing this node 2024-06-28 10:26:22.9311491 [I:onnxruntime:Default, 
fallback_cpu_capability.cc:162 onnxruntime::GetCpuPreferredNodes] ORT optimization- Force fallback to CPU execution for node: /decoder/model/decoder/embed_positions/Add because the CPU execution path is deemed faster than overhead involved with execution on other EPs capable of executing this node 2024-06-28 10:26:22.9321414 [I:onnxruntime:Default, fallback_cpu_capability.cc:162 onnxruntime::GetCpuPreferredNodes] ORT optimization- Force fallback to CPU execution for node: /decoder/model/decoder/embed_positions/Unsqueeze_1 because the CPU execution path is deemed faster than overhead involved with execution on other EPs capable of executing this node 2024-06-28 10:26:22.9337088 [I:onnxruntime:Default, fallback_cpu_capability.cc:90 onnxruntime::GetCpuPreferredNodes::::operator ()] Candidate for fallback CPU execution: /Gather 2024-06-28 10:26:22.9343674 [I:onnxruntime:Default, fallback_cpu_capability.cc:90 onnxruntime::GetCpuPreferredNodes::::operator ()] Candidate for fallback CPU execution: /Gather_1 2024-06-28 10:26:22.9350183 [I:onnxruntime:Default, fallback_cpu_capability.cc:90 onnxruntime::GetCpuPreferredNodes::::operator ()] Candidate for fallback CPU execution: /Gather_2 2024-06-28 10:26:22.9359035 [I:onnxruntime:Default, fallback_cpu_capability.cc:162 onnxruntime::GetCpuPreferredNodes] ORT optimization- Force fallback to CPU execution for node: /Gather_2 because the CPU execution path is deemed faster than overhead involved with execution on other EPs capable of executing this node 2024-06-28 10:26:22.9372780 [I:onnxruntime:Default, fallback_cpu_capability.cc:162 onnxruntime::GetCpuPreferredNodes] ORT optimization- Force fallback to CPU execution for node: /Sub_1 because the CPU execution path is deemed faster than overhead involved with execution on other EPs capable of executing this node 2024-06-28 10:26:22.9381673 [I:onnxruntime:Default, fallback_cpu_capability.cc:162 onnxruntime::GetCpuPreferredNodes] ORT optimization- Force fallback to CPU execution 
for node: /Unsqueeze_2 because the CPU execution path is deemed faster than overhead involved with execution on other EPs capable of executing this node 2024-06-28 10:26:22.9390700 [I:onnxruntime:Default, fallback_cpu_capability.cc:162 onnxruntime::GetCpuPreferredNodes] ORT optimization- Force fallback to CPU execution for node: /Gather_1 because the CPU execution path is deemed faster than overhead involved with execution on other EPs capable of executing this node 2024-06-28 10:26:22.9406654 [I:onnxruntime:Default, fallback_cpu_capability.cc:162 onnxruntime::GetCpuPreferredNodes] ORT optimization- Force fallback to CPU execution for node: /Unsqueeze_1 because the CPU execution path is deemed faster than overhead involved with execution on other EPs capable of executing this node 2024-06-28 10:26:22.9416261 [I:onnxruntime:Default, fallback_cpu_capability.cc:162 onnxruntime::GetCpuPreferredNodes] ORT optimization- Force fallback to CPU execution for node: /Gather because the CPU execution path is deemed faster than overhead involved with execution on other EPs capable of executing this node 2024-06-28 10:26:22.9425853 [I:onnxruntime:Default, fallback_cpu_capability.cc:162 onnxruntime::GetCpuPreferredNodes] ORT optimization- Force fallback to CPU execution for node: /Unsqueeze because the CPU execution path is deemed faster than overhead involved with execution on other EPs capable of executing this node 2024-06-28 10:26:22.9435345 [I:onnxruntime:Default, fallback_cpu_capability.cc:162 onnxruntime::GetCpuPreferredNodes] ORT optimization- Force fallback to CPU execution for node: /Concat because the CPU execution path is deemed faster than overhead involved with execution on other EPs capable of executing this node 2024-06-28 10:26:22.9454560 [I:onnxruntime:, graph_transformer.cc:15 onnxruntime::GraphTransformer::Apply] GraphTransformer TransposeOptimizer_CPUExecutionProvider modified: 0 with status: OK 2024-06-28 10:26:22.9460916 [I:onnxruntime:, 
graph_transformer.cc:15 onnxruntime::GraphTransformer::Apply] GraphTransformer QDQS8ToU8Transformer modified: 0 with status: OK 2024-06-28 10:26:22.9467610 [I:onnxruntime:, graph_transformer.cc:15 onnxruntime::GraphTransformer::Apply] GraphTransformer QDQSelectorActionTransformer modified: 0 with status: OK 2024-06-28 10:26:22.9473500 [I:onnxruntime:, graph_transformer.cc:15 onnxruntime::GraphTransformer::Apply] GraphTransformer GemmActivationFusion modified: 0 with status: OK 2024-06-28 10:26:22.9479273 [I:onnxruntime:, graph_transformer.cc:15 onnxruntime::GraphTransformer::Apply] GraphTransformer MatMulIntegerToFloatFusion modified: 0 with status: OK 2024-06-28 10:26:22.9485987 [I:onnxruntime:, graph_transformer.cc:15 onnxruntime::GraphTransformer::Apply] GraphTransformer DynamicQuantizeMatMulFusion modified: 0 with status: OK 2024-06-28 10:26:22.9491957 [I:onnxruntime:, graph_transformer.cc:15 onnxruntime::GraphTransformer::Apply] GraphTransformer ConvActivationFusion modified: 0 with status: OK 2024-06-28 10:26:22.9497841 [I:onnxruntime:, graph_transformer.cc:15 onnxruntime::GraphTransformer::Apply] GraphTransformer GeluFusion modified: 0 with status: OK 2024-06-28 10:26:22.9503174 [I:onnxruntime:, graph_transformer.cc:15 onnxruntime::GraphTransformer::Apply] GraphTransformer LayerNormFusion modified: 0 with status: OK 2024-06-28 10:26:22.9508809 [I:onnxruntime:, graph_transformer.cc:15 onnxruntime::GraphTransformer::Apply] GraphTransformer SimplifiedLayerNormFusion modified: 0 with status: OK 2024-06-28 10:26:22.9514343 [I:onnxruntime:, graph_transformer.cc:15 onnxruntime::GraphTransformer::Apply] GraphTransformer AttentionFusion modified: 0 with status: OK 2024-06-28 10:26:22.9519890 [I:onnxruntime:, graph_transformer.cc:15 onnxruntime::GraphTransformer::Apply] GraphTransformer EmbedLayerNormFusion modified: 0 with status: OK 2024-06-28 10:26:22.9527713 [I:onnxruntime:, graph_transformer.cc:15 onnxruntime::GraphTransformer::Apply] GraphTransformer 
GatherSliceToSplitFusion modified: 0 with status: OK 2024-06-28 10:26:22.9533415 [I:onnxruntime:, graph_transformer.cc:15 onnxruntime::GraphTransformer::Apply] GraphTransformer GatherToSliceFusion modified: 0 with status: OK 2024-06-28 10:26:22.9539900 [I:onnxruntime:, graph_transformer.cc:15 onnxruntime::GraphTransformer::Apply] GraphTransformer MatmulTransposeFusion modified: 0 with status: OK 2024-06-28 10:26:22.9545569 [I:onnxruntime:, graph_transformer.cc:15 onnxruntime::GraphTransformer::Apply] GraphTransformer BiasGeluFusion modified: 0 with status: OK 2024-06-28 10:26:22.9551109 [I:onnxruntime:, graph_transformer.cc:15 onnxruntime::GraphTransformer::Apply] GraphTransformer SkipLayerNormFusion modified: 1 with status: OK 2024-06-28 10:26:22.9557798 [I:onnxruntime:, graph_transformer.cc:15 onnxruntime::GraphTransformer::Apply] GraphTransformer FastGeluFusion modified: 0 with status: OK 2024-06-28 10:26:22.9563353 [I:onnxruntime:, graph_transformer.cc:15 onnxruntime::GraphTransformer::Apply] GraphTransformer QuickGeluFusion modified: 0 with status: OK 2024-06-28 10:26:22.9568706 [I:onnxruntime:, graph_transformer.cc:15 onnxruntime::GraphTransformer::Apply] GraphTransformer BiasSoftmaxFusion modified: 0 with status: OK 2024-06-28 10:26:22.9580923 [I:onnxruntime:, graph_transformer.cc:15 onnxruntime::GraphTransformer::Apply] GraphTransformer BiasDropoutFusion modified: 0 with status: OK 2024-06-28 10:26:22.9589503 [I:onnxruntime:, graph_transformer.cc:15 onnxruntime::GraphTransformer::Apply] GraphTransformer MatMulScaleFusion modified: 0 with status: OK 2024-06-28 10:26:22.9595084 [I:onnxruntime:, graph_transformer.cc:15 onnxruntime::GraphTransformer::Apply] GraphTransformer MatMulActivationFusion modified: 0 with status: OK 2024-06-28 10:26:22.9601069 [I:onnxruntime:, graph_transformer.cc:15 onnxruntime::GraphTransformer::Apply] GraphTransformer QDQFinalCleanupTransformer modified: 0 with status: OK 2024-06-28 10:26:22.9606378 [I:onnxruntime:, 
graph_transformer.cc:15 onnxruntime::GraphTransformer::Apply] GraphTransformer DmlOperatorFusionTransformer modified: 0 with status: OK 2024-06-28 10:26:22.9612468 [I:onnxruntime:, graph_transformer.cc:15 onnxruntime::GraphTransformer::Apply] GraphTransformer QDQS8ToU8Transformer modified: 0 with status: OK 2024-06-28 10:26:22.9618838 [I:onnxruntime:, graph_transformer.cc:15 onnxruntime::GraphTransformer::Apply] GraphTransformer QDQSelectorActionTransformer modified: 0 with status: OK 2024-06-28 10:26:22.9624588 [I:onnxruntime:, graph_transformer.cc:15 onnxruntime::GraphTransformer::Apply] GraphTransformer GemmActivationFusion modified: 0 with status: OK 2024-06-28 10:26:22.9630429 [I:onnxruntime:, graph_transformer.cc:15 onnxruntime::GraphTransformer::Apply] GraphTransformer MatMulIntegerToFloatFusion modified: 0 with status: OK 2024-06-28 10:26:22.9636139 [I:onnxruntime:, graph_transformer.cc:15 onnxruntime::GraphTransformer::Apply] GraphTransformer DynamicQuantizeMatMulFusion modified: 0 with status: OK 2024-06-28 10:26:22.9641833 [I:onnxruntime:, graph_transformer.cc:15 onnxruntime::GraphTransformer::Apply] GraphTransformer ConvActivationFusion modified: 0 with status: OK 2024-06-28 10:26:22.9648384 [I:onnxruntime:, graph_transformer.cc:15 onnxruntime::GraphTransformer::Apply] GraphTransformer GeluFusion modified: 0 with status: OK 2024-06-28 10:26:22.9654333 [I:onnxruntime:, graph_transformer.cc:15 onnxruntime::GraphTransformer::Apply] GraphTransformer LayerNormFusion modified: 0 with status: OK 2024-06-28 10:26:22.9660378 [I:onnxruntime:, graph_transformer.cc:15 onnxruntime::GraphTransformer::Apply] GraphTransformer SimplifiedLayerNormFusion modified: 0 with status: OK 2024-06-28 10:26:22.9665873 [I:onnxruntime:, graph_transformer.cc:15 onnxruntime::GraphTransformer::Apply] GraphTransformer AttentionFusion modified: 0 with status: OK 2024-06-28 10:26:22.9671266 [I:onnxruntime:, graph_transformer.cc:15 onnxruntime::GraphTransformer::Apply] GraphTransformer 
EmbedLayerNormFusion modified: 0 with status: OK 2024-06-28 10:26:22.9679192 [I:onnxruntime:, graph_transformer.cc:15 onnxruntime::GraphTransformer::Apply] GraphTransformer GatherSliceToSplitFusion modified: 0 with status: OK 2024-06-28 10:26:22.9684968 [I:onnxruntime:, graph_transformer.cc:15 onnxruntime::GraphTransformer::Apply] GraphTransformer GatherToSliceFusion modified: 0 with status: OK 2024-06-28 10:26:22.9691675 [I:onnxruntime:, graph_transformer.cc:15 onnxruntime::GraphTransformer::Apply] GraphTransformer MatmulTransposeFusion modified: 0 with status: OK 2024-06-28 10:26:22.9697585 [I:onnxruntime:, graph_transformer.cc:15 onnxruntime::GraphTransformer::Apply] GraphTransformer BiasGeluFusion modified: 0 with status: OK 2024-06-28 10:26:22.9703395 [I:onnxruntime:, graph_transformer.cc:15 onnxruntime::GraphTransformer::Apply] GraphTransformer SkipLayerNormFusion modified: 1 with status: OK 2024-06-28 10:26:22.9709460 [I:onnxruntime:, graph_transformer.cc:15 onnxruntime::GraphTransformer::Apply] GraphTransformer FastGeluFusion modified: 0 with status: OK 2024-06-28 10:26:22.9715333 [I:onnxruntime:, graph_transformer.cc:15 onnxruntime::GraphTransformer::Apply] GraphTransformer QuickGeluFusion modified: 0 with status: OK 2024-06-28 10:26:22.9720947 [I:onnxruntime:, graph_transformer.cc:15 onnxruntime::GraphTransformer::Apply] GraphTransformer BiasSoftmaxFusion modified: 0 with status: OK 2024-06-28 10:26:22.9728163 [I:onnxruntime:, graph_transformer.cc:15 onnxruntime::GraphTransformer::Apply] GraphTransformer BiasDropoutFusion modified: 0 with status: OK 2024-06-28 10:26:22.9734458 [I:onnxruntime:, graph_transformer.cc:15 onnxruntime::GraphTransformer::Apply] GraphTransformer MatMulScaleFusion modified: 0 with status: OK 2024-06-28 10:26:22.9745115 [I:onnxruntime:, graph_transformer.cc:15 onnxruntime::GraphTransformer::Apply] GraphTransformer MatMulActivationFusion modified: 0 with status: OK 2024-06-28 10:26:22.9751064 [I:onnxruntime:, graph_transformer.cc:15 
onnxruntime::GraphTransformer::Apply] GraphTransformer QDQFinalCleanupTransformer modified: 0 with status: OK 2024-06-28 10:26:22.9756342 [I:onnxruntime:, graph_transformer.cc:15 onnxruntime::GraphTransformer::Apply] GraphTransformer DmlOperatorFusionTransformer modified: 0 with status: OK 2024-06-28 10:26:22.9762350 [I:onnxruntime:, graph_transformer.cc:15 onnxruntime::GraphTransformer::Apply] GraphTransformer QDQS8ToU8Transformer modified: 0 with status: OK 2024-06-28 10:26:22.9768989 [I:onnxruntime:, graph_transformer.cc:15 onnxruntime::GraphTransformer::Apply] GraphTransformer QDQSelectorActionTransformer modified: 0 with status: OK 2024-06-28 10:26:22.9775222 [I:onnxruntime:, graph_transformer.cc:15 onnxruntime::GraphTransformer::Apply] GraphTransformer GemmActivationFusion modified: 0 with status: OK 2024-06-28 10:26:22.9780916 [I:onnxruntime:, graph_transformer.cc:15 onnxruntime::GraphTransformer::Apply] GraphTransformer MatMulIntegerToFloatFusion modified: 0 with status: OK 2024-06-28 10:26:22.9786701 [I:onnxruntime:, graph_transformer.cc:15 onnxruntime::GraphTransformer::Apply] GraphTransformer DynamicQuantizeMatMulFusion modified: 0 with status: OK 2024-06-28 10:26:22.9792525 [I:onnxruntime:, graph_transformer.cc:15 onnxruntime::GraphTransformer::Apply] GraphTransformer ConvActivationFusion modified: 0 with status: OK 2024-06-28 10:26:22.9798074 [I:onnxruntime:, graph_transformer.cc:15 onnxruntime::GraphTransformer::Apply] GraphTransformer GeluFusion modified: 0 with status: OK 2024-06-28 10:26:22.9803308 [I:onnxruntime:, graph_transformer.cc:15 onnxruntime::GraphTransformer::Apply] GraphTransformer LayerNormFusion modified: 0 with status: OK 2024-06-28 10:26:22.9808707 [I:onnxruntime:, graph_transformer.cc:15 onnxruntime::GraphTransformer::Apply] GraphTransformer SimplifiedLayerNormFusion modified: 0 with status: OK 2024-06-28 10:26:22.9814289 [I:onnxruntime:, graph_transformer.cc:15 onnxruntime::GraphTransformer::Apply] GraphTransformer AttentionFusion 
modified: 0 with status: OK 2024-06-28 10:26:22.9819720 [I:onnxruntime:, graph_transformer.cc:15 onnxruntime::GraphTransformer::Apply] GraphTransformer EmbedLayerNormFusion modified: 0 with status: OK 2024-06-28 10:26:22.9826979 [I:onnxruntime:, graph_transformer.cc:15 onnxruntime::GraphTransformer::Apply] GraphTransformer GatherSliceToSplitFusion modified: 0 with status: OK 2024-06-28 10:26:22.9832796 [I:onnxruntime:, graph_transformer.cc:15 onnxruntime::GraphTransformer::Apply] GraphTransformer GatherToSliceFusion modified: 0 with status: OK 2024-06-28 10:26:22.9842686 [I:onnxruntime:, graph_transformer.cc:15 onnxruntime::GraphTransformer::Apply] GraphTransformer MatmulTransposeFusion modified: 0 with status: OK 2024-06-28 10:26:22.9848741 [I:onnxruntime:, graph_transformer.cc:15 onnxruntime::GraphTransformer::Apply] GraphTransformer BiasGeluFusion modified: 0 with status: OK 2024-06-28 10:26:22.9857837 [I:onnxruntime:, graph_transformer.cc:15 onnxruntime::GraphTransformer::Apply] GraphTransformer SkipLayerNormFusion modified: 1 with status: OK 2024-06-28 10:26:22.9865724 [I:onnxruntime:, graph_transformer.cc:15 onnxruntime::GraphTransformer::Apply] GraphTransformer FastGeluFusion modified: 0 with status: OK 2024-06-28 10:26:22.9871306 [I:onnxruntime:, graph_transformer.cc:15 onnxruntime::GraphTransformer::Apply] GraphTransformer QuickGeluFusion modified: 0 with status: OK 2024-06-28 10:26:22.9876691 [I:onnxruntime:, graph_transformer.cc:15 onnxruntime::GraphTransformer::Apply] GraphTransformer BiasSoftmaxFusion modified: 0 with status: OK 2024-06-28 10:26:22.9886586 [I:onnxruntime:, graph_transformer.cc:15 onnxruntime::GraphTransformer::Apply] GraphTransformer BiasDropoutFusion modified: 0 with status: OK 2024-06-28 10:26:22.9892758 [I:onnxruntime:, graph_transformer.cc:15 onnxruntime::GraphTransformer::Apply] GraphTransformer MatMulScaleFusion modified: 0 with status: OK 2024-06-28 10:26:22.9898342 [I:onnxruntime:, graph_transformer.cc:15 
onnxruntime::GraphTransformer::Apply] GraphTransformer MatMulActivationFusion modified: 0 with status: OK 2024-06-28 10:26:22.9904256 [I:onnxruntime:, graph_transformer.cc:15 onnxruntime::GraphTransformer::Apply] GraphTransformer QDQFinalCleanupTransformer modified: 0 with status: OK 2024-06-28 10:26:22.9909976 [I:onnxruntime:, graph_transformer.cc:15 onnxruntime::GraphTransformer::Apply] GraphTransformer DmlOperatorFusionTransformer modified: 0 with status: OK 2024-06-28 10:26:22.9915918 [I:onnxruntime:, graph_transformer.cc:15 onnxruntime::GraphTransformer::Apply] GraphTransformer QDQS8ToU8Transformer modified: 0 with status: OK 2024-06-28 10:26:22.9922182 [I:onnxruntime:, graph_transformer.cc:15 onnxruntime::GraphTransformer::Apply] GraphTransformer QDQSelectorActionTransformer modified: 0 with status: OK 2024-06-28 10:26:22.9927935 [I:onnxruntime:, graph_transformer.cc:15 onnxruntime::GraphTransformer::Apply] GraphTransformer GemmActivationFusion modified: 0 with status: OK 2024-06-28 10:26:22.9933444 [I:onnxruntime:, graph_transformer.cc:15 onnxruntime::GraphTransformer::Apply] GraphTransformer MatMulIntegerToFloatFusion modified: 0 with status: OK 2024-06-28 10:26:22.9939131 [I:onnxruntime:, graph_transformer.cc:15 onnxruntime::GraphTransformer::Apply] GraphTransformer DynamicQuantizeMatMulFusion modified: 0 with status: OK 2024-06-28 10:26:22.9944856 [I:onnxruntime:, graph_transformer.cc:15 onnxruntime::GraphTransformer::Apply] GraphTransformer ConvActivationFusion modified: 0 with status: OK 2024-06-28 10:26:22.9950499 [I:onnxruntime:, graph_transformer.cc:15 onnxruntime::GraphTransformer::Apply] GraphTransformer GeluFusion modified: 0 with status: OK 2024-06-28 10:26:22.9955737 [I:onnxruntime:, graph_transformer.cc:15 onnxruntime::GraphTransformer::Apply] GraphTransformer LayerNormFusion modified: 0 with status: OK 2024-06-28 10:26:22.9961226 [I:onnxruntime:, graph_transformer.cc:15 onnxruntime::GraphTransformer::Apply] GraphTransformer 
SimplifiedLayerNormFusion modified: 0 with status: OK 2024-06-28 10:26:22.9966877 [I:onnxruntime:, graph_transformer.cc:15 onnxruntime::GraphTransformer::Apply] GraphTransformer AttentionFusion modified: 0 with status: OK 2024-06-28 10:26:22.9972578 [I:onnxruntime:, graph_transformer.cc:15 onnxruntime::GraphTransformer::Apply] GraphTransformer EmbedLayerNormFusion modified: 0 with status: OK 2024-06-28 10:26:22.9979675 [I:onnxruntime:, graph_transformer.cc:15 onnxruntime::GraphTransformer::Apply] GraphTransformer GatherSliceToSplitFusion modified: 0 with status: OK 2024-06-28 10:26:22.9985458 [I:onnxruntime:, graph_transformer.cc:15 onnxruntime::GraphTransformer::Apply] GraphTransformer GatherToSliceFusion modified: 0 with status: OK 2024-06-28 10:26:22.9995747 [I:onnxruntime:, graph_transformer.cc:15 onnxruntime::GraphTransformer::Apply] GraphTransformer MatmulTransposeFusion modified: 0 with status: OK 2024-06-28 10:26:23.0008676 [I:onnxruntime:, graph_transformer.cc:15 onnxruntime::GraphTransformer::Apply] GraphTransformer BiasGeluFusion modified: 0 with status: OK 2024-06-28 10:26:23.0014093 [I:onnxruntime:, graph_transformer.cc:15 onnxruntime::GraphTransformer::Apply] GraphTransformer SkipLayerNormFusion modified: 1 with status: OK 2024-06-28 10:26:23.0019770 [I:onnxruntime:, graph_transformer.cc:15 onnxruntime::GraphTransformer::Apply] GraphTransformer FastGeluFusion modified: 0 with status: OK 2024-06-28 10:26:23.0025216 [I:onnxruntime:, graph_transformer.cc:15 onnxruntime::GraphTransformer::Apply] GraphTransformer QuickGeluFusion modified: 0 with status: OK 2024-06-28 10:26:23.0030697 [I:onnxruntime:, graph_transformer.cc:15 onnxruntime::GraphTransformer::Apply] GraphTransformer BiasSoftmaxFusion modified: 0 with status: OK 2024-06-28 10:26:23.0036158 [I:onnxruntime:, graph_transformer.cc:15 onnxruntime::GraphTransformer::Apply] GraphTransformer BiasDropoutFusion modified: 0 with status: OK 2024-06-28 10:26:23.0042936 [I:onnxruntime:, 
graph_transformer.cc:15 onnxruntime::GraphTransformer::Apply] GraphTransformer MatMulScaleFusion modified: 0 with status: OK 2024-06-28 10:26:23.0048816 [I:onnxruntime:, graph_transformer.cc:15 onnxruntime::GraphTransformer::Apply] GraphTransformer MatMulActivationFusion modified: 0 with status: OK 2024-06-28 10:26:23.0054982 [I:onnxruntime:, graph_transformer.cc:15 onnxruntime::GraphTransformer::Apply] GraphTransformer QDQFinalCleanupTransformer modified: 0 with status: OK 2024-06-28 10:26:23.0060045 [I:onnxruntime:, graph_transformer.cc:15 onnxruntime::GraphTransformer::Apply] GraphTransformer DmlOperatorFusionTransformer modified: 0 with status: OK 2024-06-28 10:26:23.0065699 [I:onnxruntime:, graph_transformer.cc:15 onnxruntime::GraphTransformer::Apply] GraphTransformer QDQS8ToU8Transformer modified: 0 with status: OK 2024-06-28 10:26:23.0071818 [I:onnxruntime:, graph_transformer.cc:15 onnxruntime::GraphTransformer::Apply] GraphTransformer QDQSelectorActionTransformer modified: 0 with status: OK 2024-06-28 10:26:23.0077548 [I:onnxruntime:, graph_transformer.cc:15 onnxruntime::GraphTransformer::Apply] GraphTransformer GemmActivationFusion modified: 0 with status: OK 2024-06-28 10:26:23.0083137 [I:onnxruntime:, graph_transformer.cc:15 onnxruntime::GraphTransformer::Apply] GraphTransformer MatMulIntegerToFloatFusion modified: 0 with status: OK 2024-06-28 10:26:23.0088928 [I:onnxruntime:, graph_transformer.cc:15 onnxruntime::GraphTransformer::Apply] GraphTransformer DynamicQuantizeMatMulFusion modified: 0 with status: OK 2024-06-28 10:26:23.0094633 [I:onnxruntime:, graph_transformer.cc:15 onnxruntime::GraphTransformer::Apply] GraphTransformer ConvActivationFusion modified: 0 with status: OK 2024-06-28 10:26:23.0100227 [I:onnxruntime:, graph_transformer.cc:15 onnxruntime::GraphTransformer::Apply] GraphTransformer GeluFusion modified: 0 with status: OK 2024-06-28 10:26:23.0106005 [I:onnxruntime:, graph_transformer.cc:15 onnxruntime::GraphTransformer::Apply] 
GraphTransformer LayerNormFusion modified: 0 with status: OK 2024-06-28 10:26:23.0112273 [I:onnxruntime:, graph_transformer.cc:15 onnxruntime::GraphTransformer::Apply] GraphTransformer SimplifiedLayerNormFusion modified: 0 with status: OK 2024-06-28 10:26:23.0118280 [I:onnxruntime:, graph_transformer.cc:15 onnxruntime::GraphTransformer::Apply] GraphTransformer AttentionFusion modified: 0 with status: OK 2024-06-28 10:26:23.0124209 [I:onnxruntime:, graph_transformer.cc:15 onnxruntime::GraphTransformer::Apply] GraphTransformer EmbedLayerNormFusion modified: 0 with status: OK 2024-06-28 10:26:23.0130642 [I:onnxruntime:, graph_transformer.cc:15 onnxruntime::GraphTransformer::Apply] GraphTransformer GatherSliceToSplitFusion modified: 0 with status: OK 2024-06-28 10:26:23.0136176 [I:onnxruntime:, graph_transformer.cc:15 onnxruntime::GraphTransformer::Apply] GraphTransformer GatherToSliceFusion modified: 0 with status: OK 2024-06-28 10:26:23.0142196 [I:onnxruntime:, graph_transformer.cc:15 onnxruntime::GraphTransformer::Apply] GraphTransformer MatmulTransposeFusion modified: 0 with status: OK 2024-06-28 10:26:23.0147785 [I:onnxruntime:, graph_transformer.cc:15 onnxruntime::GraphTransformer::Apply] GraphTransformer BiasGeluFusion modified: 0 with status: OK 2024-06-28 10:26:23.0153168 [I:onnxruntime:, graph_transformer.cc:15 onnxruntime::GraphTransformer::Apply] GraphTransformer SkipLayerNormFusion modified: 1 with status: OK 2024-06-28 10:26:23.0158966 [I:onnxruntime:, graph_transformer.cc:15 onnxruntime::GraphTransformer::Apply] GraphTransformer FastGeluFusion modified: 0 with status: OK 2024-06-28 10:26:23.0164630 [I:onnxruntime:, graph_transformer.cc:15 onnxruntime::GraphTransformer::Apply] GraphTransformer QuickGeluFusion modified: 0 with status: OK 2024-06-28 10:26:23.0170340 [I:onnxruntime:, graph_transformer.cc:15 onnxruntime::GraphTransformer::Apply] GraphTransformer BiasSoftmaxFusion modified: 0 with status: OK 2024-06-28 10:26:23.0184191 [I:onnxruntime:, 
graph_transformer.cc:15 onnxruntime::GraphTransformer::Apply] GraphTransformer BiasDropoutFusion modified: 0 with status: OK 2024-06-28 10:26:23.0191633 [I:onnxruntime:, graph_transformer.cc:15 onnxruntime::GraphTransformer::Apply] GraphTransformer MatMulScaleFusion modified: 0 with status: OK 2024-06-28 10:26:23.0197059 [I:onnxruntime:, graph_transformer.cc:15 onnxruntime::GraphTransformer::Apply] GraphTransformer MatMulActivationFusion modified: 0 with status: OK 2024-06-28 10:26:23.0211674 [I:onnxruntime:, graph_transformer.cc:15 onnxruntime::GraphTransformer::Apply] GraphTransformer QDQFinalCleanupTransformer modified: 0 with status: OK 2024-06-28 10:26:23.0216896 [I:onnxruntime:, graph_transformer.cc:15 onnxruntime::GraphTransformer::Apply] GraphTransformer DmlOperatorFusionTransformer modified: 0 with status: OK 2024-06-28 10:26:23.0222636 [I:onnxruntime:, graph_transformer.cc:15 onnxruntime::GraphTransformer::Apply] GraphTransformer QDQS8ToU8Transformer modified: 0 with status: OK 2024-06-28 10:26:23.0228849 [I:onnxruntime:, graph_transformer.cc:15 onnxruntime::GraphTransformer::Apply] GraphTransformer QDQSelectorActionTransformer modified: 0 with status: OK 2024-06-28 10:26:23.0234647 [I:onnxruntime:, graph_transformer.cc:15 onnxruntime::GraphTransformer::Apply] GraphTransformer GemmActivationFusion modified: 0 with status: OK 2024-06-28 10:26:23.0240480 [I:onnxruntime:, graph_transformer.cc:15 onnxruntime::GraphTransformer::Apply] GraphTransformer MatMulIntegerToFloatFusion modified: 0 with status: OK 2024-06-28 10:26:23.0246413 [I:onnxruntime:, graph_transformer.cc:15 onnxruntime::GraphTransformer::Apply] GraphTransformer DynamicQuantizeMatMulFusion modified: 0 with status: OK 2024-06-28 10:26:23.0252488 [I:onnxruntime:, graph_transformer.cc:15 onnxruntime::GraphTransformer::Apply] GraphTransformer ConvActivationFusion modified: 0 with status: OK 2024-06-28 10:26:23.0258403 [I:onnxruntime:, graph_transformer.cc:15 onnxruntime::GraphTransformer::Apply] 
GraphTransformer GeluFusion modified: 0 with status: OK 2024-06-28 10:26:23.0271596 [I:onnxruntime:, graph_transformer.cc:15 onnxruntime::GraphTransformer::Apply] GraphTransformer LayerNormFusion modified: 0 with status: OK 2024-06-28 10:26:23.0277228 [I:onnxruntime:, graph_transformer.cc:15 onnxruntime::GraphTransformer::Apply] GraphTransformer SimplifiedLayerNormFusion modified: 0 with status: OK 2024-06-28 10:26:23.0282916 [I:onnxruntime:, graph_transformer.cc:15 onnxruntime::GraphTransformer::Apply] GraphTransformer AttentionFusion modified: 0 with status: OK 2024-06-28 10:26:23.0288352 [I:onnxruntime:, graph_transformer.cc:15 onnxruntime::GraphTransformer::Apply] GraphTransformer EmbedLayerNormFusion modified: 0 with status: OK 2024-06-28 10:26:23.0294875 [I:onnxruntime:, graph_transformer.cc:15 onnxruntime::GraphTransformer::Apply] GraphTransformer GatherSliceToSplitFusion modified: 0 with status: OK 2024-06-28 10:26:23.0300556 [I:onnxruntime:, graph_transformer.cc:15 onnxruntime::GraphTransformer::Apply] GraphTransformer GatherToSliceFusion modified: 0 with status: OK 2024-06-28 10:26:23.0306500 [I:onnxruntime:, graph_transformer.cc:15 onnxruntime::GraphTransformer::Apply] GraphTransformer MatmulTransposeFusion modified: 0 with status: OK 2024-06-28 10:26:23.0312220 [I:onnxruntime:, graph_transformer.cc:15 onnxruntime::GraphTransformer::Apply] GraphTransformer BiasGeluFusion modified: 0 with status: OK 2024-06-28 10:26:23.0317887 [I:onnxruntime:, graph_transformer.cc:15 onnxruntime::GraphTransformer::Apply] GraphTransformer SkipLayerNormFusion modified: 1 with status: OK 2024-06-28 10:26:23.0324851 [I:onnxruntime:, graph_transformer.cc:15 onnxruntime::GraphTransformer::Apply] GraphTransformer FastGeluFusion modified: 0 with status: OK 2024-06-28 10:26:23.0330602 [I:onnxruntime:, graph_transformer.cc:15 onnxruntime::GraphTransformer::Apply] GraphTransformer QuickGeluFusion modified: 0 with status: OK 2024-06-28 10:26:23.0336002 [I:onnxruntime:, 
graph_transformer.cc:15 onnxruntime::GraphTransformer::Apply] GraphTransformer BiasSoftmaxFusion modified: 0 with status: OK 2024-06-28 10:26:23.0341495 [I:onnxruntime:, graph_transformer.cc:15 onnxruntime::GraphTransformer::Apply] GraphTransformer BiasDropoutFusion modified: 0 with status: OK 2024-06-28 10:26:23.0347319 [I:onnxruntime:, graph_transformer.cc:15 onnxruntime::GraphTransformer::Apply] GraphTransformer MatMulScaleFusion modified: 0 with status: OK 2024-06-28 10:26:23.0352958 [I:onnxruntime:, graph_transformer.cc:15 onnxruntime::GraphTransformer::Apply] GraphTransformer MatMulActivationFusion modified: 0 with status: OK 2024-06-28 10:26:23.0358967 [I:onnxruntime:, graph_transformer.cc:15 onnxruntime::GraphTransformer::Apply] GraphTransformer QDQFinalCleanupTransformer modified: 0 with status: OK 2024-06-28 10:26:23.0365189 [I:onnxruntime:, graph_transformer.cc:15 onnxruntime::GraphTransformer::Apply] GraphTransformer DmlOperatorFusionTransformer modified: 0 with status: OK 2024-06-28 10:26:23.0375731 [I:onnxruntime:, graph_transformer.cc:15 onnxruntime::GraphTransformer::Apply] GraphTransformer QDQS8ToU8Transformer modified: 0 with status: OK 2024-06-28 10:26:23.0383233 [I:onnxruntime:, graph_transformer.cc:15 onnxruntime::GraphTransformer::Apply] GraphTransformer QDQSelectorActionTransformer modified: 0 with status: OK 2024-06-28 10:26:23.0390068 [I:onnxruntime:, graph_transformer.cc:15 onnxruntime::GraphTransformer::Apply] GraphTransformer GemmActivationFusion modified: 0 with status: OK 2024-06-28 10:26:23.0395880 [I:onnxruntime:, graph_transformer.cc:15 onnxruntime::GraphTransformer::Apply] GraphTransformer MatMulIntegerToFloatFusion modified: 0 with status: OK 2024-06-28 10:26:23.0402199 [I:onnxruntime:, graph_transformer.cc:15 onnxruntime::GraphTransformer::Apply] GraphTransformer DynamicQuantizeMatMulFusion modified: 0 with status: OK 2024-06-28 10:26:23.0407904 [I:onnxruntime:, graph_transformer.cc:15 onnxruntime::GraphTransformer::Apply] 
GraphTransformer ConvActivationFusion modified: 0 with status: OK 2024-06-28 10:26:23.0413552 [I:onnxruntime:, graph_transformer.cc:15 onnxruntime::GraphTransformer::Apply] GraphTransformer GeluFusion modified: 0 with status: OK 2024-06-28 10:26:23.0418847 [I:onnxruntime:, graph_transformer.cc:15 onnxruntime::GraphTransformer::Apply] GraphTransformer LayerNormFusion modified: 0 with status: OK 2024-06-28 10:26:23.0424622 [I:onnxruntime:, graph_transformer.cc:15 onnxruntime::GraphTransformer::Apply] GraphTransformer SimplifiedLayerNormFusion modified: 0 with status: OK 2024-06-28 10:26:23.0430536 [I:onnxruntime:, graph_transformer.cc:15 onnxruntime::GraphTransformer::Apply] GraphTransformer AttentionFusion modified: 0 with status: OK 2024-06-28 10:26:23.0436007 [I:onnxruntime:, graph_transformer.cc:15 onnxruntime::GraphTransformer::Apply] GraphTransformer EmbedLayerNormFusion modified: 0 with status: OK 2024-06-28 10:26:23.0449118 [I:onnxruntime:, graph_transformer.cc:15 onnxruntime::GraphTransformer::Apply] GraphTransformer GatherSliceToSplitFusion modified: 0 with status: OK 2024-06-28 10:26:23.0456095 [I:onnxruntime:, graph_transformer.cc:15 onnxruntime::GraphTransformer::Apply] GraphTransformer GatherToSliceFusion modified: 0 with status: OK 2024-06-28 10:26:23.0462811 [I:onnxruntime:, graph_transformer.cc:15 onnxruntime::GraphTransformer::Apply] GraphTransformer MatmulTransposeFusion modified: 0 with status: OK 2024-06-28 10:26:23.0468769 [I:onnxruntime:, graph_transformer.cc:15 onnxruntime::GraphTransformer::Apply] GraphTransformer BiasGeluFusion modified: 0 with status: OK 2024-06-28 10:26:23.0474343 [I:onnxruntime:, graph_transformer.cc:15 onnxruntime::GraphTransformer::Apply] GraphTransformer SkipLayerNormFusion modified: 1 with status: OK 2024-06-28 10:26:23.0480069 [I:onnxruntime:, graph_transformer.cc:15 onnxruntime::GraphTransformer::Apply] GraphTransformer FastGeluFusion modified: 0 with status: OK 2024-06-28 10:26:23.0485487 [I:onnxruntime:, 
graph_transformer.cc:15 onnxruntime::GraphTransformer::Apply] GraphTransformer QuickGeluFusion modified: 0 with status: OK 2024-06-28 10:26:23.0490838 [I:onnxruntime:, graph_transformer.cc:15 onnxruntime::GraphTransformer::Apply] GraphTransformer BiasSoftmaxFusion modified: 0 with status: OK 2024-06-28 10:26:23.0496232 [I:onnxruntime:, graph_transformer.cc:15 onnxruntime::GraphTransformer::Apply] GraphTransformer BiasDropoutFusion modified: 0 with status: OK 2024-06-28 10:26:23.0503265 [I:onnxruntime:, graph_transformer.cc:15 onnxruntime::GraphTransformer::Apply] GraphTransformer MatMulScaleFusion modified: 0 with status: OK 2024-06-28 10:26:23.0510625 [I:onnxruntime:, graph_transformer.cc:15 onnxruntime::GraphTransformer::Apply] GraphTransformer MatMulActivationFusion modified: 0 with status: OK 2024-06-28 10:26:23.0517020 [I:onnxruntime:, graph_transformer.cc:15 onnxruntime::GraphTransformer::Apply] GraphTransformer QDQFinalCleanupTransformer modified: 0 with status: OK 2024-06-28 10:26:23.0522453 [I:onnxruntime:, graph_transformer.cc:15 onnxruntime::GraphTransformer::Apply] GraphTransformer DmlOperatorFusionTransformer modified: 0 with status: OK 2024-06-28 10:26:23.0529030 [I:onnxruntime:, graph_transformer.cc:15 onnxruntime::GraphTransformer::Apply] GraphTransformer QDQS8ToU8Transformer modified: 0 with status: OK 2024-06-28 10:26:23.0535578 [I:onnxruntime:, graph_transformer.cc:15 onnxruntime::GraphTransformer::Apply] GraphTransformer QDQSelectorActionTransformer modified: 0 with status: OK 2024-06-28 10:26:23.0541365 [I:onnxruntime:, graph_transformer.cc:15 onnxruntime::GraphTransformer::Apply] GraphTransformer GemmActivationFusion modified: 0 with status: OK 2024-06-28 10:26:23.0547247 [I:onnxruntime:, graph_transformer.cc:15 onnxruntime::GraphTransformer::Apply] GraphTransformer MatMulIntegerToFloatFusion modified: 0 with status: OK 2024-06-28 10:26:23.0567050 [I:onnxruntime:, graph_transformer.cc:15 onnxruntime::GraphTransformer::Apply] GraphTransformer 
DynamicQuantizeMatMulFusion modified: 0 with status: OK 2024-06-28 10:26:23.0572742 [I:onnxruntime:, graph_transformer.cc:15 onnxruntime::GraphTransformer::Apply] GraphTransformer ConvActivationFusion modified: 0 with status: OK 2024-06-28 10:26:23.0578402 [I:onnxruntime:, graph_transformer.cc:15 onnxruntime::GraphTransformer::Apply] GraphTransformer GeluFusion modified: 0 with status: OK 2024-06-28 10:26:23.0583593 [I:onnxruntime:, graph_transformer.cc:15 onnxruntime::GraphTransformer::Apply] GraphTransformer LayerNormFusion modified: 0 with status: OK 2024-06-28 10:26:23.0589114 [I:onnxruntime:, graph_transformer.cc:15 onnxruntime::GraphTransformer::Apply] GraphTransformer SimplifiedLayerNormFusion modified: 0 with status: OK 2024-06-28 10:26:23.0595948 [I:onnxruntime:, graph_transformer.cc:15 onnxruntime::GraphTransformer::Apply] GraphTransformer AttentionFusion modified: 0 with status: OK 2024-06-28 10:26:23.0603444 [I:onnxruntime:, graph_transformer.cc:15 onnxruntime::GraphTransformer::Apply] GraphTransformer EmbedLayerNormFusion modified: 0 with status: OK 2024-06-28 10:26:23.0621028 [I:onnxruntime:, graph_transformer.cc:15 onnxruntime::GraphTransformer::Apply] GraphTransformer GatherSliceToSplitFusion modified: 0 with status: OK 2024-06-28 10:26:23.0626998 [I:onnxruntime:, graph_transformer.cc:15 onnxruntime::GraphTransformer::Apply] GraphTransformer GatherToSliceFusion modified: 0 with status: OK 2024-06-28 10:26:23.0633681 [I:onnxruntime:, graph_transformer.cc:15 onnxruntime::GraphTransformer::Apply] GraphTransformer MatmulTransposeFusion modified: 0 with status: OK 2024-06-28 10:26:23.0639376 [I:onnxruntime:, graph_transformer.cc:15 onnxruntime::GraphTransformer::Apply] GraphTransformer BiasGeluFusion modified: 0 with status: OK 2024-06-28 10:26:23.0644874 [I:onnxruntime:, graph_transformer.cc:15 onnxruntime::GraphTransformer::Apply] GraphTransformer SkipLayerNormFusion modified: 1 with status: OK 2024-06-28 10:26:23.0650864 [I:onnxruntime:, 
graph_transformer.cc:15 onnxruntime::GraphTransformer::Apply] GraphTransformer FastGeluFusion modified: 0 with status: OK 2024-06-28 10:26:23.0656479 [I:onnxruntime:, graph_transformer.cc:15 onnxruntime::GraphTransformer::Apply] GraphTransformer QuickGeluFusion modified: 0 with status: OK 2024-06-28 10:26:23.0662037 [I:onnxruntime:, graph_transformer.cc:15 onnxruntime::GraphTransformer::Apply] GraphTransformer BiasSoftmaxFusion modified: 0 with status: OK 2024-06-28 10:26:23.0669363 [I:onnxruntime:, graph_transformer.cc:15 onnxruntime::GraphTransformer::Apply] GraphTransformer BiasDropoutFusion modified: 0 with status: OK 2024-06-28 10:26:23.0675895 [I:onnxruntime:, graph_transformer.cc:15 onnxruntime::GraphTransformer::Apply] GraphTransformer MatMulScaleFusion modified: 0 with status: OK 2024-06-28 10:26:23.0681356 [I:onnxruntime:, graph_transformer.cc:15 onnxruntime::GraphTransformer::Apply] GraphTransformer MatMulActivationFusion modified: 0 with status: OK 2024-06-28 10:26:23.0687112 [I:onnxruntime:, graph_transformer.cc:15 onnxruntime::GraphTransformer::Apply] GraphTransformer QDQFinalCleanupTransformer modified: 0 with status: OK 2024-06-28 10:26:23.0692267 [I:onnxruntime:, graph_transformer.cc:15 onnxruntime::GraphTransformer::Apply] GraphTransformer DmlOperatorFusionTransformer modified: 0 with status: OK 2024-06-28 10:26:23.0698059 [I:onnxruntime:, graph_transformer.cc:15 onnxruntime::GraphTransformer::Apply] GraphTransformer QDQS8ToU8Transformer modified: 0 with status: OK 2024-06-28 10:26:23.0704461 [I:onnxruntime:, graph_transformer.cc:15 onnxruntime::GraphTransformer::Apply] GraphTransformer QDQSelectorActionTransformer modified: 0 with status: OK 2024-06-28 10:26:23.0710245 [I:onnxruntime:, graph_transformer.cc:15 onnxruntime::GraphTransformer::Apply] GraphTransformer GemmActivationFusion modified: 0 with status: OK 2024-06-28 10:26:23.0715863 [I:onnxruntime:, graph_transformer.cc:15 onnxruntime::GraphTransformer::Apply] GraphTransformer 
MatMulIntegerToFloatFusion modified: 0 with status: OK 2024-06-28 10:26:23.0721496 [I:onnxruntime:, graph_transformer.cc:15 onnxruntime::GraphTransformer::Apply] GraphTransformer DynamicQuantizeMatMulFusion modified: 0 with status: OK 2024-06-28 10:26:23.0727319 [I:onnxruntime:, graph_transformer.cc:15 onnxruntime::GraphTransformer::Apply] GraphTransformer ConvActivationFusion modified: 0 with status: OK 2024-06-28 10:26:23.0733404 [I:onnxruntime:, graph_transformer.cc:15 onnxruntime::GraphTransformer::Apply] GraphTransformer GeluFusion modified: 0 with status: OK 2024-06-28 10:26:23.0742012 [I:onnxruntime:, graph_transformer.cc:15 onnxruntime::GraphTransformer::Apply] GraphTransformer LayerNormFusion modified: 0 with status: OK 2024-06-28 10:26:23.0747635 [I:onnxruntime:, graph_transformer.cc:15 onnxruntime::GraphTransformer::Apply] GraphTransformer SimplifiedLayerNormFusion modified: 0 with status: OK 2024-06-28 10:26:23.0753276 [I:onnxruntime:, graph_transformer.cc:15 onnxruntime::GraphTransformer::Apply] GraphTransformer AttentionFusion modified: 0 with status: OK 2024-06-28 10:26:23.0759238 [I:onnxruntime:, graph_transformer.cc:15 onnxruntime::GraphTransformer::Apply] GraphTransformer EmbedLayerNormFusion modified: 0 with status: OK 2024-06-28 10:26:23.0766919 [I:onnxruntime:, graph_transformer.cc:15 onnxruntime::GraphTransformer::Apply] GraphTransformer GatherSliceToSplitFusion modified: 0 with status: OK 2024-06-28 10:26:23.0774063 [I:onnxruntime:, graph_transformer.cc:15 onnxruntime::GraphTransformer::Apply] GraphTransformer GatherToSliceFusion modified: 0 with status: OK 2024-06-28 10:26:23.0780385 [I:onnxruntime:, graph_transformer.cc:15 onnxruntime::GraphTransformer::Apply] GraphTransformer MatmulTransposeFusion modified: 0 with status: OK 2024-06-28 10:26:23.0786017 [I:onnxruntime:, graph_transformer.cc:15 onnxruntime::GraphTransformer::Apply] GraphTransformer BiasGeluFusion modified: 0 with status: OK 2024-06-28 10:26:23.0792527 [I:onnxruntime:, 
graph_transformer.cc:15 onnxruntime::GraphTransformer::Apply] GraphTransformer SkipLayerNormFusion modified: 1 with status: OK 2024-06-28 10:26:23.0800181 [I:onnxruntime:, graph_transformer.cc:15 onnxruntime::GraphTransformer::Apply] GraphTransformer FastGeluFusion modified: 0 with status: OK 2024-06-28 10:26:23.0807099 [I:onnxruntime:, graph_transformer.cc:15 onnxruntime::GraphTransformer::Apply] GraphTransformer QuickGeluFusion modified: 0 with status: OK 2024-06-28 10:26:23.0812849 [I:onnxruntime:, graph_transformer.cc:15 onnxruntime::GraphTransformer::Apply] GraphTransformer BiasSoftmaxFusion modified: 0 with status: OK 2024-06-28 10:26:23.0818273 [I:onnxruntime:, graph_transformer.cc:15 onnxruntime::GraphTransformer::Apply] GraphTransformer BiasDropoutFusion modified: 0 with status: OK 2024-06-28 10:26:23.0824852 [I:onnxruntime:, graph_transformer.cc:15 onnxruntime::GraphTransformer::Apply] GraphTransformer MatMulScaleFusion modified: 0 with status: OK 2024-06-28 10:26:23.0831776 [I:onnxruntime:, graph_transformer.cc:15 onnxruntime::GraphTransformer::Apply] GraphTransformer MatMulActivationFusion modified: 0 with status: OK 2024-06-28 10:26:23.0838175 [I:onnxruntime:, graph_transformer.cc:15 onnxruntime::GraphTransformer::Apply] GraphTransformer QDQFinalCleanupTransformer modified: 0 with status: OK 2024-06-28 10:26:23.0843481 [I:onnxruntime:, graph_transformer.cc:15 onnxruntime::GraphTransformer::Apply] GraphTransformer DmlOperatorFusionTransformer modified: 0 with status: OK 2024-06-28 10:26:23.0849438 [I:onnxruntime:, graph_transformer.cc:15 onnxruntime::GraphTransformer::Apply] GraphTransformer QDQS8ToU8Transformer modified: 0 with status: OK 2024-06-28 10:26:23.0855723 [I:onnxruntime:, graph_transformer.cc:15 onnxruntime::GraphTransformer::Apply] GraphTransformer QDQSelectorActionTransformer modified: 0 with status: OK 2024-06-28 10:26:23.0862536 [I:onnxruntime:, graph_transformer.cc:15 onnxruntime::GraphTransformer::Apply] GraphTransformer 
GemmActivationFusion modified: 0 with status: OK 2024-06-28 10:26:23.0870219 [I:onnxruntime:, graph_transformer.cc:15 onnxruntime::GraphTransformer::Apply] GraphTransformer MatMulIntegerToFloatFusion modified: 0 with status: OK 2024-06-28 10:26:23.0876350 [I:onnxruntime:, graph_transformer.cc:15 onnxruntime::GraphTransformer::Apply] GraphTransformer DynamicQuantizeMatMulFusion modified: 0 with status: OK 2024-06-28 10:26:23.0882401 [I:onnxruntime:, graph_transformer.cc:15 onnxruntime::GraphTransformer::Apply] GraphTransformer ConvActivationFusion modified: 0 with status: OK 2024-06-28 10:26:23.0887945 [I:onnxruntime:, graph_transformer.cc:15 onnxruntime::GraphTransformer::Apply] GraphTransformer GeluFusion modified: 0 with status: OK 2024-06-28 10:26:23.0893208 [I:onnxruntime:, graph_transformer.cc:15 onnxruntime::GraphTransformer::Apply] GraphTransformer LayerNormFusion modified: 0 with status: OK 2024-06-28 10:26:23.0898479 [I:onnxruntime:, graph_transformer.cc:15 onnxruntime::GraphTransformer::Apply] GraphTransformer SimplifiedLayerNormFusion modified: 0 with status: OK 2024-06-28 10:26:23.0903907 [I:onnxruntime:, graph_transformer.cc:15 onnxruntime::GraphTransformer::Apply] GraphTransformer AttentionFusion modified: 0 with status: OK 2024-06-28 10:26:23.0909246 [I:onnxruntime:, graph_transformer.cc:15 onnxruntime::GraphTransformer::Apply] GraphTransformer EmbedLayerNormFusion modified: 0 with status: OK 2024-06-28 10:26:23.0916341 [I:onnxruntime:, graph_transformer.cc:15 onnxruntime::GraphTransformer::Apply] GraphTransformer GatherSliceToSplitFusion modified: 0 with status: OK 2024-06-28 10:26:23.0922887 [I:onnxruntime:, graph_transformer.cc:15 onnxruntime::GraphTransformer::Apply] GraphTransformer GatherToSliceFusion modified: 0 with status: OK 2024-06-28 10:26:23.0930768 [I:onnxruntime:, graph_transformer.cc:15 onnxruntime::GraphTransformer::Apply] GraphTransformer MatmulTransposeFusion modified: 0 with status: OK 2024-06-28 10:26:23.0938376 [I:onnxruntime:, 
graph_transformer.cc:15 onnxruntime::GraphTransformer::Apply] GraphTransformer BiasGeluFusion modified: 0 with status: OK 2024-06-28 10:26:23.0944687 [I:onnxruntime:, graph_transformer.cc:15 onnxruntime::GraphTransformer::Apply] GraphTransformer SkipLayerNormFusion modified: 1 with status: OK 2024-06-28 10:26:23.0950993 [I:onnxruntime:, graph_transformer.cc:15 onnxruntime::GraphTransformer::Apply] GraphTransformer FastGeluFusion modified: 0 with status: OK 2024-06-28 10:26:23.0956647 [I:onnxruntime:, graph_transformer.cc:15 onnxruntime::GraphTransformer::Apply] GraphTransformer QuickGeluFusion modified: 0 with status: OK 2024-06-28 10:26:23.0962125 [I:onnxruntime:, graph_transformer.cc:15 onnxruntime::GraphTransformer::Apply] GraphTransformer BiasSoftmaxFusion modified: 0 with status: OK 2024-06-28 10:26:23.0967457 [I:onnxruntime:, graph_transformer.cc:15 onnxruntime::GraphTransformer::Apply] GraphTransformer BiasDropoutFusion modified: 0 with status: OK 2024-06-28 10:26:23.0973303 [I:onnxruntime:, graph_transformer.cc:15 onnxruntime::GraphTransformer::Apply] GraphTransformer MatMulScaleFusion modified: 0 with status: OK 2024-06-28 10:26:23.0978722 [I:onnxruntime:, graph_transformer.cc:15 onnxruntime::GraphTransformer::Apply] GraphTransformer MatMulActivationFusion modified: 0 with status: OK 2024-06-28 10:26:23.0984799 [I:onnxruntime:, graph_transformer.cc:15 onnxruntime::GraphTransformer::Apply] GraphTransformer QDQFinalCleanupTransformer modified: 0 with status: OK 2024-06-28 10:26:23.0989915 [I:onnxruntime:, graph_transformer.cc:15 onnxruntime::GraphTransformer::Apply] GraphTransformer DmlOperatorFusionTransformer modified: 0 with status: OK 2024-06-28 10:26:23.0995711 [I:onnxruntime:, graph_transformer.cc:15 onnxruntime::GraphTransformer::Apply] GraphTransformer NchwcTransformer modified: 0 with status: OK 2024-06-28 10:26:23.1002272 [I:onnxruntime:, graph_transformer.cc:15 onnxruntime::GraphTransformer::Apply] GraphTransformer NhwcTransformer modified: 0 with 
status: OK 2024-06-28 10:26:23.1008635 [I:onnxruntime:, graph_transformer.cc:15 onnxruntime::GraphTransformer::Apply] GraphTransformer ConvAddActivationFusion modified: 0 with status: OK 2024-06-28 10:26:23.1029470 [I:onnxruntime:, graph_transformer.cc:15 onnxruntime::GraphTransformer::Apply] GraphTransformer DmlGraphFusionTransformer modified: 0 with status: OK 2024-06-28 10:26:23.1036122 [I:onnxruntime:, graph_transformer.cc:15 onnxruntime::GraphTransformer::Apply] GraphTransformer RemoveDuplicateCastTransformer modified: 0 with status: OK 2024-06-28 10:26:23.1041458 [I:onnxruntime:, graph_transformer.cc:15 onnxruntime::GraphTransformer::Apply] GraphTransformer CastFloat16Transformer modified: 0 with status: OK 2024-06-28 10:26:23.1049910 [I:onnxruntime:, transformer_memcpy.cc:329 onnxruntime::TransformerMemcpyImpl::AddCopyNode] Add MemcpyFromHost after floatPCM for DmlExecutionProvider 2024-06-28 10:26:23.1055232 [I:onnxruntime:, transformer_memcpy.cc:329 onnxruntime::TransformerMemcpyImpl::AddCopyNode] Add MemcpyFromHost after sequences for DmlExecutionProvider 2024-06-28 10:26:23.1060435 [I:onnxruntime:, transformer_memcpy.cc:329 onnxruntime::TransformerMemcpyImpl::AddCopyNode] Add MemcpyToHost before generated_ids for DmlExecutionProvider 2024-06-28 10:26:23.1065658 [I:onnxruntime:, transformer_memcpy.cc:329 onnxruntime::TransformerMemcpyImpl::AddCopyNode] Add MemcpyToHost before log_mel for DmlExecutionProvider 2024-06-28 10:26:23.1082261 [I:onnxruntime:, transformer_memcpy.cc:329 onnxruntime::TransformerMemcpyImpl::AddCopyNode] Add MemcpyFromHost after e/whisper_decoder_init/model/decoder/layers.0/self_attn/Reshape_9_output_0 for DmlExecutionProvider 2024-06-28 10:26:23.1089912 [I:onnxruntime:, transformer_memcpy.cc:329 onnxruntime::TransformerMemcpyImpl::AddCopyNode] Add MemcpyFromHost after e/whisper_decoder_init/model/decoder/layers.1/self_attn/Reshape_9_output_0 for DmlExecutionProvider 2024-06-28 10:26:23.1097970 [I:onnxruntime:,
transformer_memcpy.cc:329 onnxruntime::TransformerMemcpyImpl::AddCopyNode] Add MemcpyFromHost after e/whisper_decoder_init/model/decoder/layers.2/self_attn/Reshape_9_output_0 for DmlExecutionProvider 2024-06-28 10:26:23.1105500 [I:onnxruntime:, transformer_memcpy.cc:329 onnxruntime::TransformerMemcpyImpl::AddCopyNode] Add MemcpyFromHost after e/whisper_decoder_init/model/decoder/layers.3/self_attn/Reshape_9_output_0 for DmlExecutionProvider 2024-06-28 10:26:23.1112484 [I:onnxruntime:, transformer_memcpy.cc:329 onnxruntime::TransformerMemcpyImpl::AddCopyNode] Add MemcpyFromHost after e_present_self_0 for DmlExecutionProvider 2024-06-28 10:26:23.1117801 [I:onnxruntime:, transformer_memcpy.cc:329 onnxruntime::TransformerMemcpyImpl::AddCopyNode] Add MemcpyFromHost after e_present_self_1 for DmlExecutionProvider 2024-06-28 10:26:23.1123290 [I:onnxruntime:, transformer_memcpy.cc:329 onnxruntime::TransformerMemcpyImpl::AddCopyNode] Add MemcpyFromHost after e_present_self_2 for DmlExecutionProvider 2024-06-28 10:26:23.1128543 [I:onnxruntime:, transformer_memcpy.cc:329 onnxruntime::TransformerMemcpyImpl::AddCopyNode] Add MemcpyFromHost after e_present_self_3 for DmlExecutionProvider 2024-06-28 10:26:23.1134514 [I:onnxruntime:, transformer_memcpy.cc:329 onnxruntime::TransformerMemcpyImpl::AddCopyNode] Add MemcpyToHost before e/whisper_decoder_init/model/decoder/Expand_output_0_mask for DmlExecutionProvider 2024-06-28 10:26:23.1141822 [I:onnxruntime:, transformer_memcpy.cc:329 onnxruntime::TransformerMemcpyImpl::AddCopyNode] Add MemcpyToHost before e/whisper_decoder_init/model/decoder/layers.0/self_attn_layer_norm/LayerNormalization_output_0 for DmlExecutionProvider 2024-06-28 10:26:23.1150476 [I:onnxruntime:, transformer_memcpy.cc:329 onnxruntime::TransformerMemcpyImpl::AddCopyNode] Add MemcpyToHost before e/whisper_decoder_init/model/decoder/layers.1/self_attn_layer_norm/LayerNormalization_output_0 for DmlExecutionProvider 2024-06-28 10:26:23.1159990 [I:onnxruntime:,
transformer_memcpy.cc:329 onnxruntime::TransformerMemcpyImpl::AddCopyNode] Add MemcpyToHost before e/whisper_decoder_init/model/decoder/layers.2/self_attn_layer_norm/LayerNormalization_output_0 for DmlExecutionProvider 2024-06-28 10:26:23.1167947 [I:onnxruntime:, transformer_memcpy.cc:329 onnxruntime::TransformerMemcpyImpl::AddCopyNode] Add MemcpyToHost before e/whisper_decoder_init/model/decoder/layers.3/self_attn_layer_norm/LayerNormalization_output_0 for DmlExecutionProvider 2024-06-28 10:26:23.1184225 [I:onnxruntime:, graph_transformer.cc:15 onnxruntime::GraphTransformer::Apply] GraphTransformer MemcpyTransformer modified: 1 with status: OK 2024-06-28 10:26:23.1218394 [V:onnxruntime:, session_state.cc:1146 onnxruntime::VerifyEachNodeIsAssignedToAnEp] Node placements 2024-06-28 10:26:23.1222587 [V:onnxruntime:, session_state.cc:1152 onnxruntime::VerifyEachNodeIsAssignedToAnEp] Node(s) placed on [CPUExecutionProvider]. Number of nodes: 33 2024-06-28 10:26:23.1228194 [V:onnxruntime:, session_state.cc:1154 onnxruntime::VerifyEachNodeIsAssignedToAnEp] AudioDecoder (AudioDecoder_1) 2024-06-28 10:26:23.1232436 [V:onnxruntime:, session_state.cc:1154 onnxruntime::VerifyEachNodeIsAssignedToAnEp] Gather (/Gather) 2024-06-28 10:26:23.1236086 [V:onnxruntime:, session_state.cc:1154 onnxruntime::VerifyEachNodeIsAssignedToAnEp] Gather (/Gather_1) 2024-06-28 10:26:23.1239815 [V:onnxruntime:, session_state.cc:1154 onnxruntime::VerifyEachNodeIsAssignedToAnEp] Gather (/Gather_2) 2024-06-28 10:26:23.1243639 [V:onnxruntime:, session_state.cc:1154 onnxruntime::VerifyEachNodeIsAssignedToAnEp] Sub (/Sub_1) 2024-06-28 10:26:23.1247269 [V:onnxruntime:, session_state.cc:1154 onnxruntime::VerifyEachNodeIsAssignedToAnEp] Unsqueeze (/Unsqueeze) 2024-06-28 10:26:23.1251361 [V:onnxruntime:, session_state.cc:1154 onnxruntime::VerifyEachNodeIsAssignedToAnEp] Unsqueeze (/Unsqueeze_1) 2024-06-28 10:26:23.1256230 [V:onnxruntime:, session_state.cc:1154 onnxruntime::VerifyEachNodeIsAssignedToAnEp]
Unsqueeze (/Unsqueeze_2) 2024-06-28 10:26:23.1260933 [V:onnxruntime:, session_state.cc:1154 onnxruntime::VerifyEachNodeIsAssignedToAnEp] Concat (/Concat) 2024-06-28 10:26:23.1267203 [V:onnxruntime:, session_state.cc:1154 onnxruntime::VerifyEachNodeIsAssignedToAnEp]
WhisperBeamSearch (BeamSearch_node) 2024-06-28 10:26:23.1271542 [V:onnxruntime:, session_state.cc:1154 onnxruntime::VerifyEachNodeIsAssignedToAnEp] Gather (/whisper_decoder_init/model/decoder/Gather_1) 2024-06-28 10:26:23.1277113 [V:onnxruntime:, session_state.cc:1154 onnxruntime::VerifyEachNodeIsAssignedToAnEp] Sub (/whisper_decoder_init/model/decoder/Sub) 2024-06-28 10:26:23.1283651 [V:onnxruntime:, session_state.cc:1154 onnxruntime::VerifyEachNodeIsAssignedToAnEp] Add (/whisper_decoder_init/model/decoder/Add_1) 2024-06-28 10:26:23.1290626 [V:onnxruntime:, session_state.cc:1154 onnxruntime::VerifyEachNodeIsAssignedToAnEp] Unsqueeze (/whisper_decoder_init/model/decoder/Unsqueeze_8) 2024-06-28 10:26:23.1296246 [V:onnxruntime:, session_state.cc:1154 onnxruntime::VerifyEachNodeIsAssignedToAnEp] Gather (/whisper_decoder_init/model/decoder/Gather) 2024-06-28 10:26:23.1301069 [V:onnxruntime:, session_state.cc:1154 onnxruntime::VerifyEachNodeIsAssignedToAnEp] Unsqueeze (/whisper_decoder_init/model/decoder/Unsqueeze_6) 2024-06-28 10:26:23.1305930 [V:onnxruntime:, session_state.cc:1154 onnxruntime::VerifyEachNodeIsAssignedToAnEp] Concat (/whisper_decoder_init/model/decoder/Concat_3) 2024-06-28 10:26:23.1310630 [V:onnxruntime:, session_state.cc:1154 onnxruntime::VerifyEachNodeIsAssignedToAnEp] Equal (/whisper_decoder_init/model/decoder/Equal) 2024-06-28 10:26:23.1315180 [V:onnxruntime:, session_state.cc:1154 onnxruntime::VerifyEachNodeIsAssignedToAnEp] Where (/whisper_decoder_init/model/decoder/Where_1) 2024-06-28 10:26:23.1319794 [V:onnxruntime:, session_state.cc:1154 onnxruntime::VerifyEachNodeIsAssignedToAnEp] Unsqueeze (/whisper_decoder_init/model/decoder/Unsqueeze_1) 2024-06-28 10:26:23.1324649 [V:onnxruntime:, session_state.cc:1154 onnxruntime::VerifyEachNodeIsAssignedToAnEp] Concat (/whisper_decoder_init/model/decoder/Concat_1) 2024-06-28 10:26:23.1329327 [V:onnxruntime:, session_state.cc:1154 onnxruntime::VerifyEachNodeIsAssignedToAnEp] Slice 
(/whisper_decoder_init/model/decoder/Slice) 2024-06-28 10:26:23.1333978 [V:onnxruntime:, session_state.cc:1154 onnxruntime::VerifyEachNodeIsAssignedToAnEp] Squeeze (/whisper_decoder_init/model/decoder/Squeeze) 2024-06-28 10:26:23.1338681 [V:onnxruntime:, session_state.cc:1154 onnxruntime::VerifyEachNodeIsAssignedToAnEp] Attention (Attention_0) 2024-06-28 10:26:23.1342582 [V:onnxruntime:, session_state.cc:1154 onnxruntime::VerifyEachNodeIsAssignedToAnEp] Attention (Attention_6) 2024-06-28 10:26:23.1346717 [V:onnxruntime:, session_state.cc:1154 onnxruntime::VerifyEachNodeIsAssignedToAnEp] Attention (Attention_8) 2024-06-28 10:26:23.1351320 [V:onnxruntime:, session_state.cc:1154 onnxruntime::VerifyEachNodeIsAssignedToAnEp] Attention (Attention_10) 2024-06-28 10:26:23.1355264 [V:onnxruntime:, session_state.cc:1154 onnxruntime::VerifyEachNodeIsAssignedToAnEp] Gather (/decoder/model/decoder/embed_positions/Gather) 2024-06-28 10:26:23.1359974 [V:onnxruntime:, session_state.cc:1154 onnxruntime::VerifyEachNodeIsAssignedToAnEp] Gather (/decoder/model/decoder/Gather_1) 2024-06-28 10:26:23.1364488 [V:onnxruntime:, session_state.cc:1154 onnxruntime::VerifyEachNodeIsAssignedToAnEp] Add (/decoder/model/decoder/embed_positions/Add) 2024-06-28 10:26:23.1369169 [V:onnxruntime:, session_state.cc:1154 onnxruntime::VerifyEachNodeIsAssignedToAnEp] Unsqueeze (/decoder/model/decoder/embed_positions/Unsqueeze_1) 2024-06-28 10:26:23.1374267 [V:onnxruntime:, session_state.cc:1154 onnxruntime::VerifyEachNodeIsAssignedToAnEp] Unsqueeze (/decoder/model/decoder/embed_positions/Unsqueeze) 2024-06-28 10:26:23.1379208 [V:onnxruntime:, session_state.cc:1154 onnxruntime::VerifyEachNodeIsAssignedToAnEp] BpeDecoder (BpeDecoder_3) 2024-06-28 10:26:23.1383234 [V:onnxruntime:, session_state.cc:1152 onnxruntime::VerifyEachNodeIsAssignedToAnEp] Node(s) placed on [DmlExecutionProvider]. 
Number of nodes: 220 2024-06-28 10:26:23.1388204 [V:onnxruntime:, session_state.cc:1154 onnxruntime::VerifyEachNodeIsAssignedToAnEp] Pad () 2024-06-28 10:26:23.1391728 [V:onnxruntime:, session_state.cc:1154 onnxruntime::VerifyEachNodeIsAssignedToAnEp] Unsqueeze (unsqueeze_1) 2024-06-28 10:26:23.1395603 [V:onnxruntime:, session_state.cc:1154 onnxruntime::VerifyEachNodeIsAssignedToAnEp] STFT (stft) 2024-06-28 10:26:23.1399130 [V:onnxruntime:, session_state.cc:1154 onnxruntime::VerifyEachNodeIsAssignedToAnEp] Transpose (transpose_1) 2024-06-28 10:26:23.1403126 [V:onnxruntime:, session_state.cc:1154 onnxruntime::VerifyEachNodeIsAssignedToAnEp] Slice (slice_1) 2024-06-28 10:26:23.1406828 [V:onnxruntime:, session_state.cc:1154 onnxruntime::VerifyEachNodeIsAssignedToAnEp] Gather (gather_4) 2024-06-28 10:26:23.1410683 [V:onnxruntime:, session_state.cc:1154 onnxruntime::VerifyEachNodeIsAssignedToAnEp] Gather (gather_5) 2024-06-28 10:26:23.1414653 [V:onnxruntime:, session_state.cc:1154 onnxruntime::VerifyEachNodeIsAssignedToAnEp] Mul (mul0) 2024-06-28 10:26:23.1418825 [V:onnxruntime:, session_state.cc:1154 onnxruntime::VerifyEachNodeIsAssignedToAnEp] Mul (mul1) 2024-06-28 10:26:23.1423977 [V:onnxruntime:, session_state.cc:1154 onnxruntime::VerifyEachNodeIsAssignedToAnEp] Add (add0) 2024-06-28 10:26:23.1428182 [V:onnxruntime:, session_state.cc:1154 onnxruntime::VerifyEachNodeIsAssignedToAnEp] Slice (/Slice) 2024-06-28 10:26:23.1433766 [V:onnxruntime:, session_state.cc:1154 onnxruntime::VerifyEachNodeIsAssignedToAnEp] MatMul (/MatMul) 2024-06-28 10:26:23.1438119 [V:onnxruntime:, session_state.cc:1154 onnxruntime::VerifyEachNodeIsAssignedToAnEp] Clip (/Clip) 2024-06-28 10:26:23.1442383 [V:onnxruntime:, session_state.cc:1154 onnxruntime::VerifyEachNodeIsAssignedToAnEp] Log (/Log) 2024-06-28 10:26:23.1447111 [V:onnxruntime:, session_state.cc:1154 onnxruntime::VerifyEachNodeIsAssignedToAnEp] Div (/Div) 2024-06-28 10:26:23.1450721 [V:onnxruntime:, session_state.cc:1154 
onnxruntime::VerifyEachNodeIsAssignedToAnEp] ReduceMax (/ReduceMax) 2024-06-28 10:26:23.1454571 [V:onnxruntime:, session_state.cc:1154 onnxruntime::VerifyEachNodeIsAssignedToAnEp] Sub (/Sub) 2024-06-28 10:26:23.1458073 [V:onnxruntime:, session_state.cc:1154 onnxruntime::VerifyEachNodeIsAssignedToAnEp] Max (/Max) 2024-06-28 10:26:23.1461637 [V:onnxruntime:, session_state.cc:1154 onnxruntime::VerifyEachNodeIsAssignedToAnEp] Shape (/Shape_2) 2024-06-28 10:26:23.1465290 [V:onnxruntime:, session_state.cc:1154 onnxruntime::VerifyEachNodeIsAssignedToAnEp]
ConstantOfShape (/ConstantOfShape) 2024-06-28 10:26:23.1469457 [V:onnxruntime:, session_state.cc:1154 onnxruntime::VerifyEachNodeIsAssignedToAnEp] Mul (/Mul) 2024-06-28 10:26:23.1473044 [V:onnxruntime:, session_state.cc:1154 onnxruntime::VerifyEachNodeIsAssignedToAnEp] Concat (/Concat_1) 2024-06-28 10:26:23.1476842 [V:onnxruntime:, session_state.cc:1154 onnxruntime::VerifyEachNodeIsAssignedToAnEp] Add (/Add) 2024-06-28 10:26:23.1480471 [V:onnxruntime:, session_state.cc:1154 onnxruntime::VerifyEachNodeIsAssignedToAnEp] Div (/Div_1) 2024-06-28 10:26:23.1484923 [V:onnxruntime:, session_state.cc:1154 onnxruntime::VerifyEachNodeIsAssignedToAnEp] Slice (/whisper_decoder_init/model/decoder/embed_positions/Slice) 2024-06-28 10:26:23.1491729 [V:onnxruntime:, session_state.cc:1154 onnxruntime::VerifyEachNodeIsAssignedToAnEp] Gather (/whisper_decoder_init/model/decoder/embed_tokens/Gather) 2024-06-28 10:26:23.1497169 [V:onnxruntime:, session_state.cc:1154 onnxruntime::VerifyEachNodeIsAssignedToAnEp] Add (/whisper_decoder_init/model/decoder/Add_2) 2024-06-28 10:26:23.1501753 [V:onnxruntime:, session_state.cc:1154 onnxruntime::VerifyEachNodeIsAssignedToAnEp]
LayerNormalization (/whisper_decoder_init/model/decoder/layers.0/self_attn_layer_norm/LayerNormalization) 2024-06-28 10:26:23.1507956 [V:onnxruntime:, session_state.cc:1154 onnxruntime::VerifyEachNodeIsAssignedToAnEp] Shape (/whisper_decoder_init/model/decoder/Shape_1) 2024-06-28 10:26:23.1512633 [V:onnxruntime:, session_state.cc:1154 onnxruntime::VerifyEachNodeIsAssignedToAnEp]
ConstantOfShape (/whisper_decoder_init/model/decoder/ConstantOfShape) 2024-06-28 10:26:23.1517666 [V:onnxruntime:, session_state.cc:1154 onnxruntime::VerifyEachNodeIsAssignedToAnEp] Shape (/whisper_decoder_init/model/decoder/Shape_3) 2024-06-28 10:26:23.1522255 [V:onnxruntime:, session_state.cc:1154 onnxruntime::VerifyEachNodeIsAssignedToAnEp] Range (/whisper_decoder_init/model/decoder/Range) 2024-06-28 10:26:23.1526776 [V:onnxruntime:, session_state.cc:1154 onnxruntime::VerifyEachNodeIsAssignedToAnEp] Add (/whisper_decoder_init/model/decoder/Add) 2024-06-28 10:26:23.1531225 [V:onnxruntime:, session_state.cc:1154 onnxruntime::VerifyEachNodeIsAssignedToAnEp] Reshape (/whisper_decoder_init/model/decoder/Reshape_1) 2024-06-28 10:26:23.1535912 [V:onnxruntime:, session_state.cc:1154 onnxruntime::VerifyEachNodeIsAssignedToAnEp] Less (/whisper_decoder_init/model/decoder/Less) 2024-06-28 10:26:23.1540451 [V:onnxruntime:, session_state.cc:1154 onnxruntime::VerifyEachNodeIsAssignedToAnEp] Where (/whisper_decoder_init/model/decoder/Where) 2024-06-28 10:26:23.1545029 [V:onnxruntime:, session_state.cc:1154 onnxruntime::VerifyEachNodeIsAssignedToAnEp] Unsqueeze (/whisper_decoder_init/model/decoder/Unsqueeze_4) 2024-06-28 10:26:23.1550009 [V:onnxruntime:, session_state.cc:1154 onnxruntime::VerifyEachNodeIsAssignedToAnEp] Unsqueeze (/whisper_decoder_init/model/decoder/Unsqueeze_5) 2024-06-28 10:26:23.1556991 [V:onnxruntime:, session_state.cc:1154 onnxruntime::VerifyEachNodeIsAssignedToAnEp] Expand (/whisper_decoder_init/model/decoder/Expand) 2024-06-28 10:26:23.1563268 [V:onnxruntime:, session_state.cc:1154 onnxruntime::VerifyEachNodeIsAssignedToAnEp] Conv (/whisper_encoder/encoder/conv1/Conv) 2024-06-28 10:26:23.1567895 [V:onnxruntime:, session_state.cc:1154 onnxruntime::VerifyEachNodeIsAssignedToAnEp] Gelu () 2024-06-28 10:26:23.1571393 [V:onnxruntime:, session_state.cc:1154 onnxruntime::VerifyEachNodeIsAssignedToAnEp] Concat (Concat_0) 2024-06-28 10:26:23.1575103 
[V:onnxruntime:, session_state.cc:1154 onnxruntime::VerifyEachNodeIsAssignedToAnEp] MatMul (/whisper_decoder_init/model/decoder/layers.0/self_attn/out_proj/MatMul) 2024-06-28 10:26:23.1580564 [V:onnxruntime:, session_state.cc:1154 onnxruntime::VerifyEachNodeIsAssignedToAnEp] Add (/whisper_decoder_init/model/decoder/layers.0/self_attn/out_proj/Add) 2024-06-28 10:26:23.1585766 [V:onnxruntime:, session_state.cc:1154 onnxruntime::VerifyEachNodeIsAssignedToAnEp] Conv (/whisper_encoder/encoder/conv2/Conv) 2024-06-28 10:26:23.1590622 [V:onnxruntime:, session_state.cc:1154 onnxruntime::VerifyEachNodeIsAssignedToAnEp] Gelu () 2024-06-28 10:26:23.1594239 [V:onnxruntime:, session_state.cc:1154 onnxruntime::VerifyEachNodeIsAssignedToAnEp]
SkipLayerNormalization (SkipLayerNorm_0) 2024-06-28 10:26:23.1598573 [V:onnxruntime:, session_state.cc:1154 onnxruntime::VerifyEachNodeIsAssignedToAnEp] Gather (Gather_0) 2024-06-28 10:26:23.1602973 [V:onnxruntime:, session_state.cc:1154 onnxruntime::VerifyEachNodeIsAssignedToAnEp] Gather (Gather_1) 2024-06-28 10:26:23.1606813 [V:onnxruntime:, session_state.cc:1154 onnxruntime::VerifyEachNodeIsAssignedToAnEp] Transpose (/whisper_encoder/encoder/Transpose) 2024-06-28 10:26:23.1611368 [V:onnxruntime:, session_state.cc:1154 onnxruntime::VerifyEachNodeIsAssignedToAnEp] Add (/whisper_encoder/encoder/Add_2) 2024-06-28 10:26:23.1615592 [V:onnxruntime:, session_state.cc:1154 onnxruntime::VerifyEachNodeIsAssignedToAnEp]
LayerNormalization (/whisper_encoder/encoder/layers.0/self_attn_layer_norm/LayerNormalization) 2024-06-28 10:26:23.1622358 [V:onnxruntime:, session_state.cc:1154 onnxruntime::VerifyEachNodeIsAssignedToAnEp] MatMul (/whisper_decoder_init/model/decoder/layers.0/encoder_attn/q_proj/MatMul) 2024-06-28 10:26:23.1629866 [V:onnxruntime:, session_state.cc:1154 onnxruntime::VerifyEachNodeIsAssignedToAnEp] Attention (Attention_1) 2024-06-28 10:26:23.1633890 [V:onnxruntime:, session_state.cc:1154 onnxruntime::VerifyEachNodeIsAssignedToAnEp] MatMul (/whisper_encoder/encoder/layers.0/self_attn/out_proj/MatMul) 2024-06-28 10:26:23.1639117 [V:onnxruntime:, session_state.cc:1154 onnxruntime::VerifyEachNodeIsAssignedToAnEp] Add (/whisper_encoder/encoder/layers.0/self_attn/out_proj/Add) 2024-06-28 10:26:23.1644091 [V:onnxruntime:, session_state.cc:1154 onnxruntime::VerifyEachNodeIsAssignedToAnEp]
SkipLayerNormalization (SkipLayerNorm_1) 2024-06-28 10:26:23.1648431 [V:onnxruntime:, session_state.cc:1154 onnxruntime::VerifyEachNodeIsAssignedToAnEp] MatMul (/whisper_encoder/encoder/layers.0/fc1/MatMul) 2024-06-28 10:26:23.1653110 [V:onnxruntime:, session_state.cc:1154 onnxruntime::VerifyEachNodeIsAssignedToAnEp] BiasGelu (Gelu_AddBias_0) 2024-06-28 10:26:23.1657151 [V:onnxruntime:, session_state.cc:1154 onnxruntime::VerifyEachNodeIsAssignedToAnEp] MatMul (/whisper_encoder/encoder/layers.0/fc2/MatMul) 2024-06-28 10:26:23.1661963 [V:onnxruntime:, session_state.cc:1154 onnxruntime::VerifyEachNodeIsAssignedToAnEp]
SkipLayerNormalization (SkipLayerNorm_AddBias_0) 2024-06-28 10:26:23.1666739 [V:onnxruntime:, session_state.cc:1154 onnxruntime::VerifyEachNodeIsAssignedToAnEp] Attention (Attention_2) 2024-06-28 10:26:23.1670636 [V:onnxruntime:, session_state.cc:1154 onnxruntime::VerifyEachNodeIsAssignedToAnEp] MatMul (/whisper_encoder/encoder/layers.1/self_attn/out_proj/MatMul) 2024-06-28 10:26:23.1675757 [V:onnxruntime:, session_state.cc:1154 onnxruntime::VerifyEachNodeIsAssignedToAnEp]
SkipLayerNormalization (SkipLayerNorm_AddBias_1) 2024-06-28 10:26:23.1680289 [V:onnxruntime:, session_state.cc:1154 onnxruntime::VerifyEachNodeIsAssignedToAnEp] MatMul (/whisper_encoder/encoder/layers.1/fc1/MatMul) 2024-06-28 10:26:23.1684966 [V:onnxruntime:, session_state.cc:1154 onnxruntime::VerifyEachNodeIsAssignedToAnEp] BiasGelu (Gelu_AddBias_1) 2024-06-28 10:26:23.1688859 [V:onnxruntime:, session_state.cc:1154 onnxruntime::VerifyEachNodeIsAssignedToAnEp] MatMul (/whisper_encoder/encoder/layers.1/fc2/MatMul) 2024-06-28 10:26:23.1694409 [V:onnxruntime:, session_state.cc:1154 onnxruntime::VerifyEachNodeIsAssignedToAnEp]
SkipLayerNormalization (SkipLayerNorm_AddBias_2) 2024-06-28 10:26:23.1700821 [V:onnxruntime:, session_state.cc:1154 onnxruntime::VerifyEachNodeIsAssignedToAnEp] Attention (Attention_3) 2024-06-28 10:26:23.1704835 [V:onnxruntime:, session_state.cc:1154 onnxruntime::VerifyEachNodeIsAssignedToAnEp] MatMul (/whisper_encoder/encoder/layers.2/self_attn/out_proj/MatMul) 2024-06-28 10:26:23.1710114 [V:onnxruntime:, session_state.cc:1154 onnxruntime::VerifyEachNodeIsAssignedToAnEp]
SkipLayerNormalization (SkipLayerNorm_AddBias_3) 2024-06-28 10:26:23.1714686 [V:onnxruntime:, session_state.cc:1154 onnxruntime::VerifyEachNodeIsAssignedToAnEp] MatMul (/whisper_encoder/encoder/layers.2/fc1/MatMul) 2024-06-28 10:26:23.1719315 [V:onnxruntime:, session_state.cc:1154 onnxruntime::VerifyEachNodeIsAssignedToAnEp] BiasGelu (Gelu_AddBias_2) 2024-06-28 10:26:23.1723229 [V:onnxruntime:, session_state.cc:1154 onnxruntime::VerifyEachNodeIsAssignedToAnEp] MatMul (/whisper_encoder/encoder/layers.2/fc2/MatMul) 2024-06-28 10:26:23.1727871 [V:onnxruntime:, session_state.cc:1154 onnxruntime::VerifyEachNodeIsAssignedToAnEp]
SkipLayerNormalization (SkipLayerNorm_AddBias_4) 2024-06-28 10:26:23.1732403 [V:onnxruntime:, session_state.cc:1154 onnxruntime::VerifyEachNodeIsAssignedToAnEp] Attention (Attention_4) 2024-06-28 10:26:23.1736278 [V:onnxruntime:, session_state.cc:1154 onnxruntime::VerifyEachNodeIsAssignedToAnEp] MatMul (/whisper_encoder/encoder/layers.3/self_attn/out_proj/MatMul) 2024-06-28 10:26:23.1741686 [V:onnxruntime:, session_state.cc:1154 onnxruntime::VerifyEachNodeIsAssignedToAnEp]
SkipLayerNormalization (SkipLayerNorm_AddBias_5) 2024-06-28 10:26:23.1746487 [V:onnxruntime:, session_state.cc:1154 onnxruntime::VerifyEachNodeIsAssignedToAnEp] MatMul (/whisper_encoder/encoder/layers.3/fc1/MatMul) 2024-06-28 10:26:23.1751579 [V:onnxruntime:, session_state.cc:1154 onnxruntime::VerifyEachNodeIsAssignedToAnEp] BiasGelu (Gelu_AddBias_3) 2024-06-28 10:26:23.1755515 [V:onnxruntime:, session_state.cc:1154 onnxruntime::VerifyEachNodeIsAssignedToAnEp] MatMul (/whisper_encoder/encoder/layers.3/fc2/MatMul) 2024-06-28 10:26:23.1760944 [V:onnxruntime:, session_state.cc:1154 onnxruntime::VerifyEachNodeIsAssignedToAnEp]
SkipLayerNormalization (SkipLayerNorm_AddBias_6) 2024-06-28 10:26:23.1767322 [V:onnxruntime:, session_state.cc:1154 onnxruntime::VerifyEachNodeIsAssignedToAnEp] MatMul (/whisper_decoder_init/model/decoder/layers.0/encoder_attn/k_proj/MatMul) 2024-06-28 10:26:23.1774762 [V:onnxruntime:, session_state.cc:1154 onnxruntime::VerifyEachNodeIsAssignedToAnEp] MatMul (/whisper_decoder_init/model/decoder/layers.0/encoder_attn/v_proj/MatMul) 2024-06-28 10:26:23.1780606 [V:onnxruntime:, session_state.cc:1154 onnxruntime::VerifyEachNodeIsAssignedToAnEp] MatMul (/whisper_decoder_init/model/decoder/layers.1/encoder_attn/k_proj/MatMul) 2024-06-28 10:26:23.1786197 [V:onnxruntime:, session_state.cc:1154 onnxruntime::VerifyEachNodeIsAssignedToAnEp] MatMul (/whisper_decoder_init/model/decoder/layers.1/encoder_attn/v_proj/MatMul) 2024-06-28 10:26:23.1791697 [V:onnxruntime:, session_state.cc:1154 onnxruntime::VerifyEachNodeIsAssignedToAnEp] MatMul (/whisper_decoder_init/model/decoder/layers.2/encoder_attn/k_proj/MatMul) 2024-06-28 10:26:23.1797245 [V:onnxruntime:, session_state.cc:1154 onnxruntime::VerifyEachNodeIsAssignedToAnEp] MatMul (/whisper_decoder_init/model/decoder/layers.2/encoder_attn/v_proj/MatMul) 2024-06-28 10:26:23.1802730 [V:onnxruntime:, session_state.cc:1154 onnxruntime::VerifyEachNodeIsAssignedToAnEp] MatMul (/whisper_decoder_init/model/decoder/layers.3/encoder_attn/k_proj/MatMul) 2024-06-28 10:26:23.1808251 [V:onnxruntime:, session_state.cc:1154 onnxruntime::VerifyEachNodeIsAssignedToAnEp] MatMul (/whisper_decoder_init/model/decoder/layers.3/encoder_attn/v_proj/MatMul) 2024-06-28 10:26:23.1813824 [V:onnxruntime:, session_state.cc:1154 onnxruntime::VerifyEachNodeIsAssignedToAnEp]
MultiHeadAttention (Attention_5) 2024-06-28 10:26:23.1817947 [V:onnxruntime:, session_state.cc:1154 onnxruntime::VerifyEachNodeIsAssignedToAnEp] MatMul (/whisper_decoder_init/model/decoder/layers.0/encoder_attn/out_proj/MatMul) 2024-06-28 10:26:23.1823566 [V:onnxruntime:, session_state.cc:1154 onnxruntime::VerifyEachNodeIsAssignedToAnEp]
SkipLayerNormalization (SkipLayerNorm_AddBias_7) 2024-06-28 10:26:23.1828166 [V:onnxruntime:, session_state.cc:1154 onnxruntime::VerifyEachNodeIsAssignedToAnEp] MatMul (/whisper_decoder_init/model/decoder/layers.0/fc1/MatMul) 2024-06-28 10:26:23.1833186 [V:onnxruntime:, session_state.cc:1154 onnxruntime::VerifyEachNodeIsAssignedToAnEp] BiasGelu (Gelu_AddBias_4) 2024-06-28 10:26:23.1837401 [V:onnxruntime:, session_state.cc:1154 onnxruntime::VerifyEachNodeIsAssignedToAnEp] MatMul (/whisper_decoder_init/model/decoder/layers.0/fc2/MatMul) 2024-06-28 10:26:23.1842804 [V:onnxruntime:, session_state.cc:1154 onnxruntime::VerifyEachNodeIsAssignedToAnEp]
SkipLayerNormalization (SkipLayerNorm_AddBias_8) 2024-06-28 10:26:23.1847381 [V:onnxruntime:, session_state.cc:1154 onnxruntime::VerifyEachNodeIsAssignedToAnEp] MatMul (/whisper_decoder_init/model/decoder/layers.1/self_attn/out_proj/MatMul) 2024-06-28 10:26:23.1852829 [V:onnxruntime:, session_state.cc:1154 onnxruntime::VerifyEachNodeIsAssignedToAnEp] Gather (Gather_2) 2024-06-28 10:26:23.1856497 [V:onnxruntime:, session_state.cc:1154 onnxruntime::VerifyEachNodeIsAssignedToAnEp] Gather (Gather_3) 2024-06-28 10:26:23.1860162 [V:onnxruntime:, session_state.cc:1154 onnxruntime::VerifyEachNodeIsAssignedToAnEp]
SkipLayerNormalization (SkipLayerNorm_AddBias_9) 2024-06-28 10:26:23.1864680 [V:onnxruntime:, session_state.cc:1154 onnxruntime::VerifyEachNodeIsAssignedToAnEp] MatMul (/whisper_decoder_init/model/decoder/layers.1/encoder_attn/q_proj/MatMul) 2024-06-28 10:26:23.1870125 [V:onnxruntime:, session_state.cc:1154 onnxruntime::VerifyEachNodeIsAssignedToAnEp]
MultiHeadAttention (Attention_7) 2024-06-28 10:26:23.1874239 [V:onnxruntime:, session_state.cc:1154 onnxruntime::VerifyEachNodeIsAssignedToAnEp] MatMul (/whisper_decoder_init/model/decoder/layers.1/encoder_attn/out_proj/MatMul) 2024-06-28 10:26:23.1879743 [V:onnxruntime:, session_state.cc:1154 onnxruntime::VerifyEachNodeIsAssignedToAnEp]
SkipLayerNormalization (SkipLayerNorm_AddBias_10) 2024-06-28 10:26:23.1884298 [V:onnxruntime:, session_state.cc:1154 onnxruntime::VerifyEachNodeIsAssignedToAnEp] MatMul (/whisper_decoder_init/model/decoder/layers.1/fc1/MatMul) 2024-06-28 10:26:23.1889371 [V:onnxruntime:, session_state.cc:1154 onnxruntime::VerifyEachNodeIsAssignedToAnEp] BiasGelu (Gelu_AddBias_5) 2024-06-28 10:26:23.1893356 [V:onnxruntime:, session_state.cc:1154 onnxruntime::VerifyEachNodeIsAssignedToAnEp] MatMul (/whisper_decoder_init/model/decoder/layers.1/fc2/MatMul) 2024-06-28 10:26:23.1898387 [V:onnxruntime:, session_state.cc:1154 onnxruntime::VerifyEachNodeIsAssignedToAnEp]
SkipLayerNormalization (SkipLayerNorm_AddBias_11) 2024-06-28 10:26:23.1902990 [V:onnxruntime:, session_state.cc:1154 onnxruntime::VerifyEachNodeIsAssignedToAnEp] MatMul (/whisper_decoder_init/model/decoder/layers.2/self_attn/out_proj/MatMul) 2024-06-28 10:26:23.1908439 [V:onnxruntime:, session_state.cc:1154 onnxruntime::VerifyEachNodeIsAssignedToAnEp] Gather (Gather_4) 2024-06-28 10:26:23.1912165 [V:onnxruntime:, session_state.cc:1154 onnxruntime::VerifyEachNodeIsAssignedToAnEp] Gather (Gather_5) 2024-06-28 10:26:23.1915849 [V:onnxruntime:, session_state.cc:1154 onnxruntime::VerifyEachNodeIsAssignedToAnEp]
SkipLayerNormalization (SkipLayerNorm_AddBias_12) 2024-06-28 10:26:23.1920711 [V:onnxruntime:, session_state.cc:1154 onnxruntime::VerifyEachNodeIsAssignedToAnEp] MatMul (/whisper_decoder_init/model/decoder/layers.2/encoder_attn/q_proj/MatMul) 2024-06-28 10:26:23.1926179 [V:onnxruntime:, session_state.cc:1154 onnxruntime::VerifyEachNodeIsAssignedToAnEp]
MultiHeadAttention (Attention_9) 2024-06-28 10:26:23.1930293 [V:onnxruntime:, session_state.cc:1154 onnxruntime::VerifyEachNodeIsAssignedToAnEp] MatMul (/whisper_decoder_init/model/decoder/layers.2/encoder_attn/out_proj/MatMul) 2024-06-28 10:26:23.1935771 [V:onnxruntime:, session_state.cc:1154 onnxruntime::VerifyEachNodeIsAssignedToAnEp]
SkipLayerNormalization (SkipLayerNorm_AddBias_13) 2024-06-28 10:26:23.1940399 [V:onnxruntime:, session_state.cc:1154 onnxruntime::VerifyEachNodeIsAssignedToAnEp] MatMul (/whisper_decoder_init/model/decoder/layers.2/fc1/MatMul) 2024-06-28 10:26:23.1945414 [V:onnxruntime:, session_state.cc:1154 onnxruntime::VerifyEachNodeIsAssignedToAnEp] BiasGelu (Gelu_AddBias_6) 2024-06-28 10:26:23.1949382 [V:onnxruntime:, session_state.cc:1154 onnxruntime::VerifyEachNodeIsAssignedToAnEp] MatMul (/whisper_decoder_init/model/decoder/layers.2/fc2/MatMul) 2024-06-28 10:26:23.1954416 [V:onnxruntime:, session_state.cc:1154 onnxruntime::VerifyEachNodeIsAssignedToAnEp]
SkipLayerNormalization (SkipLayerNorm_AddBias_14) 2024-06-28 10:26:23.1959050 [V:onnxruntime:, session_state.cc:1154 onnxruntime::VerifyEachNodeIsAssignedToAnEp] MatMul (/whisper_decoder_init/model/decoder/layers.3/self_attn/out_proj/MatMul) 2024-06-28 10:26:23.1964492 [V:onnxruntime:, session_state.cc:1154 onnxruntime::VerifyEachNodeIsAssignedToAnEp] Gather (Gather_6) 2024-06-28 10:26:23.1968316 [V:onnxruntime:, session_state.cc:1154 onnxruntime::VerifyEachNodeIsAssignedToAnEp] Gather (Gather_7) 2024-06-28 10:26:23.1972013 [V:onnxruntime:, session_state.cc:1154 onnxruntime::VerifyEachNodeIsAssignedToAnEp]
SkipLayerNormalization (SkipLayerNorm_AddBias_15) 2024-06-28 10:26:23.1976655 [V:onnxruntime:, session_state.cc:1154 onnxruntime::VerifyEachNodeIsAssignedToAnEp] MatMul (/whisper_decoder_init/model/decoder/layers.3/encoder_attn/q_proj/MatMul) 2024-06-28 10:26:23.1982068 [V:onnxruntime:, session_state.cc:1154 onnxruntime::VerifyEachNodeIsAssignedToAnEp]
MultiHeadAttention (Attention_11) 2024-06-28 10:26:23.1986177 [V:onnxruntime:, session_state.cc:1154 onnxruntime::VerifyEachNodeIsAssignedToAnEp] MatMul (/whisper_decoder_init/model/decoder/layers.3/encoder_attn/out_proj/MatMul) 2024-06-28 10:26:23.1991661 [V:onnxruntime:, session_state.cc:1154 onnxruntime::VerifyEachNodeIsAssignedToAnEp]
SkipLayerNormalization (SkipLayerNorm_AddBias_16) 2024-06-28 10:26:23.1996215 [V:onnxruntime:, session_state.cc:1154 onnxruntime::VerifyEachNodeIsAssignedToAnEp] MatMul (/whisper_decoder_init/model/decoder/layers.3/fc1/MatMul) 2024-06-28 10:26:23.2001268 [V:onnxruntime:, session_state.cc:1154 onnxruntime::VerifyEachNodeIsAssignedToAnEp] BiasGelu (Gelu_AddBias_7) 2024-06-28 10:26:23.2005271 [V:onnxruntime:, session_state.cc:1154 onnxruntime::VerifyEachNodeIsAssignedToAnEp] MatMul (/whisper_decoder_init/model/decoder/layers.3/fc2/MatMul) 2024-06-28 10:26:23.2010346 [V:onnxruntime:, session_state.cc:1154 onnxruntime::VerifyEachNodeIsAssignedToAnEp]
SkipLayerNormalization (SkipLayerNorm_AddBias_17) 2024-06-28 10:26:23.2014939 [V:onnxruntime:, session_state.cc:1154 onnxruntime::VerifyEachNodeIsAssignedToAnEp] MatMul (/whisper_decoder_init/proj_out/MatMul) 2024-06-28 10:26:23.2019478 [V:onnxruntime:, session_state.cc:1154 onnxruntime::VerifyEachNodeIsAssignedToAnEp]
MemcpyFromHost (Memcpy) 2024-06-28 10:26:23.2023360 [V:onnxruntime:, session_state.cc:1154 onnxruntime::VerifyEachNodeIsAssignedToAnEp]
MemcpyFromHost (Memcpy_token_2) 2024-06-28 10:26:23.2027499 [V:onnxruntime:, session_state.cc:1154 onnxruntime::VerifyEachNodeIsAssignedToAnEp]
MemcpyFromHost (Memcpy_token_3) 2024-06-28 10:26:23.2031630 [V:onnxruntime:, session_state.cc:1154 onnxruntime::VerifyEachNodeIsAssignedToAnEp]
MemcpyFromHost (Memcpy_token_4) 2024-06-28 10:26:23.2035741 [V:onnxruntime:, session_state.cc:1154 onnxruntime::VerifyEachNodeIsAssignedToAnEp]
MemcpyFromHost (Memcpy_token_5) 2024-06-28 10:26:23.2039922 [V:onnxruntime:, session_state.cc:1154 onnxruntime::VerifyEachNodeIsAssignedToAnEp]
MemcpyFromHost (Memcpy_token_6) 2024-06-28 10:26:23.2045923 [V:onnxruntime:, session_state.cc:1154 onnxruntime::VerifyEachNodeIsAssignedToAnEp]
MemcpyFromHost (Memcpy_token_7) 2024-06-28 10:26:23.2051814 [V:onnxruntime:, session_state.cc:1154 onnxruntime::VerifyEachNodeIsAssignedToAnEp]
MemcpyFromHost (Memcpy_token_8) 2024-06-28 10:26:23.2056003 [V:onnxruntime:, session_state.cc:1154 onnxruntime::VerifyEachNodeIsAssignedToAnEp] MemcpyToHost (Memcpy_token_9) 2024-06-28 10:26:23.2060101 [V:onnxruntime:, session_state.cc:1154 onnxruntime::VerifyEachNodeIsAssignedToAnEp] MemcpyToHost (Memcpy_token_10) 2024-06-28 10:26:23.2064236 [V:onnxruntime:, session_state.cc:1154 onnxruntime::VerifyEachNodeIsAssignedToAnEp] MemcpyToHost (Memcpy_token_11) 2024-06-28 10:26:23.2068406 [V:onnxruntime:, session_state.cc:1154 onnxruntime::VerifyEachNodeIsAssignedToAnEp] MemcpyToHost (Memcpy_token_12) 2024-06-28 10:26:23.2072531 [V:onnxruntime:, session_state.cc:1154 onnxruntime::VerifyEachNodeIsAssignedToAnEp] MemcpyToHost (Memcpy_token_13) 2024-06-28 10:26:23.2077327 [V:onnxruntime:, session_state.cc:1154 onnxruntime::VerifyEachNodeIsAssignedToAnEp] Shape (/decoder/model/decoder/embed_positions/Shape) 2024-06-28 10:26:23.2082326 [V:onnxruntime:, session_state.cc:1154 onnxruntime::VerifyEachNodeIsAssignedToAnEp] Shape (/decoder/model/decoder/Shape_1) 2024-06-28 10:26:23.2086707 [V:onnxruntime:, session_state.cc:1154 onnxruntime::VerifyEachNodeIsAssignedToAnEp] Slice (/decoder/model/decoder/embed_positions/Slice) 2024-06-28 10:26:23.2091362 [V:onnxruntime:, session_state.cc:1154 onnxruntime::VerifyEachNodeIsAssignedToAnEp] Gather (/decoder/model/decoder/embed_tokens/Gather) 2024-06-28 10:26:23.2096017 [V:onnxruntime:, session_state.cc:1154 onnxruntime::VerifyEachNodeIsAssignedToAnEp] Add (/decoder/model/decoder/Add) 2024-06-28 10:26:23.2100226 [V:onnxruntime:, session_state.cc:1154 onnxruntime::VerifyEachNodeIsAssignedToAnEp]
LayerNormalization (/decoder/model/decoder/layers.0/self_attn_layer_norm/LayerNormalization) 2024-06-28 10:26:23.2106466 [V:onnxruntime:, session_state.cc:1154 onnxruntime::VerifyEachNodeIsAssignedToAnEp] MatMul (MatMul_0) 2024-06-28 10:26:23.2110442 [V:onnxruntime:, session_state.cc:1154 onnxruntime::VerifyEachNodeIsAssignedToAnEp] Slice (Slice_0) 2024-06-28 10:26:23.2114076 [V:onnxruntime:, session_state.cc:1154 onnxruntime::VerifyEachNodeIsAssignedToAnEp] Slice (Slice_1) 2024-06-28 10:26:23.2117715 [V:onnxruntime:, session_state.cc:1154 onnxruntime::VerifyEachNodeIsAssignedToAnEp] Slice (Slice_2) 2024-06-28 10:26:23.2121325 [V:onnxruntime:, session_state.cc:1154 onnxruntime::VerifyEachNodeIsAssignedToAnEp]
MultiHeadAttention (Attention_0) 2024-06-28 10:26:23.2125400 [V:onnxruntime:, session_state.cc:1154 onnxruntime::VerifyEachNodeIsAssignedToAnEp] MatMul (/decoder/model/decoder/layers.0/self_attn/out_proj/MatMul) 2024-06-28 10:26:23.2130444 [V:onnxruntime:, session_state.cc:1154 onnxruntime::VerifyEachNodeIsAssignedToAnEp] Add (/decoder/model/decoder/layers.0/self_attn/out_proj/Add) 2024-06-28 10:26:23.2135319 [V:onnxruntime:, session_state.cc:1154 onnxruntime::VerifyEachNodeIsAssignedToAnEp]
SkipLayerNormalization (SkipLayerNorm_0) 2024-06-28 10:26:23.2139669 [V:onnxruntime:, session_state.cc:1154 onnxruntime::VerifyEachNodeIsAssignedToAnEp] MatMul (/decoder/model/decoder/layers.0/encoder_attn/q_proj/MatMul) 2024-06-28 10:26:23.2144799 [V:onnxruntime:, session_state.cc:1154 onnxruntime::VerifyEachNodeIsAssignedToAnEp]
MultiHeadAttention (Attention_1) 2024-06-28 10:26:23.2149000 [V:onnxruntime:, session_state.cc:1154 onnxruntime::VerifyEachNodeIsAssignedToAnEp] MatMul (/decoder/model/decoder/layers.0/encoder_attn/out_proj/MatMul) 2024-06-28 10:26:23.2154168 [V:onnxruntime:, session_state.cc:1154 onnxruntime::VerifyEachNodeIsAssignedToAnEp]
SkipLayerNormalization (SkipLayerNorm_AddBias_0) 2024-06-28 10:26:23.2158725 [V:onnxruntime:, session_state.cc:1154 onnxruntime::VerifyEachNodeIsAssignedToAnEp] MatMul (/decoder/model/decoder/layers.0/fc1/MatMul) 2024-06-28 10:26:23.2163489 [V:onnxruntime:, session_state.cc:1154 onnxruntime::VerifyEachNodeIsAssignedToAnEp] BiasGelu (Gelu_AddBias_0) 2024-06-28 10:26:23.2167457 [V:onnxruntime:, session_state.cc:1154 onnxruntime::VerifyEachNodeIsAssignedToAnEp] MatMul (/decoder/model/decoder/layers.0/fc2/MatMul) 2024-06-28 10:26:23.2172159 [V:onnxruntime:, session_state.cc:1154 onnxruntime::VerifyEachNodeIsAssignedToAnEp]
SkipLayerNormalization (SkipLayerNorm_AddBias_1) 2024-06-28 10:26:23.2176888 [V:onnxruntime:, session_state.cc:1154 onnxruntime::VerifyEachNodeIsAssignedToAnEp] MatMul (MatMul_1) 2024-06-28 10:26:23.2181635 [V:onnxruntime:, session_state.cc:1154 onnxruntime::VerifyEachNodeIsAssignedToAnEp] Slice (Slice_3) 2024-06-28 10:26:23.2186554 [V:onnxruntime:, session_state.cc:1154 onnxruntime::VerifyEachNodeIsAssignedToAnEp] Slice (Slice_4) 2024-06-28 10:26:23.2190368 [V:onnxruntime:, session_state.cc:1154 onnxruntime::VerifyEachNodeIsAssignedToAnEp] Slice (Slice_5) 2024-06-28 10:26:23.2194025 [V:onnxruntime:, session_state.cc:1154 onnxruntime::VerifyEachNodeIsAssignedToAnEp]
MultiHeadAttention (Attention_2) 2024-06-28 10:26:23.2198149 [V:onnxruntime:, session_state.cc:1154 onnxruntime::VerifyEachNodeIsAssignedToAnEp] MatMul (/decoder/model/decoder/layers.1/self_attn/out_proj/MatMul) 2024-06-28 10:26:23.2203190 [V:onnxruntime:, session_state.cc:1154 onnxruntime::VerifyEachNodeIsAssignedToAnEp]
SkipLayerNormalization (SkipLayerNorm_AddBias_2) 2024-06-28 10:26:23.2207772 [V:onnxruntime:, session_state.cc:1154 onnxruntime::VerifyEachNodeIsAssignedToAnEp] MatMul (/decoder/model/decoder/layers.1/encoder_attn/q_proj/MatMul) 2024-06-28 10:26:23.2212832 [V:onnxruntime:, session_state.cc:1154 onnxruntime::VerifyEachNodeIsAssignedToAnEp]
MultiHeadAttention (Attention_3) 2024-06-28 10:26:23.2216963 [V:onnxruntime:, session_state.cc:1154 onnxruntime::VerifyEachNodeIsAssignedToAnEp] MatMul (/decoder/model/decoder/layers.1/encoder_attn/out_proj/MatMul) 2024-06-28 10:26:23.2222077 [V:onnxruntime:, session_state.cc:1154 onnxruntime::VerifyEachNodeIsAssignedToAnEp]
SkipLayerNormalization (SkipLayerNorm_AddBias_3) 2024-06-28 10:26:23.2226618 [V:onnxruntime:, session_state.cc:1154 onnxruntime::VerifyEachNodeIsAssignedToAnEp] MatMul (/decoder/model/decoder/layers.1/fc1/MatMul) 2024-06-28 10:26:23.2231262 [V:onnxruntime:, session_state.cc:1154 onnxruntime::VerifyEachNodeIsAssignedToAnEp] BiasGelu (Gelu_AddBias_1) 2024-06-28 10:26:23.2235589 [V:onnxruntime:, session_state.cc:1154 onnxruntime::VerifyEachNodeIsAssignedToAnEp] MatMul (/decoder/model/decoder/layers.1/fc2/MatMul) 2024-06-28 10:26:23.2241888 [V:onnxruntime:, session_state.cc:1154 onnxruntime::VerifyEachNodeIsAssignedToAnEp]
SkipLayerNormalization (SkipLayerNorm_AddBias_4) 2024-06-28 10:26:23.2246620 [V:onnxruntime:, session_state.cc:1154 onnxruntime::VerifyEachNodeIsAssignedToAnEp] MatMul (MatMul_2) 2024-06-28 10:26:23.2250312 [V:onnxruntime:, session_state.cc:1154 onnxruntime::VerifyEachNodeIsAssignedToAnEp] Slice (Slice_6) 2024-06-28 10:26:23.2253929 [V:onnxruntime:, session_state.cc:1154 onnxruntime::VerifyEachNodeIsAssignedToAnEp] Slice (Slice_7) 2024-06-28 10:26:23.2257567 [V:onnxruntime:, session_state.cc:1154 onnxruntime::VerifyEachNodeIsAssignedToAnEp] Slice (Slice_8) 2024-06-28 10:26:23.2261196 [V:onnxruntime:, session_state.cc:1154 onnxruntime::VerifyEachNodeIsAssignedToAnEp]
MultiHeadAttention (Attention_4) 2024-06-28 10:26:23.2265298 [V:onnxruntime:, session_state.cc:1154 onnxruntime::VerifyEachNodeIsAssignedToAnEp] MatMul (/decoder/model/decoder/layers.2/self_attn/out_proj/MatMul) 2024-06-28 10:26:23.2270403 [V:onnxruntime:, session_state.cc:1154 onnxruntime::VerifyEachNodeIsAssignedToAnEp]
SkipLayerNormalization (SkipLayerNorm_AddBias_5) 2024-06-28 10:26:23.2274963 [V:onnxruntime:, session_state.cc:1154 onnxruntime::VerifyEachNodeIsAssignedToAnEp] MatMul (/decoder/model/decoder/layers.2/encoder_attn/q_proj/MatMul) 2024-06-28 10:26:23.2280069 [V:onnxruntime:, session_state.cc:1154 onnxruntime::VerifyEachNodeIsAssignedToAnEp]
MultiHeadAttention (Attention_5) 2024-06-28 10:26:23.2284188 [V:onnxruntime:, session_state.cc:1154 onnxruntime::VerifyEachNodeIsAssignedToAnEp] MatMul (/decoder/model/decoder/layers.2/encoder_attn/out_proj/MatMul) 2024-06-28 10:26:23.2289334 [V:onnxruntime:, session_state.cc:1154 onnxruntime::VerifyEachNodeIsAssignedToAnEp]
SkipLayerNormalization (SkipLayerNorm_AddBias_6) 2024-06-28 10:26:23.2293886 [V:onnxruntime:, session_state.cc:1154 onnxruntime::VerifyEachNodeIsAssignedToAnEp] MatMul (/decoder/model/decoder/layers.2/fc1/MatMul) 2024-06-28 10:26:23.2298622 [V:onnxruntime:, session_state.cc:1154 onnxruntime::VerifyEachNodeIsAssignedToAnEp] BiasGelu (Gelu_AddBias_2) 2024-06-28 10:26:23.2302568 [V:onnxruntime:, session_state.cc:1154 onnxruntime::VerifyEachNodeIsAssignedToAnEp] MatMul (/decoder/model/decoder/layers.2/fc2/MatMul) 2024-06-28 10:26:23.2307266 [V:onnxruntime:, session_state.cc:1154 onnxruntime::VerifyEachNodeIsAssignedToAnEp]
SkipLayerNormalization (SkipLayerNorm_AddBias_7) 2024-06-28 10:26:23.2311811 [V:onnxruntime:, session_state.cc:1154 onnxruntime::VerifyEachNodeIsAssignedToAnEp] MatMul (MatMul_3) 2024-06-28 10:26:23.2315496 [V:onnxruntime:, session_state.cc:1154 onnxruntime::VerifyEachNodeIsAssignedToAnEp] Slice (Slice_9) 2024-06-28 10:26:23.2319148 [V:onnxruntime:, session_state.cc:1154 onnxruntime::VerifyEachNodeIsAssignedToAnEp] Slice (Slice_10) 2024-06-28 10:26:23.2322807 [V:onnxruntime:, session_state.cc:1154 onnxruntime::VerifyEachNodeIsAssignedToAnEp] Slice (Slice_11) 2024-06-28 10:26:23.2326481 [V:onnxruntime:, session_state.cc:1154 onnxruntime::VerifyEachNodeIsAssignedToAnEp]
MultiHeadAttention (Attention_6) 2024-06-28 10:26:23.2330705 [V:onnxruntime:, session_state.cc:1154 onnxruntime::VerifyEachNodeIsAssignedToAnEp] MatMul (/decoder/model/decoder/layers.3/self_attn/out_proj/MatMul) 2024-06-28 10:26:23.2335762 [V:onnxruntime:, session_state.cc:1154 onnxruntime::VerifyEachNodeIsAssignedToAnEp]
SkipLayerNormalization (SkipLayerNorm_AddBias_8) 2024-06-28 10:26:23.2340352 [V:onnxruntime:, session_state.cc:1154 onnxruntime::VerifyEachNodeIsAssignedToAnEp] MatMul (/decoder/model/decoder/layers.3/encoder_attn/q_proj/MatMul) 2024-06-28 10:26:23.2345435 [V:onnxruntime:, session_state.cc:1154 onnxruntime::VerifyEachNodeIsAssignedToAnEp]
MultiHeadAttention (Attention_7) 2024-06-28 10:26:23.2349572 [V:onnxruntime:, session_state.cc:1154 onnxruntime::VerifyEachNodeIsAssignedToAnEp] MatMul (/decoder/model/decoder/layers.3/encoder_attn/out_proj/MatMul) 2024-06-28 10:26:23.2354728 [V:onnxruntime:, session_state.cc:1154 onnxruntime::VerifyEachNodeIsAssignedToAnEp]
SkipLayerNormalization (SkipLayerNorm_AddBias_9) 2024-06-28 10:26:23.2359392 [V:onnxruntime:, session_state.cc:1154 onnxruntime::VerifyEachNodeIsAssignedToAnEp] MatMul (/decoder/model/decoder/layers.3/fc1/MatMul) 2024-06-28 10:26:23.2364045 [V:onnxruntime:, session_state.cc:1154 onnxruntime::VerifyEachNodeIsAssignedToAnEp] BiasGelu (Gelu_AddBias_3) 2024-06-28 10:26:23.2368033 [V:onnxruntime:, session_state.cc:1154 onnxruntime::VerifyEachNodeIsAssignedToAnEp] MatMul (/decoder/model/decoder/layers.3/fc2/MatMul) 2024-06-28 10:26:23.2372667 [V:onnxruntime:, session_state.cc:1154 onnxruntime::VerifyEachNodeIsAssignedToAnEp]
SkipLayerNormalization (SkipLayerNorm_AddBias_10) 2024-06-28 10:26:23.2377344 [V:onnxruntime:, session_state.cc:1154 onnxruntime::VerifyEachNodeIsAssignedToAnEp] MatMul (/decoder/proj_out/MatMul) 2024-06-28 10:26:23.2381598 [V:onnxruntime:, session_state.cc:1154 onnxruntime::VerifyEachNodeIsAssignedToAnEp] Cast () 2024-06-28 10:26:23.2385227 [V:onnxruntime:, session_state.cc:1154 onnxruntime::VerifyEachNodeIsAssignedToAnEp]
MemcpyFromHost (Memcpy) 2024-06-28 10:26:23.2390246 [V:onnxruntime:, session_state.cc:1154 onnxruntime::VerifyEachNodeIsAssignedToAnEp]
MemcpyFromHost (Memcpy_token_2) 2024-06-28 10:26:23.2394417 [V:onnxruntime:, session_state.cc:1154 onnxruntime::VerifyEachNodeIsAssignedToAnEp] MemcpyToHost (Memcpy_token_3) 2024-06-28 10:26:23.2400318 [V:onnxruntime:, session_state.cc:1154 onnxruntime::VerifyEachNodeIsAssignedToAnEp] MemcpyToHost (Memcpy_token_4) 2024-06-28 10:26:23.2405237 [W:onnxruntime:, session_state.cc:1166 onnxruntime::VerifyEachNodeIsAssignedToAnEp] Some nodes were not assigned to the preferred execution providers which may or may not have an negative impact on performance. e.g. ORT explicitly assigns shape related ops to CPU to improve perf. 2024-06-28 10:26:23.2427987 [V:onnxruntime:, session_state.cc:126 onnxruntime::SessionState::CreateGraphInfo] SaveMLValueNameIndexMapping 2024-06-28 10:26:23.2432342 [V:onnxruntime:, session_state.cc:172 onnxruntime::SessionState::CreateGraphInfo] Done saving OrtValue mappings. 2024-06-28 10:26:23.2436957 [I:onnxruntime:, allocation_planner.cc:2442 onnxruntime::IGraphPartitioner::CreateGraphPartitioner] Use DeviceBasedPartition as default 2024-06-28 10:26:23.2446559 [I:onnxruntime:, session_state_utils.cc:201 onnxruntime::session_state_utils::SaveInitializedTensors] Saving initialized tensors. 2024-06-28 10:26:23.4348038 [I:onnxruntime:, session_state_utils.cc:345 onnxruntime::session_state_utils::SaveInitializedTensors] Done saving initialized tensors 2024-06-28 10:26:23.5109783 [V:onnxruntime:, session_state.cc:126 onnxruntime::SessionState::CreateGraphInfo] SaveMLValueNameIndexMapping 2024-06-28 10:26:23.5122646 [V:onnxruntime:, session_state.cc:172 onnxruntime::SessionState::CreateGraphInfo] Done saving OrtValue mappings. 2024-06-28 10:26:23.5127809 [I:onnxruntime:, allocation_planner.cc:2442 onnxruntime::IGraphPartitioner::CreateGraphPartitioner] Use DeviceBasedPartition as default 2024-06-28 10:26:23.5167711 [I:onnxruntime:, session_state_utils.cc:201 onnxruntime::session_state_utils::SaveInitializedTensors] Saving initialized tensors. 
2024-06-28 10:26:23.7221930 [I:onnxruntime:, session_state_utils.cc:345 onnxruntime::session_state_utils::SaveInitializedTensors] Done saving initialized tensors

WA225 commented 5 months ago

The test_transcription.py file tries to look for a file called "whisper_dml_fp32_gpu-dml_model.onnx", but the generated file is called "whisper_dml_fp32_gpu-cpu_model.onnx", so I tried modifying the test_transcription.py file to read the generated file instead, and it results in the following output:

C:\anaconda3\envs\olv-whisper\Lib\site-packages\huggingface_hub\file_download.py:1132: FutureWarning: resume_download is deprecated and will be removed in version 1.0.0. Downloads always resume when possible. If you want to force a new download, use force_download=True. warnings.warn( [2024-06-28 10:43:23,275] [WARNING] [ort_inference.py:164:set_provider_options] Specified provider 'DmlExecutionProvider' is not in available provider names.Available providers: 'AzureExecutionProvider, CPUExecutionProvider' C:\anaconda3\envs\olv-whisper\Lib\site-packages\onnxruntime\capi\onnxruntime_inference_collection.py:69: UserWarning: Specified provider 'DmlExecutionProvider' is not in available provider names.Available providers: 'AzureExecutionProvider, CPUExecutionProvider' warnings.warn( Traceback (most recent call last): File "C:\anaconda3\envs\olv-whisper\Lib\site-packages\olive\model\handler\onnx.py", line 113, in prepare_session return get_ort_inference_session( ^^^^^^^^^^^^^^^^^^^^^^^^^^ File "C:\anaconda3\envs\olv-whisper\Lib\site-packages\olive\common\ort_inference.py", line 121, in get_ort_inference_session check_ort_fallback(session, providers) File "C:\anaconda3\envs\olv-whisper\Lib\site-packages\olive\common\ort_inference.py", line 218, in check_ort_fallback raise OrtSessionFallbackError( The above exception was the direct cause of the following exception:

Traceback (most recent call last): File "C:\Olive-main\examples\whisper\test_transcription.py", line 132, in output_text = main() ^^^^^^ File "C:\Olive-main\examples\whisper\test_transcription.py", line 122, in main session = olive_model.prepare_session(None, device, [ep]) ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ File "C:\anaconda3\envs\olv-whisper\Lib\site-packages\olive\model\handler\onnx.py", line 117, in prepare_session raise OliveEvaluationError(e) from e olive.exception.OliveEvaluationError: The onnxruntime fallback happens. DmlExecutionProvider is not in the session providers ['CPUExecutionProvider']. session._enable_fallback = True

WA225 commented 4 months ago

Hi @jambayk any updates on this matter?

jambayk commented 4 months ago

onnxruntime and onnxruntime-directml are not complementary packages but different "flavors" of the same package that support different EPs. You should install only one of them in an environment, which in your case would be onnxruntime-directml. It already supports the CPU EP, so there is no need to install another package. It is also okay for some operators not to run on the DML backend if the runtime decides the CPU backend is faster for them.

Can you uninstall both packages and reinstall onnxruntime-directml? After that, please run the test transcription script using the gpu-dml model.
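
A quick way to confirm that only one flavor is installed is to query the package metadata. The sketch below is illustrative and not part of Olive: the helper names `installed_ort_flavors` and `has_conflict` are hypothetical, and the list of candidate package names is an assumption that may not be exhaustive.

```python
import importlib.metadata as md

# Candidate distribution names to look for (an assumption; extend as needed).
ORT_FLAVORS = ("onnxruntime", "onnxruntime-directml", "onnxruntime-gpu")


def installed_ort_flavors(names=ORT_FLAVORS):
    """Return {distribution name: version} for each flavor found in this environment."""
    found = {}
    for name in names:
        try:
            found[name] = md.version(name)
        except md.PackageNotFoundError:
            continue  # this flavor is not installed
    return found


def has_conflict(found):
    """True when more than one flavor is installed side by side."""
    return len(found) > 1


if __name__ == "__main__":
    found = installed_ort_flavors()
    print("installed flavors:", found)
    if has_conflict(found):
        print("more than one flavor installed; uninstall all of them and reinstall a single one")
    try:
        import onnxruntime as ort

        # DmlExecutionProvider should be listed here once onnxruntime-directml is the only flavor.
        print("available providers:", ort.get_available_providers())
    except ImportError:
        print("onnxruntime is not importable in this environment")
```

If `DmlExecutionProvider` is missing from the printed providers, the environment is probably still resolving to a flavor other than onnxruntime-directml.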

WA225 commented 4 months ago

I have realized that having both packages was wrong; however, I cannot run the script using gpu-dml because the corresponding model is not generated due to the silent failure. After some changes in the config file, I got it to not fail silently, but I am now facing the issue reported in https://github.com/microsoft/Olive/issues/1221. If I run the command python test_transcription.py --config whisper_dml_fp32.json, I get the following error, as no model corresponding to whisper_dml_fp32_gpu-dml_model.json is generated:

Traceback (most recent call last): File "C:\Olive-main\examples\whisper\test_transcription.py", line 133, in output_text = main() File "C:\Olive-main\examples\whisper\test_transcription.py", line 98, in main olive_model = ONNXModelHandler(**output_model_json["config"]) KeyError: 'config'
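
For anyone reproducing this, a defensive loader can make the failure easier to diagnose than a bare KeyError. This is a minimal sketch under stated assumptions: `load_model_config` is a hypothetical helper, not part of Olive or test_transcription.py, and the only thing it assumes about the output JSON is the top-level "config" key that the traceback shows the script reading.

```python
import json
from pathlib import Path


def load_model_config(json_path):
    """Load an output-model JSON file and return its "config" entry.

    Raises a descriptive error when the file is missing or has no "config" key,
    instead of the bare KeyError seen in the traceback above.
    """
    path = Path(json_path)
    if not path.is_file():
        raise FileNotFoundError(f"output model file not found: {path}")
    with path.open(encoding="utf-8") as f:
        data = json.load(f)
    if not isinstance(data, dict) or "config" not in data:
        keys = sorted(data) if isinstance(data, dict) else type(data).__name__
        raise KeyError(f"'config' missing from {path}; top-level content: {keys}")
    return data["config"]
```

Printing the top-level keys shows whether the workflow wrote an empty or differently shaped file, which helps separate the silent failure from a simple path mismatch.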

xiaoyu-work commented 4 months ago

Can you pull the latest changes from main branch and try again?

WA225 commented 4 months ago

I cloned the main branch again and ran it, and I still get the [ONNXRuntimeError] : 9 : NOT_IMPLEMENTED : Could not find an implementation for DecoderMaskedMultiHeadAttention(1) node with name 'Attention_0' error when I run InsertBeamSearch without "use_gpu": false, and it still fails silently when I set "use_gpu": false. The problem persists.

heibaidaolx123 commented 4 months ago

Hi guys, any update?

WA225 commented 3 months ago

Hi @jambayk @xiaoyu-work any updates on adding dml support to whisper?