microsoft / onnxruntime-training-examples

Examples for using ONNX Runtime for model training.
MIT License

[Help] Problem trying out the iOS example #161

Closed SichangHe closed 1 year ago

SichangHe commented 1 year ago

I am trying to run on_device_training/mobile/ios on an M1 Mac. I could not install onnxruntime-training.

Inspecting the verbose output of Pip, it seems there exists no .whl for ARM. I've tried Python 3.11 and 3.10.

Also, onnxruntime-training==1.16.0, which is specified in requirements.txt, is not released yet. Is this an internal version?

Alternatively, @vraspar, could you maybe compress the artifacts and upload them here so I can try out the iOS example?

Thanks!

SichangHe commented 1 year ago

Related #158.

baijumeswani commented 1 year ago

We will be releasing the Python package for Mac in the ORT 1.16 release (which should happen very soon).

In the meantime, you could consider getting the nightly Python package using:

pip install -i https://aiinfra.pkgs.visualstudio.com/PublicPackages/_packaging/ORT-Nightly/pypi/simple/ onnxruntime-training-cpu

SichangHe commented 1 year ago

Thank you for the link and instruction. However, it seems that the website requires authentication:

$ pip3 install -i https://aiinfra.pkgs.visualstudio.com/PublicPackages/_packaging/ORT-Nightly/pypi/simple/ onnxruntime-training-cpu
Looking in indexes: https://aiinfra.pkgs.visualstudio.com/PublicPackages/_packaging/ORT-Nightly/pypi/simple/
Collecting onnxruntime-training-cpu
  Downloading https://aiinfra.pkgs.visualstudio.com/2692857e-05ef-43b4-ba9c-ccf1c22c437c/_packaging/7982ae20-ed19-4a35-a362-a96ac99897b7/pypi/download/onnxruntime-training-cpu/1.16.dev20230828001/onnxruntime_training_cpu-1.16.0.dev20230828001-cp311-cp311-macosx_11_0_arm64.whl (8.0 MB)
     ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 8.0/8.0 MB 1.1 MB/s eta 0:00:00
User for aiinfra.pkgs.visualstudio.com:

After I just pressed Enter at the prompt, Pip went ahead and attempted to install some older versions.

WARNING: 401 Error, Credentials not correct for https://aiinfra.pkgs.visualstudio.com/PublicPackages/_packaging/ORT-Nightly/pypi/simple/cerberus/
INFO: pip is looking at multiple versions of onnxruntime-training-cpu to determine which version is compatible with other requirements. This could take a while.
  Downloading https://aiinfra.pkgs.visualstudio.com/2692857e-05ef-43b4-ba9c-ccf1c22c437c/_packaging/7982ae20-ed19-4a35-a362-a96ac99897b7/pypi/download/onnxruntime-training-cpu/1.16.dev20230825001/onnxruntime_training_cpu-1.16.0.dev20230825001-cp311-cp311-macosx_11_0_arm64.whl (8.0 MB)
     ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 8.0/8.0 MB 1.8 MB/s eta 0:00:00
  Downloading https://aiinfra.pkgs.visualstudio.com/2692857e-05ef-43b4-ba9c-ccf1c22c437c/_packaging/7982ae20-ed19-4a35-a362-a96ac99897b7/pypi/download/onnxruntime-training-cpu/1.16.dev20230824001/onnxruntime_training_cpu-1.16.0.dev20230824001-cp311-cp311-macosx_11_0_arm64.whl (8.0 MB)
     ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 8.0/8.0 MB 1.3 MB/s eta 0:00:00
  Downloading https://aiinfra.pkgs.visualstudio.com/2692857e-05ef-43b4-ba9c-ccf1c22c437c/_packaging/7982ae20-ed19-4a35-a362-a96ac99897b7/pypi/download/onnxruntime-training-cpu/1.16.dev20230822001/onnxruntime_training_cpu-1.16.0.dev20230822001-cp311-cp311-macosx_11_0_arm64.whl (7.9 MB)
     ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 7.9/7.9 MB 1.8 MB/s eta 0:00:00
ERROR: Cannot install onnxruntime-training-cpu==1.16.0.dev20230822001, onnxruntime-training-cpu==1.16.0.dev20230824001, onnxruntime-training-cpu==1.16.0.dev20230825001 and onnxruntime-training-cpu==1.16.0.dev20230828001 because these package versions have conflicting dependencies.

The conflict is caused by:
    onnxruntime-training-cpu 1.16.0.dev20230828001 depends on cerberus
    onnxruntime-training-cpu 1.16.0.dev20230825001 depends on cerberus
    onnxruntime-training-cpu 1.16.0.dev20230824001 depends on cerberus
    onnxruntime-training-cpu 1.16.0.dev20230822001 depends on cerberus

To fix this you could try to:
1. loosen the range of package versions you've specified
2. remove package versions to allow pip attempt to solve the dependency conflict

ERROR: ResolutionImpossible: for help visit https://pip.pypa.io/en/latest/topics/dependency-resolution/#dealing-with-dependency-conflicts

Thank you for your help.

SichangHe commented 1 year ago

I solved the above issue by manually installing from the .whl file.
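For readers checking whether a downloaded wheel matches their machine before installing it by hand, here is a minimal sketch that splits a wheel filename into its tags. It uses the filename from the pip log earlier in this thread; the helper names `parse_wheel_tags` and `looks_like_arm64_macos` are hypothetical, and the check is only a rough heuristic, not how pip itself decides compatibility.

```python
import platform
from pathlib import PurePath


def parse_wheel_tags(filename: str) -> dict:
    """Split a wheel filename into its name, version, and compatibility tags."""
    base = PurePath(filename).name
    if not base.endswith(".whl"):
        raise ValueError(f"not a wheel filename: {filename!r}")
    parts = base[: -len(".whl")].split("-")
    # A wheel filename has 5 fields, or 6 when an optional build tag is present.
    if len(parts) not in (5, 6):
        raise ValueError(f"unexpected wheel filename layout: {filename!r}")
    python_tag, abi_tag, platform_tag = parts[-3:]
    return {
        "name": parts[0],
        "version": parts[1],
        "python": python_tag,
        "abi": abi_tag,
        "platform": platform_tag,
    }


def looks_like_arm64_macos(tags: dict) -> bool:
    """Rough heuristic: the platform tag names macOS and arm64 (or universal2)."""
    plat = tags["platform"]
    return plat.startswith("macosx_") and plat.endswith(("arm64", "universal2"))


if __name__ == "__main__":
    wheel = "onnxruntime_training_cpu-1.16.0.dev20230828001-cp311-cp311-macosx_11_0_arm64.whl"
    tags = parse_wheel_tags(wheel)
    print(tags)
    print("machine:", platform.machine(), "| arm64 macOS wheel:", looks_like_arm64_macos(tags))
```

The `cp311` tags also indicate that this particular file targets CPython 3.11, so a different interpreter version would need a different wheel.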

SichangHe commented 1 year ago

After installing onnxruntime-training-cpu, I cannot import onnxruntime any more.

```sh
$ python3 -m onnxruntime
Traceback (most recent call last):
  File "<frozen runpy>", line 189, in _run_module_as_main
  File "<frozen runpy>", line 148, in _get_module_details
  File "<frozen runpy>", line 112, in _get_module_details
  File "/opt/homebrew/lib/python3.11/site-packages/onnxruntime/__init__.py", line 53, in <module>
    from onnxruntime.capi import onnxruntime_validation
  File "/opt/homebrew/lib/python3.11/site-packages/onnxruntime/capi/onnxruntime_validation.py", line 145, in <module>
    has_ortmodule, package_name, version, cuda_version = validate_build_package_info()
                                                         ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/opt/homebrew/lib/python3.11/site-packages/onnxruntime/capi/onnxruntime_validation.py", line 140, in validate_build_package_info
    raise import_ortmodule_exception
  File "/opt/homebrew/lib/python3.11/site-packages/onnxruntime/capi/onnxruntime_validation.py", line 70, in validate_build_package_info
    from onnxruntime.training.ortmodule import ORTModule  # noqa: F401
    ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/opt/homebrew/lib/python3.11/site-packages/onnxruntime/training/ortmodule/__init__.py", line 21, in <module>
    raise RuntimeError("ORTModule is not supported on this platform.")
RuntimeError: ORTModule is not supported on this platform.
```

After I reinstalled onnxruntime, I still have errors.

```sh
$ pip3 install --force-reinstall onnxruntime
$ python3 artifacts_gen.py
/opt/homebrew/lib/python3.11/site-packages/transformers/configuration_utils.py:380: UserWarning: Passing `gradient_checkpointing` to a config initialization is deprecated and will be removed in v5 Transformers. Using `model.gradient_checkpointing_enable()` instead, or if you are using the `Trainer` API, pass `gradient_checkpointing=True` in your `TrainingArguments`.
  warnings.warn(
/opt/homebrew/lib/python3.11/site-packages/transformers/models/wav2vec2/modeling_wav2vec2.py:595: TracerWarning: Converting a tensor to a Python boolean might cause the trace to be incorrect. We can't record the data flow of Python values, so this value will be treated as a constant in the future. This means that the trace might not generalize to other inputs!
  if attn_weights.size() != (bsz * self.num_heads, tgt_len, src_len):
/opt/homebrew/lib/python3.11/site-packages/transformers/models/wav2vec2/modeling_wav2vec2.py:634: TracerWarning: Converting a tensor to a Python boolean might cause the trace to be incorrect. We can't record the data flow of Python values, so this value will be treated as a constant in the future. This means that the trace might not generalize to other inputs!
  if attn_output.size() != (bsz * self.num_heads, tgt_len, self.head_dim):
================ Diagnostic Run torch.onnx.export version 2.0.1 ================
verbose: False, log level: Level.ERROR
======================= 0 NONE 0 NOTE 0 WARNING 0 ERROR ========================

Traceback (most recent call last):
  File "/Users/sichanghe/AFileFolder/microsoft--onnxruntime-training-examples/on_device_training/mobile/ios/artifacts_gen.py", line 18, in <module>
    import onnxruntime.training.onnxblock as onnxblock
  File "/opt/homebrew/lib/python3.11/site-packages/onnxruntime/training/__init__.py", line 6, in <module>
    from onnxruntime.capi._pybind_state import (
ImportError: cannot import name 'PropagateCastOpsStrategy' from 'onnxruntime.capi._pybind_state' (/opt/homebrew/lib/python3.11/site-packages/onnxruntime/capi/_pybind_state.py)
```

And, if I force reinstall onnxruntime-training-cpu, I am back to the first problem.

```sh
$ pip3 install --force-reinstall onnxruntime_training_cpu-1.16.0.dev20230824001-cp311-cp311-macosx_11_0_arm64.whl onnxruntime-training-cpu
$ python3 artifacts_gen.py
/opt/homebrew/lib/python3.11/site-packages/transformers/configuration_utils.py:380: UserWarning: Passing `gradient_checkpointing` to a config initialization is deprecated and will be removed in v5 Transformers. Using `model.gradient_checkpointing_enable()` instead, or if you are using the `Trainer` API, pass `gradient_checkpointing=True` in your `TrainingArguments`.
  warnings.warn(
/opt/homebrew/lib/python3.11/site-packages/transformers/models/wav2vec2/modeling_wav2vec2.py:595: TracerWarning: Converting a tensor to a Python boolean might cause the trace to be incorrect. We can't record the data flow of Python values, so this value will be treated as a constant in the future. This means that the trace might not generalize to other inputs!
  if attn_weights.size() != (bsz * self.num_heads, tgt_len, src_len):
/opt/homebrew/lib/python3.11/site-packages/transformers/models/wav2vec2/modeling_wav2vec2.py:634: TracerWarning: Converting a tensor to a Python boolean might cause the trace to be incorrect. We can't record the data flow of Python values, so this value will be treated as a constant in the future. This means that the trace might not generalize to other inputs!
  if attn_output.size() != (bsz * self.num_heads, tgt_len, self.head_dim):
================ Diagnostic Run torch.onnx.export version 2.0.1 ================
verbose: False, log level: Level.ERROR
======================= 0 NONE 0 NOTE 0 WARNING 0 ERROR ========================

Traceback (most recent call last):
  File "/Users/sichanghe/AFileFolder/microsoft--onnxruntime-training-examples/on_device_training/mobile/ios/artifacts_gen.py", line 18, in <module>
    import onnxruntime.training.onnxblock as onnxblock
  File "/opt/homebrew/lib/python3.11/site-packages/onnxruntime/__init__.py", line 53, in <module>
    from onnxruntime.capi import onnxruntime_validation
  File "/opt/homebrew/lib/python3.11/site-packages/onnxruntime/capi/onnxruntime_validation.py", line 145, in <module>
    has_ortmodule, package_name, version, cuda_version = validate_build_package_info()
                                                         ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/opt/homebrew/lib/python3.11/site-packages/onnxruntime/capi/onnxruntime_validation.py", line 140, in validate_build_package_info
    raise import_ortmodule_exception
  File "/opt/homebrew/lib/python3.11/site-packages/onnxruntime/capi/onnxruntime_validation.py", line 70, in validate_build_package_info
    from onnxruntime.training.ortmodule import ORTModule  # noqa: F401
    ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/opt/homebrew/lib/python3.11/site-packages/onnxruntime/training/ortmodule/__init__.py", line 21, in <module>
    raise RuntimeError("ORTModule is not supported on this platform.")
RuntimeError: ORTModule is not supported on this platform.
```

baijumeswani commented 1 year ago

> After installing onnxruntime-training-cpu, I cannot import onnxruntime any more.

This seems to be a problem. I will investigate.

> After I reinstalled onnxruntime, I still have errors.

onnxruntime-training-cpu and onnxruntime cannot both be installed at the same time. I would recommend uninstalling onnxruntime-training-cpu before installing onnxruntime.
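As a rough way to spot this situation, the sketch below asks the standard-library `importlib.metadata` which of the two distributions named in this thread are currently installed. The function names `installed_versions` and `describe` are hypothetical, and the check only reads package metadata, so it would not notice stray files left behind under `site-packages/onnxruntime` by an earlier install.

```python
from importlib import metadata

# The two distributions named in this thread. Both place files under
# site-packages/onnxruntime, so only one of them should be installed.
ORT_DISTRIBUTIONS = ("onnxruntime", "onnxruntime-training-cpu")


def installed_versions(names=ORT_DISTRIBUTIONS) -> dict:
    """Map each installed distribution name to its version string."""
    found = {}
    for name in names:
        try:
            found[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            pass  # not installed, nothing to record
    return found


def describe(found: dict) -> str:
    """Summarize whether the installed set looks conflicting."""
    if len(found) > 1:
        listing = ", ".join(f"{n}=={v}" for n, v in sorted(found.items()))
        return f"conflict: {listing} are installed together; uninstall all but one"
    if found:
        (name, version), = found.items()
        return f"ok: only {name}=={version} is installed"
    return "none of the checked distributions is installed"


if __name__ == "__main__":
    print(describe(installed_versions()))
```

Run it with the same interpreter that pip installs into (for example `python3` alongside `pip3`), otherwise the answer describes a different environment.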

> And, if I force reinstall onnxruntime-training-cpu, I am back to the first problem.

Will investigate. In the meantime, if you're looking for the training utilities in onnxruntime, you could try the following imports (not sure if they will work):

import onnxruntime.training.api as orttraining
from onnxruntime.training import artifacts
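To see which of these imports succeed on a given install without stopping at the first failure, a small probe like the one below may help. It is only a sketch: the module list is taken from names that appear in this thread, `probe` is a hypothetical helper, and whether any of the modules load depends on the installed build.

```python
import importlib

# Module names that appear in this thread; none of them is guaranteed to import.
MODULES = (
    "onnxruntime",
    "onnxruntime.training.api",
    "onnxruntime.training.artifacts",
    "onnxruntime.training.onnxblock",
)


def probe(module_name: str):
    """Try one import and return (ok, error_text)."""
    try:
        importlib.import_module(module_name)
    except Exception as exc:  # ImportError and RuntimeError both show up in this thread
        return False, f"{type(exc).__name__}: {exc}"
    return True, None


if __name__ == "__main__":
    for name in MODULES:
        ok, error = probe(name)
        print(f"{name}: {'ok' if ok else error}")
```

Catching the broad `Exception` is deliberate here, because the failures reported above are raised at import time as different exception types.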
baijumeswani commented 1 year ago

For this issue:

$ pip3 install -i https://aiinfra.pkgs.visualstudio.com/PublicPackages/_packaging/ORT-Nightly/pypi/simple/ onnxruntime-training-cpu
Looking in indexes: https://aiinfra.pkgs.visualstudio.com/PublicPackages/_packaging/ORT-Nightly/pypi/simple/
Collecting onnxruntime-training-cpu
  Downloading https://aiinfra.pkgs.visualstudio.com/2692857e-05ef-43b4-ba9c-ccf1c22c437c/_packaging/7982ae20-ed19-4a35-a362-a96ac99897b7/pypi/download/onnxruntime-training-cpu/1.16.dev20230828001/onnxruntime_training_cpu-1.16.0.dev20230828001-cp311-cp311-macosx_11_0_arm64.whl (8.0 MB)
     ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 8.0/8.0 MB 1.1 MB/s eta 0:00:00
User for aiinfra.pkgs.visualstudio.com:

Please install all dependencies from PyPI first, and then install onnxruntime-training-cpu from the index I shared.

pip install cerberus flatbuffers h5py numpy onnx packaging protobuf sympy setuptools
pip install -i https://aiinfra.pkgs.visualstudio.com/PublicPackages/_packaging/ORT-Nightly/pypi/simple/ onnxruntime-training-cpu

SichangHe commented 1 year ago

@baijumeswani, thanks for the updated instructions. Unfortunately, I still cannot successfully run the generation script.

I first uninstalled onnxruntime.

```sh
$ pip3 uninstall onnxruntime
Found existing installation: onnxruntime 1.15.1
Uninstalling onnxruntime-1.15.1:
  Would remove:
    /opt/homebrew/bin/onnxruntime_test
    /opt/homebrew/lib/python3.11/site-packages/onnxruntime-1.15.1.dist-info/*
    /opt/homebrew/lib/python3.11/site-packages/onnxruntime/*
  Would not remove (might be manually added):
    /opt/homebrew/lib/python3.11/site-packages/onnxruntime/capi/build_and_package_info.py
    /opt/homebrew/lib/python3.11/site-packages/onnxruntime/capi/checkpointing_utils.py
    /opt/homebrew/lib/python3.11/site-packages/onnxruntime/capi/ort_trainer.py
    /opt/homebrew/lib/python3.11/site-packages/onnxruntime/capi/pt_patch.py
    /opt/homebrew/lib/python3.11/site-packages/onnxruntime/capi/training/training_session.py
    /opt/homebrew/lib/python3.11/site-packages/onnxruntime/quantization/matmul_weight4_quantizer.py
    /opt/homebrew/lib/python3.11/site-packages/onnxruntime/tools/ort_format_model/ort_flatbuffers_py/fbs/Checkpoint.py
    /opt/homebrew/lib/python3.11/site-packages/onnxruntime/tools/ort_format_model/ort_flatbuffers_py/fbs/FloatProperty.py
    /opt/homebrew/lib/python3.11/site-packages/onnxruntime/tools/ort_format_model/ort_flatbuffers_py/fbs/IntProperty.py
    /opt/homebrew/lib/python3.11/site-packages/onnxruntime/tools/ort_format_model/ort_flatbuffers_py/fbs/ModuleState.py
    /opt/homebrew/lib/python3.11/site-packages/onnxruntime/tools/ort_format_model/ort_flatbuffers_py/fbs/OptimizerGroup.py
    /opt/homebrew/lib/python3.11/site-packages/onnxruntime/tools/ort_format_model/ort_flatbuffers_py/fbs/ParameterOptimizerState.py
    /opt/homebrew/lib/python3.11/site-packages/onnxruntime/tools/ort_format_model/ort_flatbuffers_py/fbs/PropertyBag.py
    /opt/homebrew/lib/python3.11/site-packages/onnxruntime/tools/ort_format_model/ort_flatbuffers_py/fbs/StringProperty.py
    /opt/homebrew/lib/python3.11/site-packages/onnxruntime/training/__init__.py
    /opt/homebrew/lib/python3.11/site-packages/onnxruntime/training/_checkpoint_storage.py
    /opt/homebrew/lib/python3.11/site-packages/onnxruntime/training/_utils.py
    /opt/homebrew/lib/python3.11/site-packages/onnxruntime/training/amp/__init__.py
    /opt/homebrew/lib/python3.11/site-packages/onnxruntime/training/amp/loss_scaler.py
    /opt/homebrew/lib/python3.11/site-packages/onnxruntime/training/api/__init__.py
    /opt/homebrew/lib/python3.11/site-packages/onnxruntime/training/api/checkpoint_state.py
    /opt/homebrew/lib/python3.11/site-packages/onnxruntime/training/api/lr_scheduler.py
    /opt/homebrew/lib/python3.11/site-packages/onnxruntime/training/api/module.py
    /opt/homebrew/lib/python3.11/site-packages/onnxruntime/training/api/optimizer.py
    /opt/homebrew/lib/python3.11/site-packages/onnxruntime/training/artifacts.py
    /opt/homebrew/lib/python3.11/site-packages/onnxruntime/training/checkpoint.py
    /opt/homebrew/lib/python3.11/site-packages/onnxruntime/training/experimental/__init__.py
    /opt/homebrew/lib/python3.11/site-packages/onnxruntime/training/experimental/exporter.py
    /opt/homebrew/lib/python3.11/site-packages/onnxruntime/training/experimental/gradient_graph/__init__.py
    /opt/homebrew/lib/python3.11/site-packages/onnxruntime/training/experimental/gradient_graph/_gradient_graph_tools.py
    /opt/homebrew/lib/python3.11/site-packages/onnxruntime/training/model_desc_validation.py
    /opt/homebrew/lib/python3.11/site-packages/onnxruntime/training/onnxblock/__init__.py
    /opt/homebrew/lib/python3.11/site-packages/onnxruntime/training/onnxblock/_graph_utils.py
    /opt/homebrew/lib/python3.11/site-packages/onnxruntime/training/onnxblock/_training_graph_utils.py
    /opt/homebrew/lib/python3.11/site-packages/onnxruntime/training/onnxblock/blocks.py
    /opt/homebrew/lib/python3.11/site-packages/onnxruntime/training/onnxblock/checkpoint_utils.py
    /opt/homebrew/lib/python3.11/site-packages/onnxruntime/training/onnxblock/loss/__init__.py
    /opt/homebrew/lib/python3.11/site-packages/onnxruntime/training/onnxblock/loss/loss.py
    /opt/homebrew/lib/python3.11/site-packages/onnxruntime/training/onnxblock/model_accessor.py
    /opt/homebrew/lib/python3.11/site-packages/onnxruntime/training/onnxblock/onnxblock.py
    /opt/homebrew/lib/python3.11/site-packages/onnxruntime/training/onnxblock/optim/__init__.py
    /opt/homebrew/lib/python3.11/site-packages/onnxruntime/training/onnxblock/optim/optim.py
    /opt/homebrew/lib/python3.11/site-packages/onnxruntime/training/optim/__init__.py
    /opt/homebrew/lib/python3.11/site-packages/onnxruntime/training/optim/_apex_amp_modifier.py
    /opt/homebrew/lib/python3.11/site-packages/onnxruntime/training/optim/_ds_modifier.py
    /opt/homebrew/lib/python3.11/site-packages/onnxruntime/training/optim/_megatron_modifier.py
    /opt/homebrew/lib/python3.11/site-packages/onnxruntime/training/optim/_modifier.py
    /opt/homebrew/lib/python3.11/site-packages/onnxruntime/training/optim/_modifier_registry.py
    /opt/homebrew/lib/python3.11/site-packages/onnxruntime/training/optim/_multi_tensor_apply.py
    /opt/homebrew/lib/python3.11/site-packages/onnxruntime/training/optim/config.py
    /opt/homebrew/lib/python3.11/site-packages/onnxruntime/training/optim/fp16_optimizer.py
    /opt/homebrew/lib/python3.11/site-packages/onnxruntime/training/optim/fused_adam.py
    /opt/homebrew/lib/python3.11/site-packages/onnxruntime/training/optim/lr_scheduler.py
    /opt/homebrew/lib/python3.11/site-packages/onnxruntime/training/ort_triton/__init__.py
    /opt/homebrew/lib/python3.11/site-packages/onnxruntime/training/ort_triton/_cache.py
    /opt/homebrew/lib/python3.11/site-packages/onnxruntime/training/ort_triton/_codegen.py
    /opt/homebrew/lib/python3.11/site-packages/onnxruntime/training/ort_triton/_common.py
    /opt/homebrew/lib/python3.11/site-packages/onnxruntime/training/ort_triton/_decompose.py
    /opt/homebrew/lib/python3.11/site-packages/onnxruntime/training/ort_triton/_ir.py
    /opt/homebrew/lib/python3.11/site-packages/onnxruntime/training/ort_triton/_lowering.py
    /opt/homebrew/lib/python3.11/site-packages/onnxruntime/training/ort_triton/_op_config.py
    /opt/homebrew/lib/python3.11/site-packages/onnxruntime/training/ort_triton/_sorted_graph.py
    /opt/homebrew/lib/python3.11/site-packages/onnxruntime/training/ort_triton/_sympy_utils.py
    /opt/homebrew/lib/python3.11/site-packages/onnxruntime/training/ort_triton/_utils.py
    /opt/homebrew/lib/python3.11/site-packages/onnxruntime/training/ort_triton/kernel/__init__.py
    /opt/homebrew/lib/python3.11/site-packages/onnxruntime/training/ort_triton/kernel/_mm.py
    /opt/homebrew/lib/python3.11/site-packages/onnxruntime/training/ort_triton/kernel/_slice_scel.py
    /opt/homebrew/lib/python3.11/site-packages/onnxruntime/training/ort_triton/triton_op_executor.py
    /opt/homebrew/lib/python3.11/site-packages/onnxruntime/training/ortmodule/__init__.py
    /opt/homebrew/lib/python3.11/site-packages/onnxruntime/training/ortmodule/_custom_autograd_function.py
    /opt/homebrew/lib/python3.11/site-packages/onnxruntime/training/ortmodule/_custom_autograd_function_exporter.py
    /opt/homebrew/lib/python3.11/site-packages/onnxruntime/training/ortmodule/_custom_autograd_function_runner.py
    /opt/homebrew/lib/python3.11/site-packages/onnxruntime/training/ortmodule/_custom_gradient_registry.py
    /opt/homebrew/lib/python3.11/site-packages/onnxruntime/training/ortmodule/_custom_op_symbolic_registry.py
    /opt/homebrew/lib/python3.11/site-packages/onnxruntime/training/ortmodule/_execution_agent.py
    /opt/homebrew/lib/python3.11/site-packages/onnxruntime/training/ortmodule/_fallback.py
    /opt/homebrew/lib/python3.11/site-packages/onnxruntime/training/ortmodule/_fallback_exceptions.py
    /opt/homebrew/lib/python3.11/site-packages/onnxruntime/training/ortmodule/_gradient_accumulation_manager.py
    /opt/homebrew/lib/python3.11/site-packages/onnxruntime/training/ortmodule/_graph_execution_interface.py
    /opt/homebrew/lib/python3.11/site-packages/onnxruntime/training/ortmodule/_graph_execution_manager.py
    /opt/homebrew/lib/python3.11/site-packages/onnxruntime/training/ortmodule/_graph_execution_manager_factory.py
    /opt/homebrew/lib/python3.11/site-packages/onnxruntime/training/ortmodule/_inference_manager.py
    /opt/homebrew/lib/python3.11/site-packages/onnxruntime/training/ortmodule/_io.py
    /opt/homebrew/lib/python3.11/site-packages/onnxruntime/training/ortmodule/_logger.py
    /opt/homebrew/lib/python3.11/site-packages/onnxruntime/training/ortmodule/_onnx_models.py
    /opt/homebrew/lib/python3.11/site-packages/onnxruntime/training/ortmodule/_runtime_inspector.py
    /opt/homebrew/lib/python3.11/site-packages/onnxruntime/training/ortmodule/_torch_module_factory.py
    /opt/homebrew/lib/python3.11/site-packages/onnxruntime/training/ortmodule/_torch_module_interface.py
    /opt/homebrew/lib/python3.11/site-packages/onnxruntime/training/ortmodule/_torch_module_ort.py
    /opt/homebrew/lib/python3.11/site-packages/onnxruntime/training/ortmodule/_torch_module_pytorch.py
    /opt/homebrew/lib/python3.11/site-packages/onnxruntime/training/ortmodule/_training_manager.py
    /opt/homebrew/lib/python3.11/site-packages/onnxruntime/training/ortmodule/_utils.py
    /opt/homebrew/lib/python3.11/site-packages/onnxruntime/training/ortmodule/experimental/__init__.py
    /opt/homebrew/lib/python3.11/site-packages/onnxruntime/training/ortmodule/experimental/hierarchical_ortmodule/__init__.py
    /opt/homebrew/lib/python3.11/site-packages/onnxruntime/training/ortmodule/experimental/hierarchical_ortmodule/_hierarchical_ortmodule.py
    /opt/homebrew/lib/python3.11/site-packages/onnxruntime/training/ortmodule/experimental/json_config/__init__.py
    /opt/homebrew/lib/python3.11/site-packages/onnxruntime/training/ortmodule/experimental/json_config/_load_config_from_json.py
    /opt/homebrew/lib/python3.11/site-packages/onnxruntime/training/ortmodule/graph_transformer_registry.py
    /opt/homebrew/lib/python3.11/site-packages/onnxruntime/training/ortmodule/options.py
    /opt/homebrew/lib/python3.11/site-packages/onnxruntime/training/ortmodule/ortmodule.py
    /opt/homebrew/lib/python3.11/site-packages/onnxruntime/training/ortmodule/torch_cpp_extensions/__init__.py
    /opt/homebrew/lib/python3.11/site-packages/onnxruntime/training/ortmodule/torch_cpp_extensions/cpu/aten_op_executor/__init__.py
    /opt/homebrew/lib/python3.11/site-packages/onnxruntime/training/ortmodule/torch_cpp_extensions/cpu/aten_op_executor/aten_op_executor.cc
    /opt/homebrew/lib/python3.11/site-packages/onnxruntime/training/ortmodule/torch_cpp_extensions/cpu/aten_op_executor/setup.py
    /opt/homebrew/lib/python3.11/site-packages/onnxruntime/training/ortmodule/torch_cpp_extensions/cpu/torch_interop_utils/__init__.py
    /opt/homebrew/lib/python3.11/site-packages/onnxruntime/training/ortmodule/torch_cpp_extensions/cpu/torch_interop_utils/setup.py
    /opt/homebrew/lib/python3.11/site-packages/onnxruntime/training/ortmodule/torch_cpp_extensions/cpu/torch_interop_utils/torch_interop_utils.cc
    /opt/homebrew/lib/python3.11/site-packages/onnxruntime/training/ortmodule/torch_cpp_extensions/cuda/fused_ops/fused_ops_frontend.cpp
    /opt/homebrew/lib/python3.11/site-packages/onnxruntime/training/ortmodule/torch_cpp_extensions/cuda/fused_ops/multi_tensor_adam.cu
    /opt/homebrew/lib/python3.11/site-packages/onnxruntime/training/ortmodule/torch_cpp_extensions/cuda/fused_ops/multi_tensor_apply.cuh
    /opt/homebrew/lib/python3.11/site-packages/onnxruntime/training/ortmodule/torch_cpp_extensions/cuda/fused_ops/multi_tensor_axpby_kernel.cu
    /opt/homebrew/lib/python3.11/site-packages/onnxruntime/training/ortmodule/torch_cpp_extensions/cuda/fused_ops/multi_tensor_l2norm_kernel.cu
    /opt/homebrew/lib/python3.11/site-packages/onnxruntime/training/ortmodule/torch_cpp_extensions/cuda/fused_ops/multi_tensor_scale_kernel.cu
    /opt/homebrew/lib/python3.11/site-packages/onnxruntime/training/ortmodule/torch_cpp_extensions/cuda/fused_ops/setup.py
    /opt/homebrew/lib/python3.11/site-packages/onnxruntime/training/ortmodule/torch_cpp_extensions/cuda/fused_ops/type_shim.h
    /opt/homebrew/lib/python3.11/site-packages/onnxruntime/training/ortmodule/torch_cpp_extensions/cuda/torch_gpu_allocator/setup.py
    /opt/homebrew/lib/python3.11/site-packages/onnxruntime/training/ortmodule/torch_cpp_extensions/cuda/torch_gpu_allocator/torch_gpu_allocator.cc
    /opt/homebrew/lib/python3.11/site-packages/onnxruntime/training/ortmodule/torch_cpp_extensions/install.py
    /opt/homebrew/lib/python3.11/site-packages/onnxruntime/training/orttrainer.py
    /opt/homebrew/lib/python3.11/site-packages/onnxruntime/training/orttrainer_options.py
    /opt/homebrew/lib/python3.11/site-packages/onnxruntime/training/postprocess.py
    /opt/homebrew/lib/python3.11/site-packages/onnxruntime/training/torchdynamo/__init__.py
    /opt/homebrew/lib/python3.11/site-packages/onnxruntime/training/torchdynamo/ort_backend.py
    /opt/homebrew/lib/python3.11/site-packages/onnxruntime/training/torchdynamo/register_backend.py
    /opt/homebrew/lib/python3.11/site-packages/onnxruntime/training/utils/__init__.py
    /opt/homebrew/lib/python3.11/site-packages/onnxruntime/training/utils/data/__init__.py
    /opt/homebrew/lib/python3.11/site-packages/onnxruntime/training/utils/data/sampler.py
    /opt/homebrew/lib/python3.11/site-packages/onnxruntime/training/utils/hooks/__init__.py
    /opt/homebrew/lib/python3.11/site-packages/onnxruntime/training/utils/hooks/_statistics_subscriber.py
    /opt/homebrew/lib/python3.11/site-packages/onnxruntime/training/utils/hooks/_subscriber_base.py
    /opt/homebrew/lib/python3.11/site-packages/onnxruntime/training/utils/hooks/_subscriber_manager.py
    /opt/homebrew/lib/python3.11/site-packages/onnxruntime/training/utils/hooks/merge_activation_summary.py
    /opt/homebrew/lib/python3.11/site-packages/onnxruntime/training/utils/torch_io_helper.py
    /opt/homebrew/lib/python3.11/site-packages/onnxruntime/transformers/models/bart/__init__.py
    /opt/homebrew/lib/python3.11/site-packages/onnxruntime/transformers/models/llama/__init__.py
    /opt/homebrew/lib/python3.11/site-packages/onnxruntime/transformers/models/llama/benchmark.py
    /opt/homebrew/lib/python3.11/site-packages/onnxruntime/transformers/models/llama/benchmark_all.py
    /opt/homebrew/lib/python3.11/site-packages/onnxruntime/transformers/models/llama/convert_to_onnx.py
    /opt/homebrew/lib/python3.11/site-packages/onnxruntime/transformers/models/llama/llama_inputs.py
    /opt/homebrew/lib/python3.11/site-packages/onnxruntime/transformers/models/llama/llama_parity.py
    /opt/homebrew/lib/python3.11/site-packages/onnxruntime/transformers/models/llama/quant_kv_dataloader.py
    /opt/homebrew/lib/python3.11/site-packages/onnxruntime/transformers/models/stable_diffusion/models.py
    /opt/homebrew/lib/python3.11/site-packages/onnxruntime/transformers/models/stable_diffusion/onnxruntime_cuda_txt2img.py
    /opt/homebrew/lib/python3.11/site-packages/onnxruntime/transformers/models/stable_diffusion/onnxruntime_tensorrt_txt2img.py
    /opt/homebrew/lib/python3.11/site-packages/onnxruntime/transformers/models/stable_diffusion/ort_optimizer.py
    /opt/homebrew/lib/python3.11/site-packages/onnxruntime/transformers/models/stable_diffusion/ort_utils.py
    /opt/homebrew/lib/python3.11/site-packages/onnxruntime/transformers/models/whisper/benchmark.py
    /opt/homebrew/lib/python3.11/site-packages/onnxruntime/transformers/models/whisper/benchmark_all.py
Proceed (Y/n)? Y
  Successfully uninstalled onnxruntime-1.15.1
```

Then, I installed the dependencies and onnxruntime-training-cpu.

```sh
$ pip3 install cerberus flatbuffers h5py numpy onnx packaging protobuf sympy setuptools
Requirement already satisfied: cerberus in /opt/homebrew/lib/python3.11/site-packages (1.3.5)
Requirement already satisfied: flatbuffers in /opt/homebrew/lib/python3.11/site-packages (23.5.26)
Requirement already satisfied: h5py in /opt/homebrew/lib/python3.11/site-packages (3.9.0)
Requirement already satisfied: numpy in /opt/homebrew/lib/python3.11/site-packages (1.25.2)
Requirement already satisfied: onnx in /opt/homebrew/lib/python3.11/site-packages (1.14.1)
Requirement already satisfied: packaging in /opt/homebrew/lib/python3.11/site-packages (23.1)
Requirement already satisfied: protobuf in /opt/homebrew/lib/python3.11/site-packages (3.20.3)
Requirement already satisfied: sympy in /opt/homebrew/lib/python3.11/site-packages (1.12)
Requirement already satisfied: setuptools in /opt/homebrew/lib/python3.11/site-packages (68.1.2)
Requirement already satisfied: typing-extensions>=3.6.2.1 in /opt/homebrew/lib/python3.11/site-packages (from onnx) (4.7.1)
Requirement already satisfied: mpmath>=0.19 in /opt/homebrew/lib/python3.11/site-packages (from sympy) (1.3.0)
$ pip3 install -i https://aiinfra.pkgs.visualstudio.com/PublicPackages/_packaging/ORT-Nightly/pypi/simple/ onnxruntime-training-cpu
Looking in indexes: https://aiinfra.pkgs.visualstudio.com/PublicPackages/_packaging/ORT-Nightly/pypi/simple/
Requirement already satisfied: onnxruntime-training-cpu in /opt/homebrew/lib/python3.11/site-packages (1.16.0.dev20230824001)
Requirement already satisfied: cerberus in /opt/homebrew/lib/python3.11/site-packages (from onnxruntime-training-cpu) (1.3.5)
Requirement already satisfied: flatbuffers in /opt/homebrew/lib/python3.11/site-packages (from onnxruntime-training-cpu) (23.5.26)
Requirement already satisfied: h5py in /opt/homebrew/lib/python3.11/site-packages (from onnxruntime-training-cpu) (3.9.0)
Requirement already satisfied: numpy>=1.16.6 in /opt/homebrew/lib/python3.11/site-packages (from onnxruntime-training-cpu) (1.25.2)
Requirement already satisfied: onnx in /opt/homebrew/lib/python3.11/site-packages (from onnxruntime-training-cpu) (1.14.1)
Requirement already satisfied: packaging in /opt/homebrew/lib/python3.11/site-packages (from onnxruntime-training-cpu) (23.1)
Requirement already satisfied: protobuf in /opt/homebrew/lib/python3.11/site-packages (from onnxruntime-training-cpu) (3.20.3)
Requirement already satisfied: sympy in /opt/homebrew/lib/python3.11/site-packages (from onnxruntime-training-cpu) (1.12)
Requirement already satisfied: setuptools>=41.4.0 in /opt/homebrew/lib/python3.11/site-packages (from onnxruntime-training-cpu) (68.1.2)
Requirement already satisfied: typing-extensions>=3.6.2.1 in /opt/homebrew/lib/python3.11/site-packages (from onnx->onnxruntime-training-cpu) (4.7.1)
Requirement already satisfied: mpmath>=0.19 in /opt/homebrew/lib/python3.11/site-packages (from sympy->onnxruntime-training-cpu) (1.3.0)
```

Pip skipped the installation, so I uninstalled onnxruntime-training-cpu and reinstalled it.

```sh
$ pip3 uninstall onnxruntime-training-cpu
Found existing installation: onnxruntime-training-cpu 1.16.0.dev20230824001
Uninstalling onnxruntime-training-cpu-1.16.0.dev20230824001:
  Would remove:
    /opt/homebrew/lib/python3.11/site-packages/onnxruntime/capi/build_and_package_info.py
    /opt/homebrew/lib/python3.11/site-packages/onnxruntime/capi/checkpointing_utils.py
    /opt/homebrew/lib/python3.11/site-packages/onnxruntime/capi/ort_trainer.py
    /opt/homebrew/lib/python3.11/site-packages/onnxruntime/capi/pt_patch.py
    /opt/homebrew/lib/python3.11/site-packages/onnxruntime/capi/training/training_session.py
    /opt/homebrew/lib/python3.11/site-packages/onnxruntime/quantization/matmul_weight4_quantizer.py
    /opt/homebrew/lib/python3.11/site-packages/onnxruntime/tools/ort_format_model/ort_flatbuffers_py/fbs/Checkpoint.py
    /opt/homebrew/lib/python3.11/site-packages/onnxruntime/tools/ort_format_model/ort_flatbuffers_py/fbs/FloatProperty.py
    /opt/homebrew/lib/python3.11/site-packages/onnxruntime/tools/ort_format_model/ort_flatbuffers_py/fbs/IntProperty.py
    /opt/homebrew/lib/python3.11/site-packages/onnxruntime/tools/ort_format_model/ort_flatbuffers_py/fbs/ModuleState.py
    /opt/homebrew/lib/python3.11/site-packages/onnxruntime/tools/ort_format_model/ort_flatbuffers_py/fbs/OptimizerGroup.py
    /opt/homebrew/lib/python3.11/site-packages/onnxruntime/tools/ort_format_model/ort_flatbuffers_py/fbs/ParameterOptimizerState.py
    /opt/homebrew/lib/python3.11/site-packages/onnxruntime/tools/ort_format_model/ort_flatbuffers_py/fbs/PropertyBag.py
    /opt/homebrew/lib/python3.11/site-packages/onnxruntime/tools/ort_format_model/ort_flatbuffers_py/fbs/StringProperty.py
    /opt/homebrew/lib/python3.11/site-packages/onnxruntime/training/*
    /opt/homebrew/lib/python3.11/site-packages/onnxruntime/transformers/models/bart/*
    /opt/homebrew/lib/python3.11/site-packages/onnxruntime/transformers/models/llama/*
    /opt/homebrew/lib/python3.11/site-packages/onnxruntime/transformers/models/stable_diffusion/models.py
    /opt/homebrew/lib/python3.11/site-packages/onnxruntime/transformers/models/stable_diffusion/onnxruntime_cuda_txt2img.py
    /opt/homebrew/lib/python3.11/site-packages/onnxruntime/transformers/models/stable_diffusion/onnxruntime_tensorrt_txt2img.py
    /opt/homebrew/lib/python3.11/site-packages/onnxruntime/transformers/models/stable_diffusion/ort_optimizer.py
    /opt/homebrew/lib/python3.11/site-packages/onnxruntime/transformers/models/stable_diffusion/ort_utils.py
    /opt/homebrew/lib/python3.11/site-packages/onnxruntime/transformers/models/whisper/benchmark.py
    /opt/homebrew/lib/python3.11/site-packages/onnxruntime/transformers/models/whisper/benchmark_all.py
    /opt/homebrew/lib/python3.11/site-packages/onnxruntime_training_cpu-1.16.0.dev20230824001.dist-info/*
Proceed (Y/n)? Y
  Successfully uninstalled onnxruntime-training-cpu-1.16.0.dev20230824001
$ pip3 install cerberus flatbuffers h5py numpy onnx packaging protobuf sympy setuptools
Requirement already satisfied: cerberus in /opt/homebrew/lib/python3.11/site-packages (1.3.5)
Requirement already satisfied: flatbuffers in /opt/homebrew/lib/python3.11/site-packages (23.5.26)
Requirement already satisfied: h5py in /opt/homebrew/lib/python3.11/site-packages (3.9.0)
Requirement already satisfied: numpy in /opt/homebrew/lib/python3.11/site-packages (1.25.2)
Requirement already satisfied: onnx in /opt/homebrew/lib/python3.11/site-packages (1.14.1)
Requirement already satisfied: packaging in /opt/homebrew/lib/python3.11/site-packages (23.1)
Requirement already satisfied: protobuf in /opt/homebrew/lib/python3.11/site-packages (3.20.3)
Requirement already satisfied: sympy in /opt/homebrew/lib/python3.11/site-packages (1.12)
Requirement already satisfied: setuptools in /opt/homebrew/lib/python3.11/site-packages (68.1.2)
Requirement already satisfied: typing-extensions>=3.6.2.1 in /opt/homebrew/lib/python3.11/site-packages (from onnx) (4.7.1)
Requirement already satisfied: mpmath>=0.19 in /opt/homebrew/lib/python3.11/site-packages (from sympy) (1.3.0)
$ pip3 install -i https://aiinfra.pkgs.visualstudio.com/PublicPackages/_packaging/ORT-Nightly/pypi/simple/ onnxruntime-training-cpu
Looking in indexes: https://aiinfra.pkgs.visualstudio.com/PublicPackages/_packaging/ORT-Nightly/pypi/simple/
Collecting onnxruntime-training-cpu
  Downloading https://aiinfra.pkgs.visualstudio.com/2692857e-05ef-43b4-ba9c-ccf1c22c437c/_packaging/7982ae20-ed19-4a35-a362-a96ac99897b7/pypi/download/onnxruntime-training-cpu/1.16.dev20230830001/onnxruntime_training_cpu-1.16.0.dev20230830001-cp311-cp311-macosx_11_0_arm64.whl (8.0 MB)
     ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 8.0/8.0 MB 73.3 kB/s eta 0:00:00
Requirement already satisfied: cerberus in /opt/homebrew/lib/python3.11/site-packages (from onnxruntime-training-cpu) (1.3.5)
Requirement already satisfied: flatbuffers in /opt/homebrew/lib/python3.11/site-packages (from onnxruntime-training-cpu) (23.5.26)
Requirement already satisfied: h5py in /opt/homebrew/lib/python3.11/site-packages (from onnxruntime-training-cpu) (3.9.0)
Requirement already satisfied: numpy>=1.16.6 in /opt/homebrew/lib/python3.11/site-packages (from onnxruntime-training-cpu) (1.25.2)
Requirement already satisfied: onnx in /opt/homebrew/lib/python3.11/site-packages (from onnxruntime-training-cpu) (1.14.1)
Requirement already satisfied: packaging in /opt/homebrew/lib/python3.11/site-packages (from onnxruntime-training-cpu) (23.1)
Requirement already satisfied: protobuf in /opt/homebrew/lib/python3.11/site-packages (from onnxruntime-training-cpu) (3.20.3)
Requirement already satisfied: sympy in /opt/homebrew/lib/python3.11/site-packages (from onnxruntime-training-cpu) (1.12)
Requirement already satisfied: setuptools>=41.4.0 in /opt/homebrew/lib/python3.11/site-packages (from onnxruntime-training-cpu) (68.1.2)
Requirement already satisfied: typing-extensions>=3.6.2.1 in /opt/homebrew/lib/python3.11/site-packages (from onnx->onnxruntime-training-cpu) (4.7.1)
Requirement already satisfied: mpmath>=0.19 in /opt/homebrew/lib/python3.11/site-packages (from sympy->onnxruntime-training-cpu) (1.3.0)
Installing collected packages: onnxruntime-training-cpu
Successfully installed onnxruntime-training-cpu-1.16.0.dev20230830001
```

However, when I ran the generation script again, it failed with the same error as before:

$ python3 artifacts_gen.py
/opt/homebrew/lib/python3.11/site-packages/transformers/configuration_utils.py:380: UserWarning: Passing `gradient_checkpointing` to a config initialization is deprecated and will be removed in v5 Transformers. Using `model.gradient_checkpointing_enable()` instead, or if you are using the `Trainer` API, pass `gradient_checkpointing=True` in your `TrainingArguments`.
  warnings.warn(
/opt/homebrew/lib/python3.11/site-packages/transformers/models/wav2vec2/modeling_wav2vec2.py:595: TracerWarning: Converting a tensor to a Python boolean might cause the trace to be incorrect. We can't record the data flow of Python values, so this value will be treated as a constant in the future. This means that the trace might not generalize to other inputs!
  if attn_weights.size() != (bsz * self.num_heads, tgt_len, src_len):
/opt/homebrew/lib/python3.11/site-packages/transformers/models/wav2vec2/modeling_wav2vec2.py:634: TracerWarning: Converting a tensor to a Python boolean might cause the trace to be incorrect. We can't record the data flow of Python values, so this value will be treated as a constant in the future. This means that the trace might not generalize to other inputs!
  if attn_output.size() != (bsz * self.num_heads, tgt_len, self.head_dim):
================ Diagnostic Run torch.onnx.export version 2.0.1 ================
verbose: False, log level: Level.ERROR
======================= 0 NONE 0 NOTE 0 WARNING 0 ERROR ========================

Traceback (most recent call last):
  File "/Users/sichanghe/AFileFolder/microsoft--onnxruntime-training-examples/on_device_training/mobile/ios/artifacts_gen.py", line 18, in <module>
    import onnxruntime.training.onnxblock as onnxblock
  File "/opt/homebrew/lib/python3.11/site-packages/onnxruntime/__init__.py", line 53, in <module>
    from onnxruntime.capi import onnxruntime_validation
  File "/opt/homebrew/lib/python3.11/site-packages/onnxruntime/capi/onnxruntime_validation.py", line 145, in <module>
    has_ortmodule, package_name, version, cuda_version = validate_build_package_info()
                                                         ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/opt/homebrew/lib/python3.11/site-packages/onnxruntime/capi/onnxruntime_validation.py", line 140, in validate_build_package_info
    raise import_ortmodule_exception
  File "/opt/homebrew/lib/python3.11/site-packages/onnxruntime/capi/onnxruntime_validation.py", line 70, in validate_build_package_info
    from onnxruntime.training.ortmodule import ORTModule  # noqa: F401
    ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/opt/homebrew/lib/python3.11/site-packages/onnxruntime/training/ortmodule/__init__.py", line 21, in <module>
    raise RuntimeError("ORTModule is not supported on this platform.")
RuntimeError: ORTModule is not supported on this platform.

For your information, @baijumeswani.

baijumeswani commented 1 year ago

I updated the code in https://github.com/microsoft/onnxruntime/pull/17380. Could you uninstall onnxruntime and onnxruntime-training-cpu and try to install the latest onnxruntime-training-cpu nightly using

pip install cerberus flatbuffers h5py numpy onnx packaging protobuf sympy setuptools
pip install -i https://aiinfra.pkgs.visualstudio.com/PublicPackages/_packaging/ORT-Nightly/pypi/simple/ onnxruntime-training-cpu
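
For anyone following along, it can help to confirm which onnxruntime distributions are present before and after the uninstall step. The following is only a minimal sketch using the standard-library `importlib.metadata` module; the helper name `find_onnxruntime_distributions` is made up for illustration and is not part of onnxruntime.

```python
from importlib import metadata


def find_onnxruntime_distributions():
    """Return sorted (name, version) pairs for installed distributions named onnxruntime*.

    Hypothetical helper for illustration; it only reads package metadata
    and never imports onnxruntime itself.
    """
    found = set()
    for dist in metadata.distributions():
        name = dist.metadata.get("Name") or ""
        if name.lower().startswith("onnxruntime"):
            found.add((name, dist.version))
    return sorted(found)


if __name__ == "__main__":
    installed = find_onnxruntime_distributions()
    if not installed:
        print("No onnxruntime distribution is installed.")
    for name, version in installed:
        print(f"{name}=={version}")
    if len(installed) > 1:
        # Assumption: several onnxruntime* distributions share the same
        # site-packages/onnxruntime directory, so they may conflict.
        print("More than one onnxruntime distribution found; consider uninstalling all of them first.")
```

Because the check only reads metadata, it should still run when `import onnxruntime` itself raises an error.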

SichangHe commented 1 year ago

It works. I successfully ran artifacts_gen.py.

Full command output.

```sh
$ pip3 install cerberus flatbuffers h5py numpy onnx packaging protobuf sympy setuptools
Requirement already satisfied: cerberus in /opt/homebrew/lib/python3.11/site-packages (1.3.5)
Requirement already satisfied: flatbuffers in /opt/homebrew/lib/python3.11/site-packages (23.5.26)
Requirement already satisfied: h5py in /opt/homebrew/lib/python3.11/site-packages (3.9.0)
Requirement already satisfied: numpy in /opt/homebrew/lib/python3.11/site-packages (1.25.2)
Requirement already satisfied: onnx in /opt/homebrew/lib/python3.11/site-packages (1.14.1)
Requirement already satisfied: packaging in /opt/homebrew/lib/python3.11/site-packages (23.1)
Requirement already satisfied: protobuf in /opt/homebrew/lib/python3.11/site-packages (3.20.3)
Requirement already satisfied: sympy in /opt/homebrew/lib/python3.11/site-packages (1.12)
Requirement already satisfied: setuptools in /opt/homebrew/lib/python3.11/site-packages (68.1.2)
Requirement already satisfied: typing-extensions>=3.6.2.1 in /opt/homebrew/lib/python3.11/site-packages (from onnx) (4.7.1)
Requirement already satisfied: mpmath>=0.19 in /opt/homebrew/lib/python3.11/site-packages (from sympy) (1.3.0)
$ pip3 install -i https://aiinfra.pkgs.visualstudio.com/PublicPackages/_packaging/ORT-Nightly/pypi/simple/ onnxruntime-training-cpu
Looking in indexes: https://aiinfra.pkgs.visualstudio.com/PublicPackages/_packaging/ORT-Nightly/pypi/simple/
Collecting onnxruntime-training-cpu
  Downloading https://aiinfra.pkgs.visualstudio.com/2692857e-05ef-43b4-ba9c-ccf1c22c437c/_packaging/7982ae20-ed19-4a35-a362-a96ac99897b7/pypi/download/onnxruntime-training-cpu/1.16.dev20230904001/onnxruntime_training_cpu-1.16.0.dev20230904001-cp311-cp311-macosx_11_0_arm64.whl (8.0 MB)
     ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 8.0/8.0 MB 1.8 MB/s eta 0:00:00
Requirement already satisfied: cerberus in /opt/homebrew/lib/python3.11/site-packages (from onnxruntime-training-cpu) (1.3.5)
Requirement already satisfied: flatbuffers in /opt/homebrew/lib/python3.11/site-packages (from onnxruntime-training-cpu) (23.5.26)
Requirement already satisfied: h5py in /opt/homebrew/lib/python3.11/site-packages (from onnxruntime-training-cpu) (3.9.0)
Requirement already satisfied: numpy>=1.16.6 in /opt/homebrew/lib/python3.11/site-packages (from onnxruntime-training-cpu) (1.25.2)
Requirement already satisfied: onnx in /opt/homebrew/lib/python3.11/site-packages (from onnxruntime-training-cpu) (1.14.1)
Requirement already satisfied: packaging in /opt/homebrew/lib/python3.11/site-packages (from onnxruntime-training-cpu) (23.1)
Requirement already satisfied: protobuf in /opt/homebrew/lib/python3.11/site-packages (from onnxruntime-training-cpu) (3.20.3)
Requirement already satisfied: sympy in /opt/homebrew/lib/python3.11/site-packages (from onnxruntime-training-cpu) (1.12)
Requirement already satisfied: setuptools>=41.4.0 in /opt/homebrew/lib/python3.11/site-packages (from onnxruntime-training-cpu) (68.1.2)
Requirement already satisfied: typing-extensions>=3.6.2.1 in /opt/homebrew/lib/python3.11/site-packages (from onnx->onnxruntime-training-cpu) (4.7.1)
Requirement already satisfied: mpmath>=0.19 in /opt/homebrew/lib/python3.11/site-packages (from sympy->onnxruntime-training-cpu) (1.3.0)
Installing collected packages: onnxruntime-training-cpu
Successfully installed onnxruntime-training-cpu-1.16.0.dev20230904001
$ python3 artifacts_gen.py
/opt/homebrew/lib/python3.11/site-packages/transformers/configuration_utils.py:380: UserWarning: Passing `gradient_checkpointing` to a config initialization is deprecated and will be removed in v5 Transformers. Using `model.gradient_checkpointing_enable()` instead, or if you are using the `Trainer` API, pass `gradient_checkpointing=True` in your `TrainingArguments`.
  warnings.warn(
/opt/homebrew/lib/python3.11/site-packages/transformers/models/wav2vec2/modeling_wav2vec2.py:595: TracerWarning: Converting a tensor to a Python boolean might cause the trace to be incorrect. We can't record the data flow of Python values, so this value will be treated as a constant in the future. This means that the trace might not generalize to other inputs!
  if attn_weights.size() != (bsz * self.num_heads, tgt_len, src_len):
/opt/homebrew/lib/python3.11/site-packages/transformers/models/wav2vec2/modeling_wav2vec2.py:634: TracerWarning: Converting a tensor to a Python boolean might cause the trace to be incorrect. We can't record the data flow of Python values, so this value will be treated as a constant in the future. This means that the trace might not generalize to other inputs!
  if attn_output.size() != (bsz * self.num_heads, tgt_len, self.head_dim):
================ Diagnostic Run torch.onnx.export version 2.0.1 ================
verbose: False, log level: Level.ERROR
======================= 0 NONE 0 NOTE 0 WARNING 0 ERROR ========================

2023-09-06 12:50:08,600 root [INFO] - Custom loss block provided: CustomCELoss
2023-09-06 12:50:08,611 root [DEBUG] - Building training block _TrainingBlock
2023-09-06 12:50:08,611 root [DEBUG] - Building block: CustomCELoss
2023-09-06 12:50:08,611 root [DEBUG] - Building block: CrossEntropyLoss
2023-09-06 12:50:09,480 root [DEBUG] - Building gradient graph for training block _TrainingBlock
2023-09-06 12:50:09.613202 [I:onnxruntime:Default, constant_sharing.cc:256 ApplyImpl] Total shared scalar initializer count: 499
2023-09-06 12:50:09.620483 [I:onnxruntime:Default, graph.cc:3556 CleanUnusedInitializersAndNodeArgs] Removing initializer 'ortshared_1_0_1_5'. It is no longer used by any node.
2023-09-06 12:50:09.620505 [I:onnxruntime:Default, graph.cc:3556 CleanUnusedInitializersAndNodeArgs] Removing initializer 'ortshared_1_0_1_0'. It is no longer used by any node.
2023-09-06 12:50:09.626150 [I:onnxruntime:Default, graph.cc:3556 CleanUnusedInitializersAndNodeArgs] Removing initializer 'ortshared_1_0_1_4'. It is no longer used by any node.
2023-09-06 12:50:09.626160 [I:onnxruntime:Default, graph.cc:3556 CleanUnusedInitializersAndNodeArgs] Removing initializer 'ortshared_1_0_1_1'. It is no longer used by any node.
2023-09-06 12:50:09.626164 [I:onnxruntime:Default, graph.cc:3556 CleanUnusedInitializersAndNodeArgs] Removing initializer 'ortshared_1_0_1_2'. It is no longer used by any node.
2023-09-06 12:50:09,664 root [DEBUG] - The loss output is onnx::loss::2. The gradient graph will be built starting from onnx::loss::2_grad.
2023-09-06 12:50:09,720 root [DEBUG] - Adding gradient accumulation nodes for training block _TrainingBlock
2023-09-06 12:50:09,799 root [INFO] - Saved training model to MyVoice/artifacts/training_model.onnx
2023-09-06 12:50:09,838 root [INFO] - Saved eval model to MyVoice/artifacts/eval_model.onnx
2023-09-06 12:50:10,709 root [INFO] - Saved checkpoint to MyVoice/artifacts/checkpoint
2023-09-06 12:50:10,709 root [INFO] - Optimizer enum provided: AdamW
2023-09-06 12:50:10,709 root [DEBUG] - Building forward block AdamW
2023-09-06 12:50:10,710 root [DEBUG] - Building block: AdamWOptimizer
2023-09-06 12:50:10,711 root [INFO] - Saved optimizer model to MyVoice/artifacts/optimizer_model.onnx
```
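
The log above reports four files saved under `MyVoice/artifacts`. A quick way to confirm they are all present before opening the app is sketched below; the helper name `missing_artifacts` is hypothetical, and the relative path assumes the check is run from the same directory as `artifacts_gen.py`.

```python
from pathlib import Path

# File names copied from the "Saved ... to MyVoice/artifacts/..." log lines.
EXPECTED_ARTIFACTS = (
    "training_model.onnx",
    "eval_model.onnx",
    "checkpoint",
    "optimizer_model.onnx",
)


def missing_artifacts(artifacts_dir):
    """Return the expected artifact names that do not exist in artifacts_dir."""
    directory = Path(artifacts_dir)
    return [name for name in EXPECTED_ARTIFACTS if not (directory / name).exists()]


if __name__ == "__main__":
    # Assumption: run from the directory where artifacts_gen.py was executed.
    missing = missing_artifacts("MyVoice/artifacts")
    if missing:
        print("Missing artifacts:", ", ".join(missing))
    else:
        print("All four artifacts are present.")
```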

Thank you, @baijumeswani! I will keep you updated on how trying out the app goes.