microsoft / onnxruntime

ONNX Runtime: cross-platform, high performance ML inferencing and training accelerator
https://onnxruntime.ai
MIT License

[Training] [Windows] #19965

Open Positronx opened 5 months ago

Positronx commented 5 months ago

Describe the issue

OS: Windows 10

Is there a way to generate training artifacts in C++ without having to use the Python utilities? I took a look at the source code and I think that it is possible. I'm just having a hard time linking the necessary header files related to generating the artifacts.

To reproduce

#include "orttraining/training_api/checkpoint.h"

I can't even compile an empty program containing the above header, even though I linked the .lib files and included the required headers. The error says that the file 'onnx/onnx..pb.h' can't be opened.

Urgency

No response

ONNX Runtime Installation

Built from Source

ONNX Runtime Version or Commit ID

1.18.0

PyTorch Version

2.2.0

Execution Provider

Default CPU

Execution Provider Library Version

No response

baijumeswani commented 5 months ago

Is there a way to generate training artifacts in C++ without having to use the Python utilities? I took a look at the source code and I think that it is possible. I'm just having a hard time linking the necessary header files related to generating the artifacts.

Generating the training artifacts is currently not supported through C++ and requires our Python utilities.

May I ask why you would like to generate the training artifacts from C++?

Positronx commented 5 months ago

I have a graphical framework written in C that loads functions compiled into DLL files. I want to be able to generate the artifacts directly from the graphical framework without needing a third-party function from Python. Unfortunately, I can't communicate with Python, and C++ is the best I can do since it is the closest thing to C.
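For readers with a similar setup, the DLL boundary described here could look roughly like the sketch below: a C++ implementation hidden behind an `extern "C"` function that a C host can load, with exceptions converted into status codes. Every name in it (`artifact_generate`, `ArtifactStatus`, `GenerateImpl`, `CopyError`) is hypothetical and not part of ONNX Runtime, and the body is only a placeholder, since artifact generation itself is not available through C++.

```cpp
#include <cstddef>
#include <cstring>
#include <exception>
#include <stdexcept>
#include <string>

// Hypothetical status codes for the C-facing API (not part of ONNX Runtime).
enum ArtifactStatus : int {
  ARTIFACT_OK = 0,
  ARTIFACT_BAD_ARGUMENT = 1,
  ARTIFACT_FAILED = 2
};

// Export with C linkage so a C host can resolve the symbol by name.
#if defined(_WIN32)
#define ARTIFACT_API extern "C" __declspec(dllexport)
#else
#define ARTIFACT_API extern "C"
#endif

namespace {

// Copy a message into a caller-owned buffer, always null-terminating it.
void CopyError(const char* message, char* buf, std::size_t size) {
  if (buf == nullptr || size == 0) return;
  std::strncpy(buf, message, size - 1);
  buf[size - 1] = '\0';
}

// Placeholder for the real C++ work. It only validates its inputs;
// the actual artifact generation is an assumption left unimplemented.
void GenerateImpl(const std::string& model_path, const std::string& output_dir) {
  if (model_path.empty()) throw std::invalid_argument("model path is empty");
  if (output_dir.empty()) throw std::invalid_argument("output directory is empty");
}

}  // namespace

// C-callable entry point: no C++ types in the signature, and no exception
// is allowed to cross the DLL boundary into the C host.
ARTIFACT_API int artifact_generate(const char* model_path, const char* output_dir,
                                   char* error_buf, std::size_t error_buf_size) {
  if (model_path == nullptr || output_dir == nullptr) {
    CopyError("null path argument", error_buf, error_buf_size);
    return ARTIFACT_BAD_ARGUMENT;
  }
  try {
    GenerateImpl(model_path, output_dir);
    return ARTIFACT_OK;
  } catch (const std::exception& e) {
    CopyError(e.what(), error_buf, error_buf_size);
    return ARTIFACT_FAILED;
  } catch (...) {
    CopyError("unknown error", error_buf, error_buf_size);
    return ARTIFACT_FAILED;
  }
}
```

The C host would then declare `int artifact_generate(const char*, const char*, char*, size_t);` and look it up with `GetProcAddress`. Keeping only plain C types in the signature is what lets a C framework call into the C++ side.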

eric-vision-e commented 5 months ago

Hi @Positronx

I'm looking for the same thing. Have you found a way to generate artifacts with C++?

Thanks

Positronx commented 5 months ago

Hello @eric-vision-e, I managed to generate a checkpoint file using the function SaveCheckpoint defined in the file orttraining/training_api/checkpoint.h. Beyond that, I'm struggling to use the class OrtModuleGraphBuilder, defined in orttraining/core/framework/ortmodule_graph_builder.h, to generate the gradient graph (which I assume is related to training_model.onnx).

eric-vision-e commented 5 months ago

Hi @Positronx,

Thanks for your answer. But as far as I understand, the checkpoint only holds weight values. For example, say I have a classification model already deployed with 2 classes and, for some reason, I want to add another class. The only way is to export the model again in Python and then retrain in C++ with new data (old classes + the new one), because in this case my model architecture has changed.

Am I correct? Have you figured out how to handle this scenario using only C++?

Thanks

Positronx commented 5 months ago

Hi @eric-vision-e

The checkpoint is only for weight values (and other metadata such as optimizer momentums, but that doesn't concern me yet). If I understand your problem correctly, you want to change the training_model architecture without having to resort to Python. What I'm looking for aligns with that. As far as I can tell, this will require (partially) rewriting the Python libraries that call onnxruntime_pybind11_state. Unfortunately, the training model isn't as straightforward as the checkpoint file.

github-actions[bot] commented 4 months ago

This issue has been automatically marked as stale due to inactivity and will be closed in 30 days if no further activity occurs. If further support is needed, please provide an update and/or more details.