onnx / onnx

Open standard for machine learning interoperability
https://onnx.ai/
Apache License 2.0
17.22k stars 3.62k forks

Discussion on the Correct Approach for Converting ONNX Models to Other Frameworks #6142

Open kumarutkarsh1248 opened 1 month ago

kumarutkarsh1248 commented 1 month ago

I am working on creating an ONNX converter for mlpack (a machine learning library). My current ONNX-to-mlpack converter works for some simple linear and convolutional models (https://github.com/kumarutkarsh1248/onnx_mlpack_translator). Basically, the converter iterates through the nodes of the graph in topological order, extracts the attributes of each node, and, while doing so, adds layers to the mlpack model one by one with all of their attributes mapped.

This overall approach works fine for simple models with no side branches, but the converter fails on ONNX models whose graphs contain side branches off a node or more complex connectivity between nodes.

There are also significant differences in the graph across different ONNX opset versions of the same model.

I don’t know how to deal with these issues, and I can’t find any documentation or reference on how machine learning frameworks should build their converters. If you could provide any such developer documentation or other resources, it would help me a lot.

xadupre commented 1 month ago

You can read https://onnx.ai/onnx/intro/converters.html to find some inspiration. Most converting libraries include the three following components:

This page gives more details about a dummy exporter I made to convert torch models into ONNX: https://sdpython.github.io/doc/experimental-experiment/dev/design/exporter.html. It follows this design.

About opsets, you have two choices: either a single converting function produces different sequences of nodes depending on the opset, or there are two functions for two opsets and the interpreter calls the right one based on the opset.
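The second option can be sketched as a small registry keyed by op type and "since" opset, with the interpreter picking the newest registered function that does not exceed the model's opset. All names here are hypothetical; the Squeeze example reflects a real opset change (its `axes` moved from a node attribute to a tensor input in opset 13).

```python
# Per-opset converter registry (hypothetical names, sketch only).
CONVERTERS = {}

def register(op_type, since_opset):
    """Register fn as the converter for op_type from since_opset onward."""
    def deco(fn):
        CONVERTERS.setdefault(op_type, []).append((since_opset, fn))
        CONVERTERS[op_type].sort(key=lambda pair: pair[0])
        return fn
    return deco

def dispatch(op_type, model_opset):
    """Pick the newest converter whose since-opset <= the model's opset."""
    candidates = [(v, fn) for v, fn in CONVERTERS[op_type] if v <= model_opset]
    if not candidates:
        raise ValueError(f"no converter for {op_type} at opset {model_opset}")
    return candidates[-1][1]

@register("Squeeze", since_opset=1)
def squeeze_v1(node):       # axes stored as a node attribute
    return "squeeze-from-attribute"

@register("Squeeze", since_opset=13)
def squeeze_v13(node):      # axes became a tensor input in opset 13
    return "squeeze-from-input"

print(dispatch("Squeeze", 11)(None))  # squeeze-from-attribute
print(dispatch("Squeeze", 17)(None))  # squeeze-from-input
```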

cbourjau commented 1 month ago

We have built our own internal converter library based on Spox (which we have also built). It has proved very maintainable over ~2 years now, and we build complex graphs with thousands of nodes across various projects. Spox does not have any explicit containers holding your state - instead, you simply describe your inference code using lazy objects that build up the required graph in the background. It ends up looking almost like regular NumPy code. I'd recommend taking a look at the respective section of the docs: https://spox.readthedocs.io/en/latest/guides/converter.html
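The lazy-object idea can be mimicked in a few lines of plain Python. To be clear, this toy is not Spox's actual API (see the linked docs for that); it only shows the mechanism: arithmetic on placeholder values records ONNX-style nodes into a trace in the background, so converter code reads like NumPy code.

```python
# Toy sketch of lazy graph building (not Spox's real API).
import itertools

class Var:
    _names = itertools.count()

    def __init__(self, name=None, trace=None):
        self.name = name or f"t{next(Var._names)}"
        self.trace = trace if trace is not None else []

    def _binop(self, op_type, other):
        # Record an ONNX-style (op, inputs, outputs) node, return the result.
        out = Var(trace=self.trace)
        self.trace.append((op_type, [self.name, other.name], [out.name]))
        return out

    def __add__(self, other): return self._binop("Add", other)
    def __mul__(self, other): return self._binop("Mul", other)

trace = []
a, b = Var("a", trace), Var("b", trace)
c = a * a + b          # reads like NumPy; builds the graph lazily
print([op for op, _, _ in trace])  # ['Mul', 'Add']
```

In a real library the trace would be turned into an `onnx.GraphProto` at the end, with the user never touching an explicit graph container.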