jafioti / luminal

Deep learning at the speed of light.
https://luminalai.com
Apache License 2.0

starting attempt at onnx support #49

Open skewballfox opened 5 months ago

skewballfox commented 5 months ago

Very much a draft, and the code currently doesn't compile. Would address #32.

I figured that to kickstart ONNX support I'd copy the parts of burn's import library that would be needed here as well. I'm starting with graph_io, given the logic will likely be exactly the same save for using luminal types (which, admittedly, I'm still learning).

Graph_io stores initializers and node input/output arguments, and maps the original names to new ones. At least with burn, it turned out to be better to handle all of those in one containing struct, so that values and names for the associated nodes can be updated in one place.
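Roughly, I'm picturing something like the following sketch of that containing struct (hypothetical names, not burn's actual types):

```rust
use std::collections::HashMap;

// Sketch of a graph_io-style container (hypothetical names): one struct owns
// the initializers, each node's input/output arguments, and the mapping from
// original ONNX names to renamed ones, so updates happen in one place.
struct GraphIo {
    /// Initializer tensors keyed by their (renamed) name.
    initializers: HashMap<String, TensorData>,
    /// Input/output arguments for each node, keyed by node name.
    node_args: HashMap<String, NodeArgs>,
    /// Map from the original ONNX name to the new name used internally.
    renames: HashMap<String, String>,
}

struct NodeArgs {
    inputs: Vec<String>,
    outputs: Vec<String>,
}

/// Placeholder for whatever tensor representation the importer settles on.
struct TensorData {
    shape: Vec<usize>,
    bytes: Vec<u8>,
}

impl GraphIo {
    /// Resolve a name through the rename map, falling back to the original.
    fn resolve<'a>(&'a self, original: &'a str) -> &'a str {
        self.renames
            .get(original)
            .map(String::as_str)
            .unwrap_or(original)
    }
}
```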

The logic will likely diverge a bit more once this gets to handling the nodes themselves, but the skeleton will likely be the same. This currently doesn't have any test data; I'm guessing you want to do something similar to burn and generate the test models on the Python side, or did you have something else in mind for testing op support?

I'm hoping that most of the shared logic could eventually be split out into a new crate that isn't tied to luminal or burn: even if the logic for node handling is pretty different, there's a lot of shared boilerplate and code generation just getting to that point and testing the results.

jafioti commented 5 months ago

This is amazing work! I don't know too much about ONNX or the file format. My understanding is that they're all the same format, though; are there dialects or special versions? Is it all in protobuf?

If possible I'd prefer not to have any build-time commands running, since they can often lead to failed builds (see CI).

To your knowledge, are there any ONNX parsing crates out there? Most of the ones I see are bundled with an inference engine (an obvious no-go for us).

Once the file is read, I assume it's just a matter of running a function to convert the onnx nodes to luminal nodes.

Btw, if you're on the Discord, DM me and we can chat about this. Great work!

skewballfox commented 5 months ago

My understanding is that they're all the same format, though; are there dialects or special versions? Is it all in protobuf?

Yeah, the ONNX spec specifies the operations that are supported (by ONNX), and it's versioned so that arguments can be updated over time. The current version of the supported op list is here. Occasionally the operators change, and that can complicate both support and generating test data. As an example, the initial version of Unsqueeze didn't take an argument, but the latest versions can take a vector of indices where dimensions will be added (this is just syntactic sugar for reshape).
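To make the reshape-sugar point concrete, here's a hypothetical lowering helper (not luminal or burn code) that turns an Unsqueeze into the equivalent reshape by inserting size-1 dims at the requested output axes:

```rust
// Hypothetical helper illustrating why Unsqueeze is reshape sugar: insert
// size-1 dimensions at the requested axes (axes index into the *output*
// shape and are assumed non-negative and sorted ascending here).
fn unsqueeze_to_reshape(shape: &[usize], axes: &[usize]) -> Vec<usize> {
    let mut out = shape.to_vec();
    for &axis in axes {
        out.insert(axis, 1);
    }
    out
}

fn main() {
    // [2, 3] unsqueezed at axis 1 is just a reshape to [2, 1, 3].
    assert_eq!(unsqueeze_to_reshape(&[2, 3], &[1]), vec![2, 1, 3]);
    // Multiple axes: [2, 3] with axes [0, 3] becomes [1, 2, 3, 1].
    assert_eq!(unsqueeze_to_reshape(&[2, 3], &[0, 3]), vec![1, 2, 3, 1]);
}
```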

The .proto file is used to generate the code that reads ONNX model files into language objects, but those are basically just structs wrapping string data; a lot of processing is still needed after that.
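For concreteness, the build-time half of that is usually just a few lines in build.rs. A minimal sketch, assuming prost-build and a vendored copy of onnx.proto (the exact path is up to the crate layout):

```rust
// build.rs -- minimal sketch of the codegen step, assuming prost-build and a
// vendored onnx.proto; the generated structs (ModelProto, GraphProto,
// NodeProto, ...) just mirror the protobuf messages and still need processing.
fn main() -> Result<(), Box<dyn std::error::Error>> {
    prost_build::compile_protos(&["src/protos/onnx.proto"], &["src/protos/"])?;
    Ok(())
}
```

The generated module is then typically pulled in with `include!(concat!(env!("OUT_DIR"), "/onnx.rs"))`, assuming the proto's package is `onnx`.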

If possible I'd prefer not to have any build-time commands running, since they can often lead to failed builds (see CI).

The current build errors are due to this code not being functional; there are some burn types that need to be replaced with their luminal equivalents. But because of the code generation, the build-time commands are kind of necessary.

Plus, there are two approaches to take with the models. The first, the one burn took, is to generate the Rust code from the models at build time; the second is runtime support, which I think tract may have pulled off. Not sure whether candle is build-time, runtime, or both.

To your knowledge, are there any ONNX parsing crates out there? Most of the ones I see are bundled with an inference engine (an obvious no-go for us).

No, unfortunately. But if you look at the code for burn, tract, and candle, they all have basically the same structure and components. For burn, and I assume tract, the test models are single ops that were generated in Python for the purpose of testing support, followed by boilerplate Rust code for testing the op. Each of them uses prost or protobuf-codegen to build the file reader, then parses the initializers, inputs, and outputs, then the nodes, and slowly builds the model using their own ops, types, and tensor implementations.
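To show the shared shape of that pipeline, here's a minimal runtime-flavored sketch. It assumes prost-generated types from the proto3 variant of the ONNX schema (so scalar string fields come out as plain String) included as an `onnx` module; `register_initializer` and `lower_node` are hypothetical placeholders, not luminal APIs.

```rust
use prost::Message;
// Assumes the prost-generated module from onnx.proto has been included as `onnx`.
use crate::onnx::{ModelProto, NodeProto, TensorProto};

// Sketch of the common importer shape: decode the protobuf, walk the
// initializers and graph inputs/outputs, then walk the nodes and dispatch on
// op_type to build ops in the target framework (luminal here).
fn import(bytes: &[u8]) -> Result<(), Box<dyn std::error::Error>> {
    let model = ModelProto::decode(bytes)?;
    let graph = model.graph.ok_or("model has no graph")?;

    // 1. Initializers: weight tensors baked into the file.
    for init in &graph.initializer {
        register_initializer(init);
    }

    // 2. Graph-level inputs/outputs give the model's external interface.
    for input in &graph.input {
        println!("graph input: {}", input.name);
    }

    // 3. Nodes: dispatch on op_type and wire inputs to outputs.
    for node in &graph.node {
        lower_node(node)?;
    }
    Ok(())
}

fn register_initializer(_tensor: &TensorProto) {
    // Convert raw_data / typed fields into the target tensor type here.
}

fn lower_node(node: &NodeProto) -> Result<(), Box<dyn std::error::Error>> {
    match node.op_type.as_str() {
        "Add" | "Mul" | "MatMul" => { /* map to the corresponding luminal op */ }
        other => return Err(format!("unsupported op: {other}").into()),
    }
    Ok(())
}
```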

Once the file is read, I assume it's just a matter of running a function to convert the onnx nodes to luminal nodes.

Yeah, though it gets more complicated if your library requires explicit dimensions or you're implementing runtime support.