gizatechxyz / orion

ONNX Runtime in Cairo 1.0 for verifiable ML inference using STARK
https://orion.gizatech.xyz
MIT License
163 stars 81 forks

feat: qlinear conv #591

Closed chachaleo closed 7 months ago

chachaleo commented 8 months ago

Pull Request type

What is the current behavior:

Issue Number: #128

I implemented this the same way as the other quantization operators. Most of the tests I wrote show a small error (max|error| = 1), though that can be significant for the I8 type. Should I investigate this further?
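
For context on the off-by-one error, here is a minimal Python sketch of one possible cause. It is not Orion code and not a diagnosis of this PR. It shows that the rounding mode used in the requantization step can, on its own, shift an int8 result by exactly one. The function name `requantize`, the `rounding` flag, and the multiplier value are hypothetical and chosen only for illustration.

```python
def requantize(acc, multiplier, zero_point, rounding, qmin=-128, qmax=127):
    """Map an integer accumulator back to int8 (hypothetical helper).

    multiplier stands in for x_scale * w_scale / y_scale.
    """
    scaled = acc * multiplier
    if rounding == "nearest":
        q = round(scaled)   # Python rounds half to even
    else:
        q = int(scaled)     # truncation toward zero
    # Saturate to the int8 range after adding the zero point.
    return max(qmin, min(qmax, q + zero_point))


multiplier = 0.0625  # 1/16, exactly representable as a float

# Compare the two rounding modes over a range of accumulator values.
max_diff = max(
    abs(requantize(a, multiplier, 0, "nearest")
        - requantize(a, multiplier, 0, "truncate"))
    for a in range(-4096, 4097)
)
print(max_diff)
```

If the reference output is produced with round-to-nearest and the implementation truncates (or the reverse), a maximum absolute error of 1 is what this sketch would predict.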

Also, for each quantization operator, we have X of quantized type Q, and the scale and zero point of dequantized type T. In ONNX, X and the zero point have type Q, and the scale has type T. Is this an implementation choice, or should it be changed to be closer to the ONNX interface? I can open an issue on closing the gap between the ONNX and Orion implementations of the quantization operators if you think there is a need for that :)
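
To make the type question concrete, here is a small Python sketch of the two conventions. Python `int` and `float` are stand-ins for Q and T only. Orion's actual types are fixed-point, and the helper `dequantize` is hypothetical.

```python
def dequantize(q, scale, zero_point):
    """Standard affine dequantization: (q - zero_point) * scale."""
    return (q - zero_point) * scale


# ONNX-style: the zero point shares the quantized (integer) type of X.
onnx_style = dequantize(10, 0.5, 3)

# Style described above: the zero point is stored in the dequantized type.
orion_style = dequantize(10, 0.5, 3.0)

print(onnx_style, orion_style)
```

In this sketch the two conventions give the same value when the zero point is integer-valued, so the difference is in the interface rather than in the arithmetic. Whether that also holds for the fixed-point types is not something this sketch shows.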