The rules for quantization always seem to live on the model's input and output tensors. For the hello world example we added methods you can call to quantize float32 to int8 and to dequantize int8 back to float32, but could we apply this transform automatically?
In the microlite C code, when we get or set the value of a tensor, we know both the tensor type and the micropython object type, so we could try to quantize these values automatically.
We might want to add a switch for this feature that can be set when the interpreter is created.
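As a rough illustration of what the automatic transform would do, here is a minimal sketch of affine (de)quantization in Python. The function names, parameter names, and the final interpreter flag are hypothetical, not microlite's actual API; the `scale` and `zero_point` values would come from the tensor's quantization parameters in the model.

```python
def quantize(value_f32, scale, zero_point):
    """Map a float32 value to int8 using the tensor's quantization params.

    Sketch only: in practice this would run inside the C getter/setter,
    using the scale and zero_point stored on the tensor.
    """
    q = round(value_f32 / scale) + zero_point
    # Clamp to the int8 range.
    return max(-128, min(127, q))

def dequantize(value_i8, scale, zero_point):
    """Map an int8 value back to float32 (the inverse transform)."""
    return (value_i8 - zero_point) * scale
```

If the feature were gated by a flag at interpreter creation (e.g. a hypothetical `auto_quantize=True` keyword), setting a float on an int8 input tensor would call `quantize` under the hood, and reading an int8 output tensor would return the `dequantize`d float.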