By removing the quant identity layer, things turned out fine.
Are we supposed to do this input quantization via manual pre-processing?
Please let me know if so (or if I misunderstood something, which is very likely).
Best regards.
Hi @0BAB1 ,
From an initial look, your observation is right. I assume you are trying to do the input quantization with a MultiThreshold layer, so the input to that layer is floating-point. We don't have support for this scenario yet in FINN and assume an integer input to the accelerator, which is most likely why you were not able to convert that layer. This is a scenario we're actively working on supporting, but for now the pre-processing would need to be done on the host. If you are working with image data, you might be able to do something similar to what we do for the image classification networks. You can check out the advanced builder settings tutorial for details (in the custom step section).
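For context, a minimal sketch of what host-side pre-processing could look like, assuming the Brevitas input quantizer used a fixed, known scale; the function name, the scale value, and the unsigned 8-bit range below are illustrative assumptions, not something FINN provides automatically:

import numpy as np

# Sketch of host-side input quantization (not an official FINN recipe).
# Assumes a fixed input scale `inp_scale` and an unsigned 8-bit target range.
def quantize_input_on_host(x_fp32: np.ndarray, inp_scale: float) -> np.ndarray:
    q = np.round(x_fp32 / inp_scale)   # scale and round onto the integer grid
    q = np.clip(q, 0, 255)             # saturate to the UINT8 range
    return q.astype(np.uint8)

# e.g. images normalised to [0, 1] with scale 1/255 map back to raw 0..255 pixels
x = np.random.rand(1, 1, 28, 28).astype(np.float32)
accel_in = quantize_input_on_host(x, inp_scale=1.0 / 255.0)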
Hello @auphelia, thanks for the response!
I see...
So does the model only expect to run on integers? Meaning we have to fully quantize models in Brevitas (as opposed to only quantizing the weights / inputs)?
Hello @auphelia,
After reading the FINN and FINN-R papers, I noticed this paragraph:
So if I use FINN in a case where I trained my model on FP32 data, will it automatically apply transformations so I can feed the same data quantized to INT8? Or should I train my model with INT8 from the start in Brevitas?
I am confused about these mixed data type usages and struggle to find resources on this (also, how is the quant-dequant handled for partially quantized models? What datatype flows through the model between layers: FP or INT?).
I am trying to deepen my understanding of this because I am implementing on an unsupported Zynq board in Vivado/Vitis, meaning I have to feed the correct datatypes during both training AND inference for this to work as expected.
Best regards
Update: after inference, the output does not match the labels at all. I will try a few things out.
Update 2:
Added preprocessing baked into the model like so:
from finn.util.pytorch import ToTensor
from finn.util.visualization import showInNetron
from finn.transformation.qonnx.convert_qonnx_to_finn import ConvertQONNXtoFINN
from qonnx.core.modelwrapper import ModelWrapper
from qonnx.core.datatype import DataType
from qonnx.transformation.merge_onnx_models import MergeONNXModels
from qonnx.util.cleanup import cleanup as qonnx_cleanup
from brevitas.export import export_qonnx
import torch

# starting point: the tidied model, which has no preprocessing yet
model = ModelWrapper("/tmp/finn_dev_rootmin/tidy.onnx")
global_inp_name = model.graph.input[0].name
ishape = model.get_tensor_shape(global_inp_name)
# preprocessing: torchvision's ToTensor divides uint8 inputs by 255
totensor_pyt = ToTensor()
export_qonnx(totensor_pyt, torch.randn(ishape), "/tmp/finn_dev_rootmin/preproc.onnx")
qonnx_cleanup("/tmp/finn_dev_rootmin/preproc.onnx", out_file="/tmp/finn_dev_rootmin/preproc.onnx")
pre_model = ModelWrapper("/tmp/finn_dev_rootmin/preproc.onnx")
pre_model = pre_model.transform(ConvertQONNXtoFINN())
# join preprocessing and core model
model = model.transform(MergeONNXModels(pre_model))
# add input datatype annotation: the host will send raw UINT8 pixel values
global_inp_name = model.graph.input[0].name
model.set_tensor_datatype(global_inp_name, DataType["UINT8"])
model.save("/tmp/finn_dev_rootmin/full_preproc.onnx")
showInNetron("/tmp/finn_dev_rootmin/full_preproc.onnx")
Inspired by: https://github.com/Xilinx/finn/blob/main/notebooks/end2end_example/bnn-pynq/tfc_end2end_example.ipynb
(the "adding preprocessing" section). I will keep you updated on how it works. I'm doing this because my C programs can only send UINT8 values, as that is all the stitched IP accepts.
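Before pushing it to hardware, a quick sanity check (paths reused from the snippet above) to confirm the merged model really is annotated with a UINT8 input:

from qonnx.core.modelwrapper import ModelWrapper
from qonnx.core.datatype import DataType

# reload the merged model and inspect the global input annotation
model = ModelWrapper("/tmp/finn_dev_rootmin/full_preproc.onnx")
inp_name = model.graph.input[0].name
print("input:", inp_name, "datatype:", model.get_tensor_datatype(inp_name))
assert model.get_tensor_datatype(inp_name) == DataType["UINT8"]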
After running inference on the Zynq PL, it turns out the output is pretty much random... I am now out of ideas about where this whole thing goes wrong.
Here is my Vivado block design, if relevant:
It turns out it actually runs well: after inspection with the ILA, everything behaves as expected; it's just that the outputs are really not the ones expected (~10% accuracy, meaning it's just lucky guesses, haha).
Anyway, looking forward to reading your insights on this situation. Best regards and have a good rest of your day.
Hello again,
After looking through the examples, I used Python verification to assess my model's behavior and accuracy at each step of the hardware layer conversion process. This allowed me to make the modifications needed to correct the model, finally making the inference work.
Here is the notebook that really helped me: https://github.com/Xilinx/finn/blob/main/notebooks/end2end_example/bnn-pynq/tfc_end2end_verification.ipynb
Even though verification might seem like the boring part, overlooking it cost me hours! Don't try this at home!
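In case it helps anyone else, here is roughly the kind of check this boils down to; treat it as a sketch: the saved test input and the second snapshot file name are placeholders, and finn.core.onnx_exec is what the verification notebook uses to execute the intermediate graphs:

import numpy as np
from qonnx.core.modelwrapper import ModelWrapper
import finn.core.onnx_exec as oxe

def run_snapshot(path, x):
    # execute one intermediate ONNX snapshot on a single input
    model = ModelWrapper(path)
    inp = model.graph.input[0].name
    out = model.graph.output[0].name
    return oxe.execute_onnx(model, {inp: x})[out]

# container dtype stays float32 even for UINT8-annotated data
x = np.load("/tmp/finn_dev_rootmin/test_input.npy").astype(np.float32)  # placeholder test sample
before = run_snapshot("/tmp/finn_dev_rootmin/full_preproc.onnx", x)
after = run_snapshot("/tmp/finn_dev_rootmin/streamlined.onnx", x)       # placeholder snapshot name
print("outputs still match:", np.allclose(before, after))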
Hi @0BAB1, really happy to hear that! You might also want to have a look at this: we have verification integrated into the builder abstraction as well; this notebook shows it in one of its sections: https://github.com/Xilinx/finn/blob/dev/notebooks/advanced/4_advanced_builder_settings.ipynb
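A rough sketch of how those verification steps plug into the builder config, with field names as in finn.builder.build_dataflow_config; the output directory, FPGA part, clock period, and .npy file names below are placeholders:

import finn.builder.build_dataflow as build
import finn.builder.build_dataflow_config as build_cfg

cfg = build_cfg.DataflowBuildConfig(
    output_dir="output_build",                  # placeholder output directory
    synth_clk_period_ns=10.0,                   # placeholder clock period
    fpga_part="xc7z020clg400-1",                # placeholder Zynq-7020 part
    # run Python/cppsim verification after selected build steps
    verify_steps=[
        build_cfg.VerificationStepType.QONNX_TO_FINN_PYTHON,
        build_cfg.VerificationStepType.STREAMLINED_PYTHON,
        build_cfg.VerificationStepType.FOLDED_HLS_CPPSIM,
    ],
    verify_input_npy="input.npy",                      # one test input saved as .npy
    verify_expected_output_npy="expected_output.npy",  # its reference output
    generate_outputs=[build_cfg.DataflowOutputType.STITCHED_IP],
)
build.build_dataflow_cfg("/tmp/finn_dev_rootmin/full_preproc.onnx", cfg)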
Quick summary
Running this code
on this model does not convert the MultiThreshold into a Thresholding layer:
Things I tried :
Running
model = model.transform(to_hw.InferThresholdingLayer())
as the first transformation, before all the others: all MultiThresholds in the model were indeed converted to HW layers, except the problematic first one... Here are the first MultiThreshold layer's attributes:
My model
As I'm still trying to figure things out and tinker around, here is the simple model I'm trying this on:
It is fully quantized; my other "not fully quantized" (weights-only) models ran better, so maybe the QuantIdentity quantizer is the problem? I don't know.
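For anyone hitting the same thing: based on @auphelia's explanation above that the conversion assumes an integer input, a small diagnostic sketch is to print the FINN datatype of each MultiThreshold's input; the node whose input is still floating-point is the one being skipped (path reused from the earlier snippet):

from qonnx.core.modelwrapper import ModelWrapper

model = ModelWrapper("/tmp/finn_dev_rootmin/tidy.onnx")   # path reused from above
for node in model.graph.node:
    if node.op_type == "MultiThreshold":
        idt = model.get_tensor_datatype(node.input[0])
        print(node.name, "input datatype:", idt, "| integer:", idt.is_integer())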