Xilinx / Vitis-AI

Vitis AI is Xilinx’s development stack for AI inference on Xilinx hardware platforms, including both edge devices and Alveo cards.
https://www.xilinx.com/ai
Apache License 2.0

Partial Quantization #1456

Open tobgen179 opened 1 month ago

tobgen179 commented 1 month ago

Hello everyone,

I want to use partial quantization in my model, since it has a special form of preprocessing built into it. I found the documentation in UG1414, but some information is missing.

I have the following model:

def forward(self, x):

       # Float-only preprocessing; everything before the QuantStub stays unquantized.
       x = self.preprocessing_layer(x)

       # QuantStub/DeQuantStub are modules: instantiate them in __init__
       # (e.g. self.quant = QuantStub()) and call them on the tensor here.
       x = self.quant(x)      # quantization starts at this boundary

       x = self.layer1(x)
       x = self.layer2(x)
       ...

       x = self.dequant(x)    # back to float

       return x

After compilation I can observe that the .xmodel only contains the graph starting from the QuantStub; the documentation stops at this step. How can I now include preprocessing_layer(x) in the deployment? Is there a way to run it on the hardware directly using the graph runner? Or do I have to register preprocessing_layer() as a custom operator and include it in the quantization (although I would really like to use QAT)?
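For context on what the two stubs delimit: everything before the QuantStub runs in plain float, and everything between QuantStub and DeQuantStub is trained against fake-quantized (rounded-and-clamped) values during QAT. A minimal pure-Python sketch of that idea, assuming a made-up int8 scale of 0.05 and hypothetical preprocessing/layer arithmetic (illustrative only, not the Vitis AI implementation):

```python
def fake_quant(x, scale=0.05, zero_point=0, qmin=-128, qmax=127):
    """Round x onto an int8 grid, clamp, and map it back to float (QAT-style)."""
    q = round(x / scale) + zero_point
    q = max(qmin, min(qmax, q))          # clamp to the int8 range
    return (q - zero_point) * scale

def preprocessing_layer(x):
    """Stays in float: runs before the QuantStub, so it is never quantized."""
    return x * 2.0 + 0.1                 # hypothetical preprocessing math

def quantized_backbone(x):
    """Everything between the stubs only ever sees values on the quantized grid."""
    x = fake_quant(x)                    # QuantStub boundary: float -> int8 grid
    x = fake_quant(x + 0.07)             # a "layer" whose output is re-quantized
    return x                             # DeQuantStub boundary: plain float again

out = quantized_backbone(preprocessing_layer(1.234))
```

This is why the compiled .xmodel begins at the QuantStub: only the region between the stubs has the quantization parameters the compiler needs, while the preprocessing remains an ordinary float computation that has to be handled separately.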

Thank you in advance!