analogdevicesinc / ai8x-synthesis

Quantization and Synthesis (Device Specific Code Generation) for ADI's MAX78000 and MAX78002 Edge AI Devices
Apache License 2.0

Synthesis accuracy issues #298

Closed isztldav closed 1 year ago

isztldav commented 1 year ago

Issue: for the same input in Q8 mode [-128;127]: ai8x.set_device(87, True, False) (PyTorch prediction) =/= Synthesis (KAT)

## PyTorch step

args = Args(act_mode_8bit=True)
ai8x.set_device(87, True, False)  # True to simulate the device
checkpoint = torch.load('qat_best_q8.pth.tar')
state_dict = checkpoint['state_dict']
ai8x.fuse_bn_layers(model)
model.load_state_dict(state_dict, strict=True)
model = model.to(device)


- This code successfully simulates the device with full 8-bit integers; the output is in range [-128;127] and the accuracy **is** maintained.
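For what it's worth, a minimal sanity check of this property could look like the following (a hypothetical helper, not part of ai8x):

```python
import torch

# Hypothetical check: in act_mode_8bit the simulated model should emit
# integer-valued outputs bounded to the signed 8-bit range.
def check_q8_range(output: torch.Tensor) -> bool:
    """Return True if every value is an integer within [-128, 127]."""
    return bool(
        torch.all(output == output.round())
        and output.min().item() >= -128
        and output.max().item() <= 127
    )
```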

## Synthesis step
I observed that the synthesis step does not generate the KAT using PyTorch; instead, it runs the input through custom code. That code produces the expected final output, which the device then recomputes (KAT), and this test passes successfully. Therefore the device computes exactly what the synthesis tool expects.

## Issue encountered
The main problem arises when analyzing the KAT output. The synthesis produces an output that is worse by a large factor than the output (prediction) of PyTorch in Q8 mode (for the same input).
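For reference, this is roughly how the gap can be quantified (the arrays below are dummies; in practice one would come from the Q8 PyTorch simulation and the other from the values in sampleoutput.h):

```python
import numpy as np

# Dummy stand-ins for the real outputs (names and values assumed).
pytorch_out = np.array([12, -5, 127, -128, 0], dtype=np.int8)
kat_out = np.array([12, -4, 120, -128, 3], dtype=np.int8)

# Element-wise absolute difference (cast up to avoid int8 overflow)
# and the fraction of mismatching elements.
diff = np.abs(pytorch_out.astype(np.int32) - kat_out.astype(np.int32))
mismatch_rate = np.mean(pytorch_out != kat_out)
print(diff.max(), mismatch_rate)
```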

Was something similar ever observed?
I can provide more information if needed. Thank you for your help.
nikky4D commented 1 year ago

Were you able to resolve this issue? I'm noticing something similar in object detection models.

MaximGorkem commented 1 year ago

Hi,

Thanks for reporting the issue. Could you please give us some clarifications?

Do you notice pixel class prediction differences between simulate.py and PyTorch?

How do you test the PyTorch operation? Are you running the QAT model? Could you let me know if you follow the steps below for QAT models in your PyTorch test?

# Fuse the BN parameters into conv layers before Quantization Aware Training (QAT)
ai8x.fuse_bn_layers(model)

# Switch model from unquantized to quantized for QAT
ai8x.initiate_qat(model, qat_policy)

model = apputils.load_lean_checkpoint(model, checkpoint_path, model_device=device)
ai8x.update_model(model)
isztldav commented 1 year ago

Thank you for your reply @MaximGorkem .

In short

This is the expected output according to PyTorch in 8-bit mode:

(image: expected_predicted_pytorch_8bit)

Instead, the "sampleoutput.h" looks like this:

(image: sampleoutput)

In depth

All files required for you to replicate my setup are in: report_files.zip

Contents of the zip file:

Other notes:

When the code is built and tested on the device, this passes the KAT test.
So my feeling is that accuracy is lost somewhere, and thus the segmentation doesn't match that of PyTorch in 8-bit mode. This pretty much makes the device unusable for me, as I cannot obtain the expected accuracy.

ermanok commented 1 year ago

Hi,

Thanks for sharing the resources about the issue. I went over them, and it seems the issue is due to a minor problem in the yaml file. At Layer 7 (up3_3), the input sequence is given as [6, 3], but in the model definition (in max78_eval_unet.ipynb) the input of up3_3 is defined as 'torch.cat((conv2_2, up3), dim=1)'. Therefore, line 81 of the yaml file should be changed to 'in_sequences: [3, 6]'.

Note that this change alone is not enough: the 'output_processors' fields of Layer 3 and Layer 6 must be swapped, and the 'processors' field of Layer 4 must be changed accordingly.
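To make the ordering issue concrete, here is a minimal PyTorch sketch (shapes are made up, not taken from the actual UNet) of why the order in 'in_sequences' must match the torch.cat call:

```python
import torch

# Illustrative tensors only; the real conv2_2/up3 come from the UNet.
conv2_2 = torch.ones(1, 8, 4, 4)    # first operand of torch.cat in the model
up3 = torch.zeros(1, 8, 4, 4)       # second operand

correct = torch.cat((conv2_2, up3), dim=1)  # matches in_sequences: [3, 6]
swapped = torch.cat((up3, conv2_2), dim=1)  # what in_sequences: [6, 3] implies

# Same values overall, but channels 0..7 and 8..15 are exchanged, so the
# next layer's per-channel weights are applied to the wrong feature maps.
assert not torch.equal(correct, swapped)
```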

isztldav commented 1 year ago

Wow thank you so much for finding this! Indeed now it works. Thank you!