Xilinx / finn

Dataflow compiler for QNN inference on FPGAs
https://xilinx.github.io/finn
BSD 3-Clause "New" or "Revised" License

Error in parent_model = model.transform(CreateDataflowPartition()) #936

Open Ba1tu3han opened 11 months ago

Ba1tu3han commented 11 months ago

Quick summary

I trained the example model from the Brevitas website on the MNIST dataset, then imported it into the FINN example notebook "tfc-end2end-example" (attached). The line parent_model = model.transform(CreateDataflowPartition()) raises an error in the "Creating a Dataflow Partition" part. The full error is shown in the screenshots below.

I'm using the current dev branch.

Additional context

Attachments: three screenshots (not preserved) and 4b_custom_model.zip

Could you help me to solve it?

Best Regards,

iksnagreb commented 11 months ago

You probably have layers in the middle of your model which are not converted to HLS layers. Could you please share your ONNX graph right before running the CreateDataflowPartition transformation?
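One way to check this yourself before sharing the graph is to list the nodes that were not converted. The following is only a minimal sketch, not FINN's own check: the helper name `find_non_dataflow_nodes` is hypothetical, and it assumes that converted layers carry an ONNX domain starting with `finn.custom_op.fpgadataflow`. The stand-in nodes are made up for the demo.

```python
from types import SimpleNamespace

# Assumption: converted FINN hardware layers use a domain with this prefix.
HW_DOMAIN_PREFIX = "finn.custom_op.fpgadataflow"


def find_non_dataflow_nodes(nodes, hw_domain_prefix=HW_DOMAIN_PREFIX):
    """Return (index, name, op_type) for every node outside the hardware domain."""
    return [
        (index, node.name, node.op_type)
        for index, node in enumerate(nodes)
        if not (node.domain or "").startswith(hw_domain_prefix)
    ]


# Made-up stand-ins for ONNX nodes; only name, op_type and domain are used.
example_nodes = [
    SimpleNamespace(name="Thresholding_0", op_type="Thresholding_Batch",
                    domain="finn.custom_op.fpgadataflow"),
    SimpleNamespace(name="QuantizeLinear_0", op_type="QuantizeLinear", domain=""),
    SimpleNamespace(name="MVAU_0", op_type="MatrixVectorActivation",
                    domain="finn.custom_op.fpgadataflow"),
]

leftover = find_non_dataflow_nodes(example_nodes)
for index, name, op_type in leftover:
    print(f"node {index}: {name} ({op_type}) is not a dataflow layer")
```

With a real model, the node list would be `model.graph.node` of the ModelWrapper. Unconverted nodes at the very start or end of the graph can be expected; the ones sitting between converted layers are the ones to look at.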

Briefly looking into your model graph (that is probably directly after export, right?), I suspect it might be due to exporting to QCDQ format instead of QONNX format. The QuantizeLinear and DequantizeLinear layers will not be turned into FINN MultiThresholds. If that is the case, you have two options: Either directly export to QONNX format via Brevitas, or you could try running the QCDQToQuant transformation from the QONNX package before doing the conversion to HLS layers.
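The second option could look roughly like the sketch below. It assumes QCDQToQuant is importable from `qonnx.transformation.qcdq_to_qonnx` in your installed QONNX version; the helper names `uses_qcdq` and `convert_qcdq_to_qonnx` are made up for illustration, and the qonnx imports sit inside the function so the detection helper also works without the package installed.

```python
# Operator types that indicate a QCDQ-style export.
QCDQ_OP_TYPES = {"QuantizeLinear", "DequantizeLinear"}


def uses_qcdq(op_types):
    """Return True if any operator type belongs to the QCDQ representation."""
    return any(op_type in QCDQ_OP_TYPES for op_type in op_types)


def convert_qcdq_to_qonnx(in_path, out_path):
    """Load an ONNX file and, if it contains QCDQ nodes, convert them to Quant nodes."""
    # Assumption: these import paths match the installed qonnx package.
    from qonnx.core.modelwrapper import ModelWrapper
    from qonnx.transformation.qcdq_to_qonnx import QCDQToQuant

    model = ModelWrapper(in_path)
    if uses_qcdq(node.op_type for node in model.graph.node):
        model = model.transform(QCDQToQuant())
    model.save(out_path)
    return model


# The detection helper can be tried on plain lists of operator names.
print(uses_qcdq(["Conv", "QuantizeLinear", "DequantizeLinear"]))
print(uses_qcdq(["Quant", "MatMul", "MultiThreshold"]))
```

The converted file would then go through the usual QONNX-to-FINN conversion and streamlining steps of the notebook.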

Furthermore, as your model is a convolutional model, you should probably follow the cnv_end2end_example instead of the tfc_end2end_example, as the cnv variant contains steps to properly handle the convolution operations by lowering them to matmuls.

Ba1tu3han commented 11 months ago

Hello Dear Christoph,

Thank you for your reply.

I will share the ONNX file you mentioned when I have access to my computer later today.

I could not find how to export directly to QONNX via Brevitas. I only found the export_onnx_qcdq() function on the Getting Started page of the Brevitas documentation. Could you point me to any documentation on exporting?

Additionally, thank you for your guidance on the example notebooks.

Have a good day!

iksnagreb commented 11 months ago

The QONNX export is from brevitas.export import export_qonnx, but you could just try the conversion pathway from QCDQ as well. You might also run into similar issues as the referenced #938, if you are using the bnn example notebooks for other than binarized models, as in your case 4-bit - not sure whether you have already adapted the notebook to account for this, just be cautious.
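As a rough sketch of the direct export path (not a definitive recipe): the function name `export_and_prepare` is hypothetical, the keyword names passed to export_qonnx are an assumption and may differ between Brevitas versions, and the cleanup and conversion imports follow what the FINN example notebooks used at the time.

```python
def export_and_prepare(model, dummy_input, export_path):
    """Export a trained Brevitas model to QONNX and convert it for FINN."""
    # Imports are inside the function so this file loads without the packages.
    from brevitas.export import export_qonnx
    from qonnx.core.modelwrapper import ModelWrapper
    from qonnx.util.cleanup import cleanup as qonnx_cleanup
    from finn.transformation.qonnx.convert_qonnx_to_finn import ConvertQONNXtoFINN

    # Assumption: 'args' and 'export_path' are accepted by the installed Brevitas.
    export_qonnx(model, args=dummy_input, export_path=export_path)
    # Tidy the exported graph in place (shape inference, constant folding).
    qonnx_cleanup(export_path, out_file=export_path)
    # Convert QONNX Quant nodes into the representation FINN works with.
    return ModelWrapper(export_path).transform(ConvertQONNXtoFINN())
```

Here `dummy_input` would be a tensor with the input shape of your network, for example `torch.randn(1, 1, 28, 28)` for MNIST.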

Ba1tu3han commented 11 months ago

I thought the problem was with my model, so I replaced it with the Brevitas FC model and exported it with export_qonnx as you suggested. I imported the QONNX file directly into the example notebook.

I set the weight, activation and initial bit widths to 1 bit as you recommended in #938, and that solved it.

With model = model.transform(to_hls.InferQuantizedMatrixVectorActivation("decoupled")), it is also solved for the other bit-width combinations, except for 1w1a. However, I now have another problem.

AssertionError: IPGen failed: /tmp/finn_dev_ba/code_gen_ipgen_StreamingDataflowPartition_1_Thresholding_Batch_0_hifo_j1w/project_StreamingDataflowPartition_1_Thresholding_Batch_0/sol1/impl/ip not found. Check log under /tmp/finn_dev_ba/code_gen_ipgen_StreamingDataflowPartition_1_Thresholding_Batch_0_hifo_j1w

in

model = model.transform(ZynqBuild(platform = pynq_board, period_ns = target_clk_ns))

QONNX file: QONNX_FC.zip
Torch model: QONNX_FC.zip
Python code: custom-tfc_end2end_example 18.12.2023.zip

Additionally, when I export the QONNX file right after training, I get this warning:

Warning: The shape inference of onnx.brevitas::BipolarQuant type is missing, so it may result in wrong shape inference for the exported graph. Please consider adding it in symbolic function.

Thank you,

iksnagreb commented 11 months ago

Regarding the failing IPGen, you should check the log under the specified path to see what is actually going on. There should be some lines with ERROR: towards the end, which typically tell you exactly what is wrong.
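If the build folder contains many log files, a small script can collect those lines. This is a plain standard-library sketch; the function name `find_error_lines` is hypothetical, and the demo log file and its contents are placeholders, not real tool output.

```python
import os
import tempfile


def find_error_lines(build_dir, marker="ERROR:", extensions=(".log",)):
    """Walk build_dir and return (path, line_number, text) for lines containing marker."""
    hits = []
    for root, _dirs, files in os.walk(build_dir):
        for fname in sorted(files):
            if not fname.endswith(extensions):
                continue
            path = os.path.join(root, fname)
            with open(path, errors="replace") as log_file:
                for line_number, line in enumerate(log_file, start=1):
                    if marker in line:
                        hits.append((path, line_number, line.rstrip()))
    return hits


# Demo on a throwaway directory with a placeholder log file.
with tempfile.TemporaryDirectory() as demo_dir:
    with open(os.path.join(demo_dir, "demo.log"), "w") as demo_log:
        demo_log.write("INFO: starting\nERROR: placeholder message for this demo\n")
    demo_hits = find_error_lines(demo_dir)

for path, line_number, text in demo_hits:
    print(f"{os.path.basename(path)}:{line_number}: {text}")
```

For the failure above, `build_dir` would be the code_gen_ipgen_... folder named in the AssertionError.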

Regarding the seemingly missing shape inference for BipolarQuant: I think it is safe to ignore this. It is probably just an issue during export (when PyTorch is still involved), but as soon as you are relying solely on QONNX, the BipolarQuant should be fully supported by shape inference, if I recall correctly. So, as long as you do not see proper shape inference errors later, just ignore it.

Ba1tu3han commented 2 months ago

> The QONNX export is from brevitas.export import export_qonnx, but you could just try the conversion pathway from QCDQ as well. You might also run into similar issues as the referenced #938, if you are using the bnn example notebooks for other than binarized models, as in your case 4-bit - not sure whether you have already adapted the notebook to account for this, just be cautious.

Hello @iksnagreb

Could you give me some guidance on adapting the notebook for models other than binarized ones? I haven't done it yet.

iksnagreb commented 2 months ago

As mentioned in the referenced discussion #938, you need to replace (or add if you have a mixture of binarized and non-binarized operations) the InferBinaryMatrixVectorActivation by the InferQuantizedMatrixVectorActivation transformation. The ConvertBipolarMatMulToXnorPopcount might show no effect anymore and you might have to change some of the streamlining transformations as well, but this depends on your particular model, so maybe leave this as it is for now.

If you have any concrete issues with this, please share your model graph and the error messages.
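The replace-or-add rule described above can be sketched as a small decision helper. This is only an illustration under a simplifying assumption: a layer is treated as binarized exactly when both its weights and activations are 1 bit. The function name `mvau_inference_transforms` is hypothetical and it returns transformation names as strings rather than calling FINN.

```python
def mvau_inference_transforms(layer_bitwidths):
    """Pick MVAU inference transformation names for (weight_bits, act_bits) pairs."""
    layer_bitwidths = list(layer_bitwidths)
    # Simplifying assumption: "binarized" means 1-bit weights and 1-bit activations.
    has_binary = any(w == 1 and a == 1 for w, a in layer_bitwidths)
    has_quantized = any(not (w == 1 and a == 1) for w, a in layer_bitwidths)

    names = []
    if has_binary:
        names.append("InferBinaryMatrixVectorActivation")
    if has_quantized:
        names.append("InferQuantizedMatrixVectorActivation")
    return names


# A purely 2-bit model only needs the quantized variant.
print(mvau_inference_transforms([(2, 2), (2, 2)]))
# A mixture of binarized and 4-bit layers needs both.
print(mvau_inference_transforms([(1, 1), (4, 4)]))
```

In the notebook, the selected names would correspond to the transformations applied via model.transform(...) from the to_hls module, in place of the single binary one used by the bnn example.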

Ba1tu3han commented 2 months ago

Thank you for your quick reply.

I do not understand how to determine which transformations I should use. I read convert_to_hw_layers and looked at my ONNX graph in Netron, but I still could not work out how to select the transformations. Could you give me a hint or point me to a tutorial?

Does it matter if a flow contains an unnecessary transformation? Does it adversely affect the other transformations?

Should I follow a specific order of transformations, or can the order be arbitrary?

I can compile a W2A2 model with the builder commands, but on the FPGA its accuracy drops from 72% (Brevitas/PyTorch validation accuracy, GPU) to 5%. Building the dataflow does not give any error. Does step_convert_to_hw apply a general set of transformations? Should I also change the order or steps of the transformations, or can I use any bit width directly with the builder?

Additionally, I use the original CNV network from Brevitas, and I want to use a version that I reduced myself by removing layers while keeping the same layer types. For the original CNV with W2A2, I cannot compile correctly.

Bests,