Error on compiling a yolov4

Hi, I'm trying to compile a yolov4 model trained on a custom dataset. And I'm running into the following error:

**************************************************
* VITIS_AI Compilation - Xilinx Inc.
**************************************************
[INFO] Namespace(inputs_shape=None, layout='NCHW', model_files=['yolov4_quantized/deploy.caffemodel'], model_type='caffe', out_filename='./compiled/yolov4_org.xmodel', proto='yolov4_quantized/deploy.prototxt')
[INFO] caffe model: yolov4_quantized/deploy.caffemodel
[INFO] caffe model: yolov4_quantized/deploy.prototxt
[INFO] parse raw model     :100%|███████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 382/382 [00:16<00:00, 22.53it/s]                
[INFO] infer shape (NCHW)  :100%|███████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 382/382 [00:00<00:00, 9200.99it/s]              
[INFO] infer shape (NHWC)  :100%|███████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 382/382 [00:00<00:00, 5438.73it/s]              
[INFO] generate xmodel     :100%|███████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 382/382 [00:00<00:00, 845.43it/s]               
[INFO] generate xmodel: /workspace/yolov4_atlas/compiled/yolov4_org.xmodel
[UNILOG][INFO] The compiler log will be dumped at "/tmp/vitis-ai-user/log/xcompiler-20210323-071025-7300"
[UNILOG][INFO] Compile mode: dpu
[UNILOG][INFO] Debug mode: function
[UNILOG][INFO] Target architecture: DPUCZDX8G_CUSTOMIZED
[UNILOG][INFO] Graph name: deploy, with op num: 822
[UNILOG][INFO] Begin to compile...
module_infer = 0, counter_m[module_infer] = 10, counter_p[module_infer] = 10
module_idx = 0, counter_m[module_idx] = 10, counter_p[module_idx] = 10
module_idx = 1, counter_m[module_idx] = 0, counter_p[module_idx] = 0
module_idx = 2, counter_m[module_idx] = 8, counter_p[module_idx] = 8
module_idx = 3, counter_m[module_idx] = 0, counter_p[module_idx] = 0
[UNILOG][FATAL][XCOM_PM_FAIL][The compiler occurs an error when generating instructions, please contact us.] 
*** Check failure stack trace: ***
Aborted (core dumped)

To give you more details about what I've done. I somehow follow this tutorial (part 3), replacing the VOC dataset by my own, and then train a model using Darknet .

I then converted the model to caffe using the following command: python /opt/vitis_ai/conda/envs/vitis-ai-caffe/bin/convert.py yolov4.cfg yolov4_last.weights yolov4.prototxt yolov4.caffemodel

I changed the prototxt as specified in the tutorial and I run quantization as follow: vai_q_caffe quantize -model yolov4.prototxt -calib_iter 100 -weights yolov4.caffemodel -sigmoided_layers layer135-conv,layer146-conv,layer157-conv -output_dir yolov4_quantized/ -keep_fixed_neuron

For compilation, I add to remove a duplicated bottom field in one of the layer of the deploy.prototxt because of this error:

AssertionError: [ERROR] Invalid prototxt file: duplicate names found in the bottom field of layer (name: layer110-concat) in prototxt file: ['layer105-conv', 'layer107-conv', 'layer105-conv', 'layer104-conv']

Finally I ran compilation: vai_c_caffe --prototxt yolov4_quantized/deploy.prototxt --caffemodel yolov4_quantized/deploy.caffemodel --arch /opt/vitis_ai/compiler/arch/DPUCZDX8G/ULTRA96/arch.json --output_dir ./compiled --net_name yolov4 --options "{'save_kernel':''}" And got the error above.

FYI, I did try to compile for a ZCU102 and got the same error.

I did an additional test: I quantize without using the option -keep_fixed_neuron . Then the compilation was not erroring out, but I got zero DPU subgraph as shown below:

**************************************************
* VITIS_AI Compilation - Xilinx Inc.
**************************************************
[INFO] Namespace(inputs_shape=None, layout='NCHW', model_files=['yolov4_quantized/deploy.caffemodel'], model_type='caffe', out_filename='./compiled/yolov4_org.xmodel', proto='yolov4_quantized/deploy.prototxt')
[INFO] caffe model: yolov4_quantized/deploy.caffemodel
[INFO] caffe model: yolov4_quantized/deploy.prototxt
[INFO] parse raw model     :100%|███████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 253/253 [00:16<00:00, 15.02it/s]                
[INFO] infer shape (NCHW)  :100%|███████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 253/253 [00:00<00:00, 4397.95it/s]              
[INFO] infer shape (NHWC)  :100%|███████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 253/253 [00:00<00:00, 3426.66it/s]              
[INFO] generate xmodel     :100%|███████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 253/253 [00:00<00:00, 574.03it/s]               
[INFO] generate xmodel: /workspace/yolov4_atlas/compiled/yolov4_org.xmodel
[UNILOG][INFO] The compiler log will be dumped at "/tmp/vitis-ai-user/log/xcompiler-20210323-072413-7395"
[UNILOG][INFO] Target architecture: DPUCZDX8G_ISA0_B4096_MAX_BG2
[UNILOG][INFO] Compile mode: dpu
[UNILOG][INFO] Debug mode: function
[UNILOG][INFO] Target architecture: DPUCZDX8G_ISA0_B4096_MAX_BG2
[UNILOG][INFO] Graph name: deploy, with op num: 693
[UNILOG][INFO] Begin to compile...
[UNILOG][INFO] Total device subgraph number 2, DPU subgraph number 0
[UNILOG][INFO] Compile done.
[UNILOG][INFO] The meta json is saved to "/workspace/yolov4_atlas/./compiled/meta.json"
[UNILOG][INFO] The compiled xmodel is saved to "/workspace/yolov4_atlas/./compiled/yolov4.xmodel"
[UNILOG][INFO] The compiled xmodel's md5sum is 65617d0a8a96e1d0d4bffcae6fc4a574, and been saved to "/workspace/yolov4_atlas/./compiled/md5sum.txt"

Please find attached the following files (archived in a zip):

darknet.cfg: darknet config file use to train the yolov4 model on my own dataset
yolov4_before_quantization.prototxt: Caffe prototxt after conversion from darknet to caffe
deploy_with_keep_fixed_neurons.protoxt: Caffe prototxt after quantization using the keep_fixed_neurons options
deploy_without_keep_fixed_neurons.protoxt: Caffe prototxt after quantization not using the keep_fixed_neurons options config_files.zip

Hi @romaintha , Thanks so much for the detailed explanation and attachments. The yolov4 tutorial was originally written for VAI 1.2 and is in the process of being updated. One of the key issues that we encountered in VAI 1.3 was also a compilation error. I think the issue stems from the removal of the SPP module and resulting model architecture.

Rather than remove the SPP module, I would recommend re-inserting it and changing the kernel sizes to 3, 5, and 7 as the model in our model zoo does. I've updated your cfg file here. Could you please retrain the model with these changes and then try the flow again? Please make sure to use the -keep_fixed_neuron as well. I would also recommend adding the -method 1 switch as well to set the quantization mode.

Thanks and hope this helps!

Cdarknet.zip

Hi @jcory-xilinx, So I gave it a try changing the SPP module as you suggested by using the cfg you sent. I don't get the same compilation error now:

**************************************************
* VITIS_AI Compilation - Xilinx Inc.
**************************************************
[INFO] Namespace(inputs_shape=None, layout='NCHW', model_files=['yolov4_quantized/deploy.caffemodel'], model_type='caffe', out_filename='./compiled/yolov4_org.xmodel', proto='yolov4_quantized/deploy.prototxt')
[INFO] caffe model: yolov4_quantized/deploy.caffemodel
[INFO] caffe model: yolov4_quantized/deploy.prototxt
[INFO] parse raw model     :100%|███████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 386/386 [00:17<00:00, 22.59it/s]                
[INFO] infer shape (NCHW)  :100%|███████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 386/386 [00:00<00:00, 6148.21it/s]              
[INFO] infer shape (NHWC)  :100%|███████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 386/386 [00:00<00:00, 5258.41it/s]              
[INFO] generate xmodel     :100%|███████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 386/386 [00:00<00:00, 834.69it/s]               
[INFO] generate xmodel: /workspace/yolov4_atlas/compiled/yolov4_org.xmodel
[UNILOG][INFO] The compiler log will be dumped at "/tmp/vitis-ai-user/log/xcompiler-20210324-061937-824"
[UNILOG][INFO] Compile mode: dpu
[UNILOG][INFO] Debug mode: function
[UNILOG][INFO] Target architecture: DPUCZDX8G_CUSTOMIZED
[UNILOG][INFO] Graph name: deploy, with op num: 826
[UNILOG][INFO] Begin to compile...
[UNILOG][FATAL][XCOM_DATA_OUTRANGE][Data value is out of range!] 
*** Check failure stack trace: ***
Aborted (core dumped)

I read about this error and if I understand correctly it means that the board architecture is not able to handle such a model. Which is kind of weird because I have been compiling (and using for inference on the ULTRA96V2 board) a very similar model (pretrained) from your model zoo: https://www.xilinx.com/bin/public/openDownload?filename=dk_yolov4_coco_416_416_60.1G_1.3.zip

Unfortunately, I cannot find the darknet .cfg file you have been using to train this model from the zoo. If you could provide it, I could eventually tune it to fit my custom dataset.

Thanks

Hi @romaintha , What weights file did you use with this model when attempting to re-quantize/re-compile? Did you retrain the model before attempting to recompile it? The new error you are seeing I think might have to do with quantization issues (not necessarily related to a limitation on the number of filters for a particular DPU architecture). I suspect it could be that if the model architecture was changed and not retrained, you might see something like this. You may have also received warnings during quantization if this was the case.

Regarding the official model zoo model cfg file, I don't have a copy of this, but I have trained an 80 class yolov4-leaky relu model in darknet (on COCO), then converted it with the darknet script and was able to quantize and compile it using a similar approach to the tutorial. I also diff'd my converted prototxt file vs. the model zoo prototxt and the only resulting difference when using the attached cfg file below was the input size. Please see the cfg file below for reference.

yolov4-leaky.zip

Furthermore, I was able to compile this model to target a B1600, a B2304, as well as a B4096, so I don't think it is a DPU size limitation.

I was also made aware of a different workaround by the tutorial author which I've attached below. This resolution is a modification to the route layers in SPP module (The original version use -1,-3,-5,-6 for the route layer)

spp_module_modification

Thanks and hope this helps!

Hi @romaintha Did you use the -keep_fixed_neuron option when quantizing the model, for your latest results?

Can you share the prototxt file of the converted darknet model?

Hi @jcory-xilinx and @jimheaton Indeed the issue was in the quantization. I badly set up the input data with the calib file. But I am now running into another problem :( .

As you can see the compilation is successful (and I get 1DPU subgraph):

(vitis-ai-caffe) Vitis-AI /workspace/yolov4_atlas > vai_c_caffe --prototxt yolov4_quantized/deploy.prototxt --caffemodel yolov4_quantized/deploy.caffemodel --arch /opt/vitis_ai/compiler/arch/DPUCZDX8G/ULTRA96/arch.json --output_dir ./compiled --net_name yolov4 --options "{'save_kernel':''}"
**************************************************
* VITIS_AI Compilation - Xilinx Inc.
**************************************************
[INFO] Namespace(inputs_shape=None, layout='NCHW', model_files=['yolov4_quantized/deploy.caffemodel'], model_type='caffe', out_filename='./compiled/yolov4_org.xmodel', proto='yolov4_quantized/deploy.prototxt')
[INFO] caffe model: yolov4_quantized/deploy.caffemodel
[INFO] caffe model: yolov4_quantized/deploy.prototxt
[INFO] parse raw model     :100%|███████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 386/386 [00:18<00:00, 21.30it/s]                
[INFO] infer shape (NCHW)  :100%|███████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 386/386 [00:00<00:00, 5776.27it/s]              
[INFO] infer shape (NHWC)  :100%|███████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 386/386 [00:00<00:00, 4590.31it/s]              
[INFO] generate xmodel     :100%|███████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 386/386 [00:00<00:00, 723.09it/s]               
[INFO] generate xmodel: /workspace/yolov4_atlas/compiled/yolov4_org.xmodel
[UNILOG][INFO] The compiler log will be dumped at "/tmp/vitis-ai-user/log/xcompiler-20210325-065608-954"
[UNILOG][INFO] Compile mode: dpu
[UNILOG][INFO] Debug mode: function
[UNILOG][INFO] Target architecture: DPUCZDX8G_CUSTOMIZED
[UNILOG][INFO] Graph name: deploy, with op num: 826
[UNILOG][INFO] Begin to compile...
[UNILOG][INFO] Total device subgraph number 5, DPU subgraph number 1
[UNILOG][INFO] Compile done.
[UNILOG][INFO] The meta json is saved to "/workspace/yolov4_atlas/./compiled/meta.json"
[UNILOG][INFO] The compiled xmodel is saved to "/workspace/yolov4_atlas/./compiled/yolov4.xmodel"
[UNILOG][INFO] The compiled xmodel's md5sum is 97e03899915345f729a98527d1879abb, and been saved to "/workspace/yolov4_atlas/./compiled/md5sum.txt"

But when I am trying to run inference using this model on the board (ULTRA96V2, with PYNQ 2.6 image), the board freezes. While if I run the yolov4 from the zoo, I don't run into any issue. For the Zoo Yolov4, I directly compiled the provided quantized model. Is there any chances that a bad quantization can make a board freeze?

I attached here the command I run, its log output, and the input protoxt (created via darknet to caffe conversion) vai_q_caffe quantize -model yolov4.prototxt -calib_iter 1000 -weights yolov4.caffemodel -sigmoided_layers layer135-conv,layer146-conv,layer157-conv -output_dir yolov4_quantized/ -keep_fixed_neuron

quantization.zip

Thanks,

Hi @romaintha, I noticed in your quantize command, you are note using -method=1. I would recommend adding that flag as it will make a difference in the quantized model accuracy.

Also it looks like maybe you are using the version without the SPP module as this changes the number of layers in the model because you've specified layer135-conv,layer146-conv,layer157-conv as the output layers. In our Yolov4 model, the output layers are layer138-conv,layer149-conv,layer160-conv since the SPP module adds some layers.

If those are indeed your output layer names and you are using the Vitis AI Library code to run the model on the U96, you'll also need to update the models prototxt file (typically located on the target in the same directory as the *.xmodel file). This file will need to be updated with the correct "layer_name" strings as well as the correct number of classes.

Also, if you made any changes to the anchors, you'll need to update the bias values as well. Can you ensure you are using the correct values in the model's prototxt file?

E.G.)

model { name: "yolov4" kernel { name: "yolov4" mean: 0.0 mean: 0.0 mean: 0.0 scale: 0.00390625 scale: 0.00390625 scale: 0.00390625 } model_type : YOLOv3 yolo_v3_param { num_classes: 28 anchorCnt: 3 layer_name: "135" layer_name: "146" layer_name: "157" conf_threshold: 0.25 nms_threshold: 0.45 biases: 12 biases: 16 biases: 19 biases: 36 biases: 40 biases: 28 biases: 36 biases: 75 biases: 76 biases: 55 biases: 72 biases: 146 biases: 142 biases: 110 biases: 192 biases: 243 biases: 459 biases: 401 test_mAP: false } is_tf : false }

Thanks and hope this helps,

You are right, once we modified the SPP module, I did not update the output nodes.

But I requantize using this command: vai_q_caffe quantize -model yolov4.prototxt -calib_iter 1000 -weights yolov4.caffemodel -sigmoided_layers layer138-conv,layer149-conv,layer160-conv -output_dir yolov4_quantized/ -keep_fixed_neuron -method 1

Compilation is still ok, but the inference on board is still freezing the board. On the board I m using the VART with python (pynq-DPU), something looking like that:

from pynq_dpu import DpuOverlay
overlay = DpuOverlay("dpu.bit")
overlay.load_model("yolov4.xmodel")
dpu_runner = overlay.runner

input_tensors = dpu_runner.get_input_tensors()
output_tensors = dpu_runner.get_output_tensors()
input_height = input_tensors[0].dims[1]
input_width = input_tensors[0].dims[2]
batchSize = input_tensors[0].dims[0]
shapeIn = (batchSize,) + tuple([input_tensors[0].dims[i] for i in range(input_tensors[0].ndim)][1:])

img = preprocess_image(frame, (input_height, input_width))

input_data = [np.empty((shapeIn), dtype=np.float32, order='C')]
output_data = [np.empty(t.dims, dtype=np.float32, order='C') for t in output_tensors]

imageRun = input_data[0]
imageRun[0, ...] = img
job_id = dpu_runner.execute_async(input_data, output_data)
dpu_runner.wait(job_id)

Right after the dpu_runner.execute_async command, the board freezes. I'm using the exact same code to run the inference with the yolo from model zoo without any issue.

I also checked the post quantization prototxt differences between my custom trained model and the one from the zoo. Beside the negative slope param of the relu and the num_output of some filter params (due to different number of classes), the rest is identical. So the only difference could be the weights? But then I'm surprised, because if the weights were wrong, I would expect performance to be bad, but not the board freezing...

I uploaded here the post quantization prototxt and weights if that can help.

Hi @romaintha , I was able to quantize and compile your model and was able to run it on my ZCU102 (no freezing). When you say it is freezing, can you elaborate a bit on that? Does your Ultra96 reboot? Does Linux hang or just the application?

Note that there is an issue with the Ultra96 when the PL is heavily loaded that can cause the power supply to draw too much and put it in a current limit state which typically results in the board rebooting: https://www.element14.com/community/thread/72736/l/ultra96-v2-errata

If the software is simply freezing, I think it could be an issue where the post processing is trying to handle 80 classes when your model was trained on a different number of classes.

How many classes are you training on? I assume it is 28 since the final layer output is 99 filers (divide by 3, then subtract 5 to get 28).

Did you update the number of classes in software for the post processing?

Can you try running it with the vitis AI libraries using the attached prototxt file?

Thanks, yolov4_prototxt.zip

Hi @jcory-xilinx No the Ultra96 does not reboot by itself, Linux hangs completely (I can't even ssh to it, and the heartbeat led stop blinking).

It might be due to this current limit. I ll check that out.

For sure it is not due to the postprocessing because it hangs at the dpu_runner.execute_async command, which is simply outputing the yolo conv layers. The post processing is done after.

I badly set up the input data with the calib file

Hi @romaintha, @jcory-xilinx I am also quantizing my yolov4-custom and got same problem with no subgraph DPU with vitis AI 1.3. According to you note, it looks that at first I should care about calib data for quantization. Could you tell me what is your badly calib data setting?

Currently, I using some validation data and set the label to be 0 according to Vitis AI 1.3 User Guide. But no DPU subgraph is generated.

I'm also reading both your comments to make my yolov4-custom model work on Ultra96V2 with Vitis AI 1.3. Hope that I can contribute something and we can solve this problem for yolov4 (with Vitis AI 1.3 and Ultra96V2).

Thank you very much!

@hoadv-qh the error with no DPU subgraph was because I was not using the -keep_fixed_neuron option during quantization

Hi @hoadv-qh, if you are getting 0 DPU subgraphs, and are using the Caffe flow, it probably means that you did not include the: -keep_fixed_neuron option when you quantized.

This is something new needed for VAI-1.3, and we are still in the process of updating the tutorial to add this.

Hi @romaintha, @jimheaton

Thank you very much!

the error with no DPU subgraph was because I was not using the -keep_fixed_neuron option during quantization

My background is quite same with @romaintha from the first. I work with Caffe flow. My model has problem with duplicated layer name error due to SPP layer config.

my SPP layer (in *.cfg file)
[route]
layers=-1,-3,-5,-6
### End SPP ###

I fixed duplicated layer name by removing 1 duplicated layer105-conv in *.prototxt file, outputted by converter. After fixing duplicated layer error. I quantized and got result as below.

Without -keep_fixed_neuron, no DPU subgraph (as I said)
With -keep_fixed_neuron, got error about index (message similar format as first note of this issue discussion)

it probably means that you did not include the: -keep_fixed_neuron option when you quantized.

When reading Vitis AI 1.3 User Guide, it is said that we should specify this option for DPUCAHX8H. I think Ultra96V2's DPU type is DPUCZDX8G(?), so no need -keep_fixed_neuron?

According to @jcory-xilinx noted about modification for SPP layer, Should I change SPP config in *.cfg file?

Before:
    [route]
    layers=-1,-3,-5,-6

After:
    [route]
    layers=-1,-3,-4,-6

Can I just convert .cfg to .prototxt with original SPP layer config, and then just modify *.prototxt to use for next steps (quantize/compile)?

Hi @jcory-xilinx So I did a test on the Vitis AI 1.3 Ultra96 image. I had to change the fingerprint for compilation, but I observed the exact same behavior than on the Pynq image using the VART python runtime: everything hangs, the heartbeat led stop blinking and I have to manually restart the board to get access to anything.

For more details here is what I did:

I created a folder /usr/share/vitis_ai_library/models/yolov4_custom
In this folder I put my compiled model yolov4_custom.xmodel
I copied in this folder the prototxt from /usr/share/vitis_ai_library/models/yolov4_leaky_spp_m/ that I simply modfied with the number of classes
Then I run: ./test_jpeg_yolov4 yolov4_custom test_image.jpg

FYI, I wil try to upgrade the PMIC to check whether this is the root cause but this kind of operation is more tricky for me (I m not an hardware engineer). Also I have my doubt that this could be it since the board should reboot if the current limit is reached as far as I understood. But it is not what I observed. The board becomes fully unresponsive and I have to manually reboot it.

Hi @romaintha, I just downloaded the Ultra96v2 board image and recompiled the model to target this architecture. I was able to reproduce the hang issue and I think the cause was that I initially used the v1.3.1 GPU docker to compile the model.

In the instructions provided by Mario Bergeron below, he indicates to use a specific version of the docker corresponding to the v1.3.0 release.

https://www.hackster.io/AlbertaBeef/vitis-ai-1-3-flow-for-avnet-vitis-platforms-cd0c51

Specifically see the following steps:

6.1 If not done so already, pull version 1.3.411 of the docker container with the following command:

docker pull xilinx/vitis-ai:1.3.411 6.2 Launch version 1.3.411 of the Vitis-AI docker from the Vitis-AI directory:

$ cd $VITIS_AI_HOME $ sh -x docker_run.sh xilinx/vitis-ai:1.3.411

The model compilation process in this case will produce a file that is around 178MB, whereas the 1.3.1 docker will produce a model that is 67MB. I have now verified that the 1.3.1 compiled model hangs on the target whereas the 1.3.0 model (using the xilinx/vitis-ai:1.3.411 docker for compilation) runs correctly.

Thanks and hope this helps!

Hi @jcory-xilinx.

Base on your recommendation above, I modified SPP layer from "-1, -3, -5, -6" to "-1, -3, -4", -6" and re-trained my yolov4 model. Even though I added option -keep_fixed_neuron for quantization, I still got when compiled for Ultra96V2 target with Vitis AI 1.3. yolov4_modifiedSPP__compile_error

Is any way to debug what is not correct and fix it with above messages of compiler?

For your futher analysis, my quantization options are as below. yolov4_quantize_options

I reduced number of calib iteration to 16 for quickly checking the possibility of model. (if everything work, I will increase for better precision)
About sigmoid layers, I specified output layers (from messages of conversion, when converted from darknet to caffee)

I have additional questions about quantization.

(1) I got message of "loss: 0" when quantized, is it normal?
(2) I specified label is "0" for all images used for calib (according to Vitis AI 1.3 User Guide, it is just to correct the syntax of file). Is it OK to do that or I need to specify correct labels for images?

Hi @jcory-xilinx

I described error in previous note even modifying SPP layer as your comment. But, when I tried again with docker ver1.3.411, I could compile with 1 DPU subgraph.

[UNILOG][INFO] The compiler log will be dumped at "/tmp/vitis-ai-user/log/xcompiler-20210326-214619-122"
[UNILOG][INFO] Compile mode: dpu
[UNILOG][INFO] Debug mode: function
[UNILOG][INFO] Target architecture: DPUCZDX8G_ISA0_B2304_MAX_BG2
[UNILOG][INFO] Graph name: deploy, with op num: 823
[UNILOG][INFO] Begin to compile...
[UNILOG][INFO] Total device subgraph number 5, DPU subgraph number 1
[UNILOG][INFO] Compile done.

Looks that docker version also relates to error when compile model. Note.

I didn't change anything except docker version compare to my note on compilation error above. (same model, same quantization options, same compilation options)

Previously, I used Vitis AI docker with info as below.

Docker Image Version:  1.3.598
Build Date: 2021-02-18
VAI_ROOT: /opt/vitis_ai

Hi @jcory-xilinx Indeed, recompiling using the docker version you specified solve the issue. Thanks a lot for the support. Is there any place where this kind of information is centralized?

Also is there any issue in your opinion quantizing the model using the GPU docker version (it is way faster with GPU...) and then compiling using the CPU one?

Hi @romaintha , No problem and happy to hear this resolved your issue. I think in general it should be fine to quantize with the 1.3.1 or 1.3.2 version of the GPU docker as long as as you compile with the correct version of the docker. Regarding documentation, the main places are as follows. The first is a centralized on-line repository which gets updated by major release, the second is the git repo readme's which are the most update based on minor release/current master.

Vitis AI and DPU On-line documentation (https://www.xilinx.com/html_docs/vitis_ai/1_3/index.html)
Vitis AI Github (https://github.com/Xilinx/Vitis-AI)
UG1414 Vitis AI User Guide pdf (https://www.xilinx.com/support/documentation/sw_manuals/vitis_ai/1_3/ug1414-vitis-ai.pdf)
PG338 DPU User Guide pdf (https://www.xilinx.com/support/documentation/ip_documentation/dpu/v3_3/pg338-dpu.pdf)

Hi @hoadv-qh , Regarding your two questions: (1) I got message of "loss: 0" when quantized, is it normal?

Yes this is normal. loss is reported even though no loss is occurring (generally only non-zero during training).

(2) I specified label is "0" for all images used for calib (according to Vitis AI 1.3 User Guide, it is just to correct the syntax of file). Is it OK to do that or I need to specify correct labels for images?

yes '0' is fine for the label. labels are not actually used during quantize calibration step.

It sounds like the compile step is now working, however, if you would like to have your issue with compilation debugged further, you can either open a service request at our service portal (https://www.xilinx.com/support.html), or you can attach the files here (both quantize and compile scripts as well as prototxt and caffemodel files are needed).

Thanks and hope this helps,

@romaintha

Sorry, I have got same issue with you with latest version 1.3 (and followed latest guideline)

[INFO] Namespace(inputs_shape=None, layout='NCHW', model_files=['yolov4_quantized/deploy.caffemodel'], model_type='caffe', out_filename='yolov4_compiled/dpu_yolov4_voc_org.xmodel', proto='yolov4_quantized/deploy.prototxt')
[INFO] caffe model: yolov4_quantized/deploy.caffemodel
[INFO] caffe model: yolov4_quantized/deploy.prototxt
[INFO] parse raw model     :100%|███████████████████████████████████████████████████████████████████| 383/383 [00:10<00:00, 38.19it/s]
[INFO] infer shape (NCHW)  :100%|███████████████████████████████████████████████████████████████████| 383/383 [00:00<00:00, 7994.12it/s]
[INFO] infer shape (NHWC)  :100%|███████████████████████████████████████████████████████████████████| 383/383 [00:00<00:00, 6552.56it/s]
[INFO] generate xmodel     :100%|███████████████████████████████████████████████████████████████████| 383/383 [00:00<00:00, 651.84it/s]
[INFO] generate xmodel: /workspace/07-yolov4-tutorial/yolov4_compiled/dpu_yolov4_voc_org.xmodel
[UNILOG][INFO] The compiler log will be dumped at "/tmp/vitis-ai-user/log/xcompiler-20210414-082248-293"
[UNILOG][INFO] Target architecture: DPUCZDX8G_ISA0_B2304_MAX_BG2
[UNILOG][INFO] Compile mode: dpu
[UNILOG][INFO] Debug mode: function
[UNILOG][INFO] Target architecture: DPUCZDX8G_ISA0_B2304_MAX_BG2
[UNILOG][INFO] Graph name: deploy, with op num: 823
[UNILOG][INFO] Begin to compile...
module_infer = 0, counter_m[module_infer] = 17, counter_p[module_infer] = 17
module_idx = 0, counter_m[module_idx] = 17, counter_p[module_idx] = 17
module_idx = 1, counter_m[module_idx] = 0, counter_p[module_idx] = 0
module_idx = 2, counter_m[module_idx] = 0, counter_p[module_idx] = 0
module_idx = 3, counter_m[module_idx] = 0, counter_p[module_idx] = 0
[UNILOG][FATAL][XCOM_PM_FAIL][The compiler occurs an error when generating instructions, please contact us.]
*** Check failure stack trace: ***
Aborted (core dumped)

I think you solved it. Could you please tell me how to solve it?

Indeed the issue was in the quantization. I badly set up the input data with the calib file. But I am now running into another problem :( .

Below is modification in my prototxt file (images store validation images)

layer {
    name: "data"
    type: "ImageData"
    top: "data"
    top: "label"
    include {
      phase: TRAIN
    }
    transform_param {
      mirror: false
      yolo_height:416  #change height according to Darknet model
      yolo_width:416   #change width according to Darknet model
    }
    image_data_param {
      source: "model_data/calib.txt"  #list of calibration imaages     
      root_folder: "images/" #path to calibartion images

      batch_size: 1
      shuffle: false
    }
}

Thanks in advance!

Hi @quyetvk There were several issues within this "issue" The first compilation error I got was because I remove the SPP block from the darknet config. You might have done the same.

Hi @romaintha

Thank you for your quick reply.

There were several issues within this "issue" The first compilation error I got was because I remove the SPP block from the darknet config. You might have done the same.

No, I followed the latest tutorial to modify darknet config (that is same as https://github.com/Xilinx/Vitis-AI/issues/346#issuecomment-805730810) I noticed that maybe the issue related to docker version. I will do retry with another docker version.

Hi @romaintha, I noticed in your quantize command, you are note using -method=1. I would recommend adding that flag as it will make a difference in the quantized model accuracy.

Also it looks like maybe you are using the version without the SPP module as this changes the number of layers in the model because you've specified layer135-conv,layer146-conv,layer157-conv as the output layers. In our Yolov4 model, the output layers are layer138-conv,layer149-conv,layer160-conv since the SPP module adds some layers.

If those are indeed your output layer names and you are using the Vitis AI Library code to run the model on the U96, you'll also need to update the models prototxt file (typically located on the target in the same directory as the *.xmodel file). This file will need to be updated with the correct "layer_name" strings as well as the correct number of classes.

Also, if you made any changes to the anchors, you'll need to update the bias values as well. Can you ensure you are using the correct values in the model's prototxt file?

E.G.)

model { name: "yolov4" kernel { name: "yolov4" mean: 0.0 mean: 0.0 mean: 0.0 scale: 0.00390625 scale: 0.00390625 scale: 0.00390625 } model_type : YOLOv3 yolo_v3_param { num_classes: 28 anchorCnt: 3 layer_name: "135" layer_name: "146" layer_name: "157" conf_threshold: 0.25 nms_threshold: 0.45 biases: 12 biases: 16 biases: 19 biases: 36 biases: 40 biases: 28 biases: 36 biases: 75 biases: 76 biases: 55 biases: 72 biases: 146 biases: 142 biases: 110 biases: 192 biases: 243 biases: 459 biases: 401 test_mAP: false } is_tf : false }

Thanks and hope this helps,

Hey, How can I find the right bias values for a self-trained model used pytorch?

Hi @romaintha Has this issue been solved?

Hi @romaintha Since we haven't received your reply for a long time, we assume you have solved this issue and I'm going to close it. If you still have any questions, please feel free to reopen it. Thank you very much.

Xilinx / Vitis-AI

Error on compiling a yolov4 #346