Retrain new networks/datasets using DietCNN on FPGA

Hi @sticktotheend, thank you for your kind words, and sorry for the late reply. I think it is better to start with the convolutional backbone. I will restate the steps from the paper:

1. Train a codebook on the training dataset for Yolo, for example VOC or COCO. I guess a generic codebook trained on ImageNet would also work.
2. Discretize the input image using that codebook.
3. Adapt the architecture. The first conv layer has 7x7x64 filters, followed by a 2x2 maxpool and then a 3x3x192 conv. Since there is no pooling in DietCNN, we need to double the stride of the first conv layer so that its output feature maps (FMs) match the input size the second conv expects.
4. Once the modified DietCNN architecture is chalked out, discretize the filters of the backbone convolutional layers.
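To sanity-check the stride change, here is a small sketch of the output-size arithmetic. The 448x448 input, the stride of 2 on the first conv, and the padding of 3 are assumptions taken from the usual YOLOv1 configuration, not values confirmed in this thread; please adjust them to your network.

```python
def conv_out(size, kernel, stride, pad):
    """Spatial output size of a convolution (floor division, as in most frameworks)."""
    return (size + 2 * pad - kernel) // stride + 1


def pool_out(size, kernel, stride):
    """Spatial output size of an unpadded pooling layer."""
    return (size - kernel) // stride + 1


INPUT = 448  # assumed YOLOv1 input resolution

# Original backbone: 7x7 conv (assumed stride 2, pad 3) followed by 2x2 maxpool, stride 2.
original = pool_out(conv_out(INPUT, kernel=7, stride=2, pad=3), kernel=2, stride=2)

# Pooling-free variant: same 7x7 conv with the stride doubled to 4.
dietcnn = conv_out(INPUT, kernel=7, stride=4, pad=3)

print(original, dietcnn)
```

Under these assumptions both paths give the same spatial size, so the 3x3x192 conv sees feature maps of the shape it expects.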

I would suggest you start by converting one layer first: image --> symbolic image --> symbolic layer (first conv) --> reverse lookup symbol to real number --> rest of Yolo
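As a rough illustration of the two ends of that pipeline (image to symbolic image, and reverse lookup from symbols back to real numbers), here is a minimal NumPy sketch. It is not the code from the DietCNN repository: the helper names, the non-overlapping 2x2 patches, the codebook size of 16, and the plain k-means loop are all assumptions made for illustration, and the symbolic conv layer itself is left out.

```python
import numpy as np


def extract_patches(img, p):
    """Split an (H, W, C) image into non-overlapping p x p patches, one flat row each."""
    h, w, c = img.shape
    h, w = h - h % p, w - w % p  # crop so both sides are divisible by p
    x = img[:h, :w].reshape(h // p, p, w // p, p, c)
    return x.transpose(0, 2, 1, 3, 4).reshape(-1, p * p * c), (h // p, w // p)


def assign(data, centers):
    """Index of the nearest center (squared Euclidean distance) for every row."""
    d = ((data[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
    return d.argmin(axis=1)


def kmeans(data, k, iters=10, seed=0):
    """Plain Lloyd's k-means; returns a (k, dim) codebook."""
    rng = np.random.default_rng(seed)
    centers = data[rng.choice(len(data), size=k, replace=False)].copy()
    for _ in range(iters):
        labels = assign(data, centers)
        for j in range(k):
            members = data[labels == j]
            if len(members):  # keep the old center if a cluster goes empty
                centers[j] = members.mean(axis=0)
    return centers


def to_symbols(img, codebook, p):
    """Image -> grid of codebook indices (the 'symbolic image')."""
    patches, grid = extract_patches(img, p)
    return assign(patches, codebook).reshape(grid)


def from_symbols(symbols, codebook, p, c):
    """Reverse lookup: grid of indices -> real-valued (H, W, C) array."""
    gh, gw = symbols.shape
    x = codebook[symbols].reshape(gh, gw, p, p, c)
    return x.transpose(0, 2, 1, 3, 4).reshape(gh * p, gw * p, c)


# Toy usage with random data standing in for a real image.
P, K = 2, 16
img = np.random.default_rng(1).random((32, 32, 3))
patches, _ = extract_patches(img, P)
codebook = kmeans(patches, K)
symbols = to_symbols(img, codebook, P)
recon = from_symbols(symbols, codebook, P, c=3)
print(symbols.shape, recon.shape)
```

In the real flow the reverse lookup would be applied to the output symbols of the first symbolic conv layer, and the resulting real-valued FMs would be fed to the rest of Yolo.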

In the FPGA driver code, we can put only the first layer: pass in the input and get back the intermediate feature maps (FMs).

I started in this manner and it gave me confidence. Please let me know about your progress. Cheers!

swadeykgp / DietCNN
