swadeykgp / DietCNN

MIT License

Retrain new networks/datasets using DietCNN on FPGA #2

Open sticktotheend opened 1 year ago

sticktotheend commented 1 year ago

Hi~ I've learned a lot from your work. I am a beginner in this field, and I want to retrain a new network such as Yolo using DietCNN and implement it on an FPGA for my hardware acceleration subject. Could you give me some suggestions or steps to follow, if possible?

swadeykgp commented 1 year ago

Hi @sticktotheend, thank you for your kind words, and sorry for the late reply. I think it is better to start with the convolutional backbone first. Here are the steps from the paper, restated:

  1. We can train a codebook on the training dataset for Yolo, maybe VOC or COCO. I guess a generic codebook trained on ImageNet will also work.
  2. Then we can discretize the input image using that codebook.
  3. I can see that the first conv layer has 7x7x64 filters, followed by a 2x2 maxpool and then a 3x3x192 conv. There is no pooling in DietCNN, so we need to double the stride of the first conv layer so that its output FMs (feature maps) match the input size the second conv expects.
  4. Once we have the modified DietCNN architecture chalked out, we can discretize the filters of the backbone convolutional layers.
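
To check the stride change in step 3, here is a small sketch of the output-size arithmetic. The 448x448 input, the original stride of 2, and the padding of 3 are assumptions taken from the usual Yolo v1 layout, not from this thread; `conv_out` is just an illustrative helper name.

```python
def conv_out(size, kernel, stride, padding=0):
    """Spatial output size of a conv or pooling layer (floor convention)."""
    return (size + 2 * padding - kernel) // stride + 1

INPUT = 448  # assumed Yolo v1 input resolution

# Original path: 7x7 conv with stride 2 (assumed), then 2x2 maxpool with stride 2.
after_conv = conv_out(INPUT, kernel=7, stride=2, padding=3)
with_pool = conv_out(after_conv, kernel=2, stride=2)

# Pool-free path: the same 7x7 conv with its stride doubled to 4, no maxpool.
no_pool = conv_out(INPUT, kernel=7, stride=4, padding=3)

print(with_pool, no_pool)
```

Under these assumptions both paths give the same spatial size, so the 3x3x192 conv sees input FMs of the shape it expects. The values will differ from the original network, so retraining or fine-tuning is still needed.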

I would suggest you start by converting one layer first: image --> symbolic image --> symbolic layer (first conv) --> reverse lookup from symbols to real numbers --> rest of Yolo
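
A rough NumPy sketch of that one-layer pipeline, to make the data flow concrete. It is not the DietCNN repository code. The plain k-means codebook, the non-overlapping patches, the real-valued (not yet discretized) filters, and all function names are simplifying assumptions for illustration.

```python
import numpy as np

def extract_patches(image, p):
    """Split an (H, W) image into non-overlapping p x p patches, one per row."""
    h, w = image.shape
    blocks = image[: h - h % p, : w - w % p].reshape(h // p, p, w // p, p)
    return blocks.transpose(0, 2, 1, 3).reshape(-1, p * p)

def discretize(patches, codebook):
    """Map each patch to the index (symbol) of its nearest codebook entry."""
    dist = ((patches[:, None, :] - codebook[None, :, :]) ** 2).sum(axis=2)
    return dist.argmin(axis=1)

def train_codebook(patches, k, iters=10, seed=0):
    """Plain k-means (Lloyd's algorithm), used here as a stand-in trainer."""
    rng = np.random.default_rng(seed)
    codebook = patches[rng.choice(len(patches), size=k, replace=False)].copy()
    for _ in range(iters):
        symbols = discretize(patches, codebook)
        for s in range(k):
            members = patches[symbols == s]
            if len(members):
                codebook[s] = members.mean(axis=0)
    return codebook

rng = np.random.default_rng(0)
image = rng.random((32, 32))   # toy single-channel "image"
P, K, F = 4, 8, 3              # patch size, codebook size, number of filters

patches = extract_patches(image, P)
codebook = train_codebook(patches, K)

# image --> symbolic image: one integer symbol per patch
symbols = discretize(patches, codebook)

# symbolic layer: precompute every (symbol, filter) response once,
# so inference is only a table lookup with no multiplications
filters = rng.standard_normal((F, P * P))   # hypothetical first-layer filters
lut = codebook @ filters.T                  # shape (K, F)
symbolic_out = lut[symbols]                 # shape (num_patches, F)

# reverse lookup symbol --> real numbers, then an ordinary dot product,
# used as a reference to check the table
reference = codebook[symbols] @ filters.T

# real-valued feature map that would be handed to the rest of the network
feature_map = symbolic_out.reshape(32 // P, 32 // P, F)
print(feature_map.shape, np.allclose(symbolic_out, reference))
```

The table lookup matches a convolution over the dequantized image only because the patches here do not overlap and the stride equals the patch size. A real first layer with overlapping windows and discretized filters needs the fuller scheme from the paper.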

In the FPGA driver code we can put just the first layer, pass in the input, and get back the intermediate FMs.

I started in this manner and it gave me confidence. Please let me know about your progress. Cheers!