HW-Emulation taking too long #17

UCLA-VAST / FlexCNN

Closed: anujp10 closed this issue 3 years ago

anujp10 commented 4 years ago

Hello,

I was trying to run the original SDx project with only one memory bank active, i.e., I used "XCL_MEM_DDR_BANK0" for all the input buffers: "buffer_cin", "buffer_weight", "buffer_bias", and "buffer_config". The simulation ran for almost 35ms and still didn't finish. I have the following questions about this use case:

Is the runtime of the "hw-simulation" expected to be 24.7ms?
If I use 1 memory bank as described above, will it take more time?

I also tried the default memory bank configuration (multiple memory banks) as posted in the source code, but building the system configuration failed with a routing error.

Can you please comment on the above questions?

Regards

atefehsz commented 3 years ago

The time to run the HW emulation depends on the number of layers you are using and the resolution of each, but in general it should take much longer than what you observed. For the network in the paper, it will take several days.
Yes, using 1 memory bank reduces performance, because the accesses for all the inputs then go to memory serially.
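
To make the point about serial access concrete, here is a minimal back-of-the-envelope sketch in Python. It assumes that buffers placed on the same bank are transferred one after another while different banks work in parallel. The buffer sizes, the per-bank bandwidth, and the four-bank mapping are hypothetical numbers picked for illustration, not values from FlexCNN or the paper; only the buffer names come from the question above.

```python
def transfer_time_ms(sizes_bytes, bank_of, bandwidth_bytes_per_s):
    """Estimate the time to move all buffers.

    Buffers mapped to the same bank are serialized (their sizes add up);
    banks are assumed to run fully in parallel, so the slowest bank
    determines the total time.
    """
    per_bank = {}
    for name, size in sizes_bytes.items():
        bank = bank_of[name]
        per_bank[bank] = per_bank.get(bank, 0) + size
    return max(per_bank.values()) / bandwidth_bytes_per_s * 1e3


# Hypothetical buffer sizes in bytes (illustration only).
sizes = {
    "buffer_cin": 4_000_000,
    "buffer_weight": 8_000_000,
    "buffer_bias": 1_000_000,
    "buffer_config": 1_000_000,
}

# Hypothetical per-bank bandwidth in bytes per second (illustration only).
BANDWIDTH = 1_000_000_000

# Everything on bank 0, as in the single-bank experiment.
single_bank = {name: 0 for name in sizes}

# One buffer per bank (an assumed mapping, not necessarily the repo default).
four_banks = {name: i for i, name in enumerate(sizes)}

t_single = transfer_time_ms(sizes, single_bank, BANDWIDTH)
t_multi = transfer_time_ms(sizes, four_banks, BANDWIDTH)
print(f"single bank: {t_single:.1f} ms, four banks: {t_multi:.1f} ms")
```

Under these assumptions the single-bank time is driven by the sum of all buffer sizes, while the multi-bank time is driven by the largest buffer alone. Real designs overlap transfers with compute, so treat this only as an intuition for why one bank is slower.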

What is your target FPGA? You are probably overfitting it. You can either reduce the resource consumption or move to a bigger FPGA.

anujp10 commented 3 years ago

Thanks for your reply. I wanted to ask: if I am using the same network as in the paper, should the expected runtime be 24.7ms? I tried running the network from the paper with 1 memory bank and ran it for 35ms, but it didn't finish. Also, the FPGA I used was the VU9P (AWS F1 instance), with the same resources as in the paper. So I don't think I am overfitting it, because I am using the same resources as in the paper.

atefehsz commented 3 years ago

24.7ms is the runtime when you run it on-board, not the emulation time. The HW emulation takes much longer. When you use the FPGA on AWS, you should decrease the resource utilization, as the tool uses more logic for the interface.