doonny / PipeCNN

An OpenCL-based FPGA Accelerator for Convolutional Neural Networks
Apache License 2.0

Data stream of the maxPool Kernel #105

Closed Frankzd closed 3 years ago

Frankzd commented 5 years ago

Hi @doonny, first of all, I have to say what great work you have done. I have learned a lot from your paper as well as from the code you open-sourced.

But as I went through the source code, I found that the maxPool kernel actually reads its data from global memory instead of from the channel.

Could you please tell me if there is anything I misunderstood?

thx!

saman-aghazadeh commented 5 years ago

I remember maxPool sitting between conv and memWr, consuming the convolution output from a channel. In my profiling, maxPool stalled a lot on that channel, which may be why pooling now runs after memWr. I'm not sure that really helps, since the stall may just propagate from the pooling to memWr. I still have to profile this new version of the code to make sure.
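To illustrate the concern about a stall propagating upstream, here is a toy cycle-level model in plain C. It is only a sketch under stated assumptions: a producer that tries to push one item per cycle into a bounded FIFO, and a consumer that pops one item every `consumer_period` cycles. The function name, parameters, and FIFO depths are hypothetical and are not taken from PipeCNN.

```c
#include <assert.h>

/* Toy model (hypothetical, not PipeCNN code): counts the cycles in which
 * a producer cannot push because a bounded FIFO of `depth` entries is full.
 * The producer attempts one push per cycle; the consumer pops one item on
 * every cycle that is a multiple of `consumer_period`. */
int count_producer_stalls(int depth, int consumer_period, int cycles)
{
    assert(depth > 0 && consumer_period > 0 && cycles >= 0);

    int occupancy = 0; /* items currently held in the FIFO */
    int stalls = 0;    /* cycles in which the push was blocked */

    for (int t = 0; t < cycles; t++) {
        /* The consumer drains first, if it is its turn and data is available. */
        if (t % consumer_period == 0 && occupancy > 0) {
            occupancy--;
        }
        /* The producer pushes, or stalls when the FIFO is full. */
        if (occupancy < depth) {
            occupancy++;
        } else {
            stalls++;
        }
    }
    return stalls;
}
```

In this model, a consumer that keeps pace (`consumer_period` of 1) never stalls the producer, while a consumer half as fast eventually stalls it on every other cycle. A deeper FIFO only delays the first stall. That matches the intuition that moving the slow stage elsewhere in the pipeline relocates the backpressure rather than removing it.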

Frankzd commented 5 years ago

@saman-aghazadeh Thank you for your reply!

I also have another question. In the memWrite kernel, pool_on_signal is sent to tell maxPool to start pooling. Here is the code:

    if(pool_on == 1) {
        if((global_x == out_dim1 - 1) && (global_y > 0) && ((global_y - pool_size + 1) % 2 == 0) && (local_z == LANE_NUM - 1)) {
            write_channel_intel(pool_sync_ch, pool_on_signal);
        }
    }

My question is: why is pool_on_signal sent every two iterations in the y dimension? I would expect every three if pool_size is 3. Do you have any idea about this? thx!
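One possible reading, offered as an assumption rather than a confirmed answer from the authors, is that the hard-coded 2 is a pooling stride. A 3-row window with stride 2 (overlapping pooling) completes on rows 2, 4, 6, and so on. The plain C sketch below reproduces the quoted condition with the stride as a parameter, so both cases can be compared. `pool_sync_due`, `pool_stride`, and `lane_num` are hypothetical names introduced for illustration.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical host-side replica of the condition quoted above, with the
 * literal 2 replaced by a `pool_stride` parameter (an assumption about what
 * that constant means). Returns true when the sync signal would be sent. */
bool pool_sync_due(int global_x, int global_y, int local_z,
                   int out_dim1, int pool_size, int pool_stride, int lane_num)
{
    assert(pool_stride > 0);

    return (global_x == out_dim1 - 1) &&                        /* last column of the row */
           (global_y > 0) &&                                    /* skip row 0 */
           ((global_y - pool_size + 1) % pool_stride == 0) &&   /* a window ends on this row */
           (local_z == lane_num - 1);                           /* last lane of the group */
}
```

With `pool_size = 3` and `pool_stride = 2`, the condition holds on rows 2, 4, 6, ..., which are the last rows of windows covering rows 0-2, 2-4, 4-6. With `pool_stride = 3` it would hold on rows 2, 5, 8, ..., the non-overlapping case the question seems to expect. Note that with signed integers `(0 - 3 + 1) % 2` is 0 in C, so the `global_y > 0` guard is what keeps row 0 from triggering.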

doonny commented 5 years ago

We have updated the architecture many times since the initial upload, so the design is no longer the same as the one reported in the research paper.