Closed Frankzd closed 3 years ago
I remember maxPool was between conv and memWr, and it was consuming data from the convolution's data channel. Based on my profiling, maxPool was stalling a lot on that channel. Maybe that's the reason for running the pooling after memWr. I'm not sure whether that really helps, since the stall may just propagate from the pooling to memWr. I still have to profile this new version of the code to make sure.
@saman-aghazadeh Thank you for your reply!
I also have another question. In the memWrite kernel, pool_on_signal is sent to tell maxPool to start pooling. Here is the code:
    if (pool_on == 1) {
        if ((global_x == out_dim1 - 1) && (global_y > 0) &&
            ((global_y - pool_size + 1) % 2 == 0) && (local_z == LANE_NUM - 1)) {
            write_channel_intel(pool_sync_ch, pool_on_signal);
        }
    }
My question is: why is pool_on_signal sent every two iterations in the y dimension?
I would expect it to be every three iterations if pool_size is 3.
Do you have any idea about this?
thx!
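To make the question concrete, here is a minimal host-side C sketch that evaluates only the row part of the condition quoted above and lists the global_y values at which the signal would be sent. It assumes the other terms (pool_on, global_x, local_z) are already satisfied, and it uses plain int arithmetic rather than the kernel's actual index types. The names pool_signal_row, signal_rows and out_dim2 are hypothetical and do not come from the repository.

```c
#include <stdbool.h>
#include <stddef.h>

/* Row part of the quoted condition; global_x, local_z and pool_on
   are assumed to already satisfy their tests (assumption). */
bool pool_signal_row(int global_y, int pool_size)
{
    return (global_y > 0) && ((global_y - pool_size + 1) % 2 == 0);
}

/* Collect every row in [0, out_dim2) for which the signal would be sent.
   out_dim2 is a hypothetical name for the number of output rows.
   Returns how many rows were stored in rows[] (at most cap). */
size_t signal_rows(int out_dim2, int pool_size, int *rows, size_t cap)
{
    size_t n = 0;
    for (int y = 0; y < out_dim2; ++y) {
        if (pool_signal_row(y, pool_size) && n < cap) {
            rows[n++] = y;
        }
    }
    return n;
}
```

With pool_size = 3 and nine rows, the condition holds at global_y = 2, 4, 6 and 8. One possible reading, which is an assumption and not confirmed anywhere in this thread, is that the hard-coded 2 is a pooling stride rather than the window size: a 3-row window moved with stride 2 is complete at exactly those rows (rows 0..2, then 2..4, and so on).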
We have updated the architecture many times since the initial upload, so the design is no longer the same as the one reported in the research paper.
Hi @doonny, first of all, I have to say what great work you have done. I have learned a lot from your paper as well as from the code that you open-sourced.
However, as I went through the source code, I found that the maxPool kernel actually reads its data from global memory instead of from the channel.
Could you please tell me if there is anything I have misunderstood?
thx!