Closed chhzh123 closed 3 years ago
.reuse_at cannot capture this kind of pattern and is not able to move B_pipe_1 outside.
This line of code specifies which tensor is to be reused. HeteroCL recognizes it but does not generate the corresponding C code. (The C code is identical before and after adding this instruction.)
s.reuse_at(kernel.B,s[kernel.C],kernel.B.axis[0],"LB")
Actually, this problem can be found when using hls::stream to do C simulation. After changing the code as below,
void test(ap_uint<8> A[10], ap_uint<8> C[10]) {
  hls::stream<ap_uint<8> > B_pipe_1;
  #pragma HLS dataflow
  #pragma HLS stream variable=B_pipe_1 depth=1
  B_i: for (bit32 i = 0; i < 10; ++i) {
    ap_uint<8> B_temp;
    B_temp = ((ap_uint<8>)(((ubit32)A[i]) + 1U));
    B_pipe_1.write(B_temp);
  }
  ap_uint<8> LB;
  C_i1: for (bit32 i1 = 0; i1 < 10; ++i1) {
    bit32 sum;
    sum = 0;
    C_ra0: for (bit32 ra0 = 0; ra0 < 8; ++ra0) {
      ap_uint<8> B_temp1;
      B_temp1 = B_pipe_1.read();
      sum = ((bit32)(((ap_int<34>)B_temp1[ra0]) + ((ap_int<34>)sum)));
    }
    C[i1] = ((ap_uint<8>)sum);
  }
}
Vivado HLS will give the warning.
WARNING: Hls::stream 'hls::stream<ap_uint<8>, 0>.1' is read while empty, which may result in RTL simulation hanging.
@seanlatias any comment on the reuse_at problem?
I guess the reason is that the current reuse_at algorithm assumes the sliding window moves. But in this case, the sliding window is stationary (i.e., it does not slide).
Or, to be more specific, I only handle the case where slide=1. This is the case where slide=0.
Can we first emit a warning when we fail to do data reuse?
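A minimal sketch of such a warning, using Python's standard warnings module. Everything here is hypothetical and not part of HeteroCL: the names request_reuse, ReuseNotAppliedWarning, and SUPPORTED_SLIDES are made up for illustration, and the only fact taken from the thread is that slide=1 is handled while slide=0 is not.

```python
import warnings


class ReuseNotAppliedWarning(UserWarning):
    """Hypothetical category: a reuse request that produced no reuse buffer."""


# Assumption taken from the thread: only slide=1 is currently handled.
SUPPORTED_SLIDES = (1,)


def request_reuse(buffer_name, slide):
    """Return True if a reuse buffer would be generated, else warn and return False."""
    if slide in SUPPORTED_SLIDES:
        return True
    warnings.warn(
        f"reuse_at on {buffer_name!r}: slide={slide} is not supported, "
        "no reuse buffer was generated",
        ReuseNotAppliedWarning,
        stacklevel=2,  # attribute the warning to the caller's line
    )
    return False
```

With stacklevel=2 the warning is reported at the line that requested the reuse, so the user sees which schedule call had no effect instead of silently getting unchanged C code.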
Vivado HLS will give the warning.
WARNING: Hls::stream 'hls::stream<ap_uint<8>, 0>.1' is read while empty, which may result in RTL simulation hanging.
However, Vivado HLS does not give the exact line that causes the warning, which makes debugging a large design intractable. Thus, it's necessary for HeteroCL to count the number of FIFO reads and writes, and report an error when they are not consistent.
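As a rough illustration of the counting idea, here is a small Python model of a FIFO that tracks its reads and writes and reports a mismatch by name. CountingFifo and FifoMismatchError are hypothetical names, not HeteroCL or Vivado HLS APIs, and the model only mimics the access counts of the example (10 writes, 80 reads).

```python
from collections import deque


class FifoMismatchError(RuntimeError):
    """Raised when a FIFO's read count differs from its write count."""


class CountingFifo:
    """Toy FIFO model that counts every write and every read attempt."""

    def __init__(self, name):
        self.name = name
        self.writes = 0
        self.reads = 0
        self._items = deque()

    def write(self, value):
        self.writes += 1
        self._items.append(value)

    def read(self):
        # Count the attempt even if the FIFO is empty, so the mismatch is visible.
        self.reads += 1
        return self._items.popleft() if self._items else None

    def check_balanced(self):
        if self.reads != self.writes:
            raise FifoMismatchError(
                f"FIFO {self.name!r}: {self.writes} writes but {self.reads} reads"
            )


# Same access pattern as the generated code: 10 writes, 10 * 8 reads.
fifo = CountingFifo("B_pipe_1")
for i in range(10):
    fifo.write(i + 1)
for i1 in range(10):
    for ra0 in range(8):
        fifo.read()
```

Calling fifo.check_balanced() after this run raises FifoMismatchError and names B_pipe_1, which is the piece of information the Vivado HLS warning leaves out.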
Before we use static analysis, we should first add some support for runtime validation in HeteroCL. I believe we can leverage many Python-specific features to instrument the code and check the legality of the stream/FIFO accesses: they must be continuous and non-repetitive, and the consumption rate must match the production rate.
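One possible shape for this kind of runtime instrumentation, sketched in plain Python. It is not HeteroCL code: TracedBuffer, StreamAccessError, and validate_stream_order are hypothetical names. The sketch overrides __getitem__ to record each index the consumer touches, uses the inspect module to remember the caller's line number (this relies on CPython frame support), and then checks that the recorded indices are 0, 1, 2, ... with no repeats and no gaps.

```python
import inspect


class StreamAccessError(RuntimeError):
    """Raised when a buffer is not accessed in a stream-legal order."""


class TracedBuffer:
    """Wraps a list and records every index read through [] with the caller's line."""

    def __init__(self, data, name):
        self._data = list(data)
        self.name = name
        self.trace = []  # list of (index, caller line number)

    def __getitem__(self, index):
        caller = inspect.currentframe().f_back
        self.trace.append((index, caller.f_lineno))
        return self._data[index]

    def __len__(self):
        return len(self._data)


def validate_stream_order(buf):
    """Check that reads are continuous, non-repetitive, and consume everything."""
    expected = 0
    for index, lineno in buf.trace:
        if index != expected:
            kind = "repeated" if index < expected else "skipped"
            raise StreamAccessError(
                f"{buf.name}: {kind} access to index {index} "
                f"(expected {expected}) at line {lineno}"
            )
        expected += 1
    if expected != len(buf):
        raise StreamAccessError(
            f"{buf.name}: consumed {expected} of {len(buf)} produced elements"
        )


# Consumer that reads B[i] once per outer iteration: legal for a FIFO.
B = TracedBuffer([1, 2, 3], "B")
sums = []
for i in range(len(B)):
    value = B[i]
    sums.append(sum((value >> ra) & 1 for ra in range(8)))
validate_stream_order(B)  # passes silently
```

If the consumer instead evaluated B[i] inside the inner loop over ra, the trace would start 0, 0, ... and validate_stream_order would raise a "repeated access" error that includes the offending line number.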
This is actually what I can do for my course project.
When doing data streaming, it's important to guarantee that the numbers of FIFO reads and writes are the same. We need an analysis for streaming buffers to ensure they work correctly.
Following is an example that illustrates inconsistent FIFO reads and writes, where C accumulates the bits in each element of B. Currently, HeteroCL only replaces the original buffers with streaming buffers without further code transformation, which causes reads from an empty FIFO in stage C. In this case, B_pipe_1 should be read outside the inner loop. Otherwise, it will be read 80 times, though it only has 10 elements.
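The read counts can be checked with a small Python model of the two stages. This is only a sketch: a collections.deque stands in for the hls::stream, the function names are made up for illustration, and an empty read returns 0 here instead of hanging as real hardware might.

```python
from collections import deque


def produce(A):
    """Stage B: write one element per input, like the B_i loop."""
    fifo = deque()
    for a in A:
        fifo.append((a + 1) & 0xFF)
    return fifo


def consume_read_inside(fifo, n=10, bits=8):
    """Stage C as currently generated: one FIFO read per inner iteration."""
    reads = 0
    empty_reads = 0
    out = []
    for _ in range(n):
        total = 0
        for ra in range(bits):
            reads += 1
            if fifo:
                value = fifo.popleft()
            else:
                empty_reads += 1  # the situation Vivado HLS warns about
                value = 0
            total += (value >> ra) & 1
        out.append(total & 0xFF)
    return out, reads, empty_reads


def consume_read_hoisted(fifo, n=10, bits=8):
    """Stage C with the read moved outside the inner loop."""
    reads = 0
    out = []
    for _ in range(n):
        value = fifo.popleft()  # exactly one read per produced element
        reads += 1
        total = sum((value >> ra) & 1 for ra in range(bits))
        out.append(total & 0xFF)
    return out, reads


A = list(range(10))
_, reads_inside, empty_reads = consume_read_inside(produce(A))
bit_counts, reads_hoisted = consume_read_hoisted(produce(A))
```

In this model the inner-loop version attempts 80 reads against 10 writes, while the hoisted version reads exactly 10 times and computes the bit count of each element of B as intended.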