Closed: ha-zhuzhu closed this issue 8 months ago.
Hi @ha-zhuzhu, Thank you for your interest in BARVINN!
First I have to mention that the documentation is a bit out of date and we need to update it.
Regarding the CSRs, I actually recommend that you use this file as a reference instead. We use this JSON file to generate the configuration for the SystemVerilog and C header files. For the conv2d example, you can leave shacc_load_sel and zigzag_step_sel as they are.
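For context, here is a rough sketch of what such JSON-driven generation can look like; the schema and the address below are placeholders I invented, not the actual layout of the BARVINN JSON file:

```python
# Rough sketch of JSON-driven CSR generation (the schema and the address value
# here are placeholders, not the real layout of the BARVINN file).
import json

csr_json = """
[
    {"name": "CSR_MVUCONFIG1", "address": "0x000",
     "fields": ["shacc_load_sel", "zigzag_step_sel"]}
]
"""

def emit_c_header(csrs):
    # one #define per CSR; a real generator would also emit field masks/offsets
    out = ["// auto-generated CSR addresses (sketch)"]
    for csr in csrs:
        out.append(f"#define {csr['name']} {csr['address']}")
    return "\n".join(out)

print(emit_c_header(json.loads(csr_json)))
```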
For multi-layer quantized models we need communication with the host, and we are currently building a memory interface for it. However, if your model is small enough that you can store it entirely in the available memory space, you can run your model by running different kernels back to back. This is not convenient and not realistic, but our main objective right now is to provide an interface for all MVUs to communicate with the host. Please let me know if you have more questions.
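As an illustration of the back-to-back idea (my own sketch, not actual BARVINN code), a layer loop could look like this, with each layer's output region reused as the next layer's input region:

```python
# Sketch of running kernels back to back for a small model that fits entirely
# in on-chip memory. run_layer() is a hypothetical stand-in for the per-layer
# CSR programming done in the C test code (e.g. conv2d.c); it is not a real
# BARVINN API, and the addresses are placeholders.
layers = [
    {"name": "conv1", "input_base": 0x0000, "output_base": 0x4000},
    {"name": "conv2", "input_base": 0x4000, "output_base": 0x8000},  # reads conv1's output
]

def run_layer(layer):
    # 1. point the MVU at this layer's weight/activation regions
    # 2. program the length/jump CSRs for this layer's shape
    # 3. start the kernel and wait for completion
    print(f"running {layer['name']}: in@{layer['input_base']:#06x} -> out@{layer['output_base']:#06x}")

for layer in layers:
    run_layer(layer)
```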
Thank you for your detailed explanation! I finally get the exact same results in quant_out as output.hex after some changes. I'm not sure whether I understand the source code correctly:
1. Following deps/MVU/c/conv2d_jumps.c:381, I set shacc_load_sel to b00001 and zigzag_step_sel to b00101 in order to sum up a single pixel's result when WLENGTH_1=0 (shacc_load_sel should be b00010 and zigzag_step_sel b00110 if WLENGTH_1!=0).
2. In __process_weigths, the actual ONNX weights shape is [output_channels, input_channels, height, width] instead of [input_channels, output_channels, width, height].
3. In get_mvu_param:

ijump[0] = -iprec*(iC*(fH-1)*iW + (fW-sW-1)*iC + 1)
# why not:
# ijump[0] = -iprec*(iC*(fH-1)*iW + (fW-sW)*iC - 1)

Because in the SimpleConv model iC=1 and sW=1, which makes ijump[0]=ijump[1], so the kernel will never move right by a stride (see the sketch after this list).
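To make the difference concrete, here is a quick check of the two expressions with assumed SimpleConv-like parameters (the specific values are illustrative guesses, not taken from the repository):

```python
# Quick numeric check of the two ijump[0] candidates above, with assumed
# SimpleConv-like parameters.
iprec, iC, iW = 2, 1, 32      # input precision (bits), input channels, input width
fH, fW, sW = 3, 3, 1          # filter height/width, horizontal stride

ijump0_current  = -iprec*(iC*(fH-1)*iW + (fW-sW-1)*iC + 1)  # as in get_mvu_param
ijump0_proposed = -iprec*(iC*(fH-1)*iW + (fW-sW)*iC - 1)    # the variant asked about

print(ijump0_current, ijump0_proposed)  # -132 -130: with iC=1, sW=1 the two differ by iprec
```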
Now I can get the right result, still wondering how to do padding...
And about the multi-layer quantized model: since each conv layer has a different scale in common quantization strategies, I'm curious how to rescale the 32-bit conv result of the MVP (with scale1) and quantize it (with scale2) to become the input of the next layer. Should I merge scale1 and scale2 into one so I can use Scaler and Quantser to process it? This seems like a feasible way.
@ha-zhuzhu Regarding the scalers, the idea is to merge/fuse the scalars together such that you only need the one scalar unit; as a result, we don't have a multiplier in the quantser module. This scheme is sufficient to implement LSQ and even batch-norm layers (with scalar fusing). We are working on another branch of the MVU code that will have a second scalar unit to add some flexibility.
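As a rough illustration of that fusing scheme (my own sketch, not the MVU's actual fixed-point pipeline), the per-layer rescale and the next layer's quantization scale collapse into one multiplier for the scaler, so the quantser never needs to multiply:

```python
# Sketch of scale fusing: fold the current layer's output scale (scale1), an
# optional batch-norm multiplier, and the next layer's input quantization
# scale (scale2) into one factor applied by the single scaler, leaving the
# quantser with only shifting/clipping. The rounding, clipping range and
# 2-bit output are assumptions for illustration.
def fuse_scales(scale1, scale2, bn_gamma=1.0):
    # one multiplier covers: dequantize (scale1) * batch norm (gamma) / requantize (scale2)
    return scale1 * bn_gamma / scale2

def requantize(acc32, fused_scale, out_prec=2):
    scaled = acc32 * fused_scale            # the single scaler multiplication
    qmax = (1 << out_prec) - 1              # quantser: clip to out_prec bits (assumed unsigned)
    return max(0, min(qmax, round(scaled)))

# Example: a 32-bit accumulator value requantized to 2 bits for the next layer.
print(requantize(acc32=37, fused_scale=fuse_scales(scale1=0.05, scale2=0.8)))  # -> 2
```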
Hi, thanks for your brilliant work! The design of the MVU and hart control is impressive. I'm trying to get the right simulation output.hex with fusesoc run --target=sim barvinn. But when I read the waves, I find that some signals related to the scaler, bias, or other configs left unset in the C code are at x state, resulting in the final all-zero quant_out. So I tried to set these configs in conv2d.c according to my understanding of the project.

Now I'm confused by CSR_MVUCONFIG1. The docs indicate that it covers the shift/accumulator load on jump select and the zig-zag step on jump select, each 8-bit. This does match the comment in BARVINN/deps/MVU/verification/lib/mvu/mvu_pkg.sv (branch 72b5413). But in mvutop_wrapper.sv, CSR_MVUCONFIG1 seems to control shacc_load_sel and zigzag_step_sel, each 5-bit.

These two configs seem to be important in the computation, but MVU_Code_gen doesn't export them. I also found them in some test files like MVU/c/conv2d_jumps.c and mvutop_tester.sv, but I still don't quite get the relationship between them and the other model parameters. If I just want to run the sample conv2d (1x64x32x32 input, 64x64x3x3 weights, 2-bit precision), how can I calculate shacc_load_sel and zigzag_step_sel?

I also have a question about quantized model computation. At the end of a layer's computation, the scaler module can rescale a quantized output, and the next layer's input should be quantized again. But it seems that the quantser module is not able to quantize inputs by a certain scale. So how does BARVINN deal with this process? I'd appreciate it if you could offer some examples of multi-layer quantized models!