Open kouzhentao opened 6 years ago
For an input image larger than the CBUF banks assigned to it, you have to split it manually (the splitting method is described in the NVDLA documentation) or let the compiler do the job.
@ddkkevin thanks for your reply. Do you mean the Split H part of the programming_guide (quoted below)? I also found a log file in your repo; can I use it as a reference? By the way, the split mentioned for the RUBIK module is not for this purpose, right? And for nv_small we have to do this manually, without the NVDLA compiler, right? Thanks.
https://github.com/ddkkevin/nvdla_log/blob/master/nn_mine.cfg
http://nvdla.org/hw/v1/ias/programming_guide.html?highlight=split
Split H
We can see only full mode is supported by convolution pipeline. If one network layer has large input which exceed the CONV_BUF capacity, software has to split the big input cube into smaller cubes in vertical direction. This mechanism called “Split H mode”.
Be noticed that there must be max(R-stride_y, 0) overlapped lines between 2 consecutive cube to make sure the convolution results are expected.
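The overlap rule quoted above can be illustrated with a small sketch. It is not taken from the NVDLA compiler or driver; the function name `split_h` and its parameters are hypothetical, and it assumes no vertical padding and that `max_lines` is the number of input lines that fit in the CBUF banks assigned to data.

```python
def split_h(in_height, kernel_h, stride_y, max_lines):
    """Return (start_line, num_lines) for each input sub-cube.

    Hypothetical helper: assumes no padding and that max_lines input
    lines fit in CBUF at once. Each sub-cube yields whole output rows.
    """
    if kernel_h > max_lines:
        raise ValueError("kernel height does not fit in max_lines")
    if in_height < kernel_h:
        raise ValueError("input is shorter than the kernel")

    # Total output rows of the unsplit convolution.
    out_rows = (in_height - kernel_h) // stride_y + 1
    # Output rows that one sub-cube of at most max_lines can produce.
    rows_per_cube = (max_lines - kernel_h) // stride_y + 1

    cubes = []
    for first_out in range(0, out_rows, rows_per_cube):
        n_out = min(rows_per_cube, out_rows - first_out)
        start = first_out * stride_y
        height = (n_out - 1) * stride_y + kernel_h
        cubes.append((start, height))
    return cubes


def overlaps(cubes):
    """Lines shared by each pair of consecutive sub-cubes."""
    return [max(a_start + a_h - b_start, 0)
            for (a_start, a_h), (b_start, _) in zip(cubes, cubes[1:])]
```

For example, `split_h(10, 3, 1, 6)` gives two sub-cubes of 6 lines starting at lines 0 and 4, which share 2 lines, matching max(R-stride_y, 0) = max(3-1, 0).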
Yes, Split H mode is used in this situation. Without the compiler's help, you have to do it manually. The split mentioned in RUBIK is unrelated to Split H mode; it refers to splitting a feature cube into multiple smaller cubes. nn_mine.cfg is just the trace extracted from the NN_L0_1_small_fbuf test. How will you use it?
@ddkkevin do you know whether the current prebuilt nv_large compiler supports the Split H function?
Hi,
If the input image/feature is too large for CBUF to hold at once, NVDLA needs to fetch image/feature/weight data into CBUF multiple times. How do I configure this? The following RTL code shows that CSC starts working only when the number of available input slices is at least the configured data_in_height. In this case, the input image height is too large to load into CBUF. Does this mean the data_in_height register should be configured with a smaller value that fits in CBUF? Or can NVDLA handle this automatically, meaning the register is configured with the real image size, the internal DMA fetches data until CBUF is full, and once CBUF space is released the DMA fetches the remaining data? Why is CBUF ready asserted only once all image data is loaded? CSC could start working as soon as one stripe's worth of data is ready.
assign dat_cbuf_ready = (slices_avl >= data_in_height[13:0]);
Thanks, Kou
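The readiness condition quoted above can be modelled in a few lines to show why data_in_height has to fit in CBUF. This is only a behavioural sketch of that single `assign` statement, not of the CSC RTL as a whole; the capacity parameter `cbuf_capacity_slices` is hypothetical, and the model assumes slices_avl can never exceed the number of slices CBUF can hold.

```python
def dat_cbuf_ready(slices_avl, data_in_height):
    # Mirrors: assign dat_cbuf_ready = (slices_avl >= data_in_height[13:0]);
    return slices_avl >= (data_in_height & 0x3FFF)


def ready_after_fill(data_in_height, cbuf_capacity_slices):
    """Fill CBUF slice by slice and report whether ready ever asserts.

    Assumption: slices_avl is bounded by cbuf_capacity_slices
    (a hypothetical parameter, not a real NVDLA register).
    """
    for slices_avl in range(cbuf_capacity_slices + 1):
        if dat_cbuf_ready(slices_avl, data_in_height):
            return True
    return False
```

Under that assumption, `ready_after_fill(64, 128)` is true, while `ready_after_fill(256, 128)` stays false, which is consistent with programming data_in_height with the height of each Split H sub-cube rather than the full image height.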