nvdla / hw

RTL, Cmodel, and testbench for NVDLA

How to implement depthwise separable convolution on NVDLA? #251

Closed zhouweiscut closed 5 years ago

zhouweiscut commented 5 years ago

Hi all, does anyone have ideas on how to implement depthwise separable convolution on NVDLA? For example, a depthwise separable convolution is split into two steps: (1) depthwise conv: weight 3x3/1, input data 128x128x64; (2) pointwise conv: weight 1x1x16/1.

When the depthwise conv is mapped onto the NVDLA MAC array, utilization is very low. Does anyone know a way to solve this?
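To put a rough number on the utilization concern, here is a minimal sketch. It assumes a simplified model that is not taken from the NVDLA sources: the MAC array is treated as `atomic_c` input-channel lanes times `atomic_k` kernel lanes, and a layer keeps a lane busy only when it has a channel or kernel to feed it. The function name `mac_utilization` and the array sizes 8x8 and 64x32 are illustrative assumptions, and the channel counts (64 in, 16 out) are my reading of the shapes in the question.

```python
import math

def mac_utilization(in_ch, out_ch, atomic_c, atomic_k):
    """Fraction of an atomic_c x atomic_k MAC array doing useful work.

    Simplified model (an assumption, not NVDLA's documented behaviour):
    channels and kernels are processed in chunks of atomic_c and atomic_k,
    and lanes in a partially filled chunk sit idle.
    """
    padded_c = math.ceil(in_ch / atomic_c) * atomic_c
    padded_k = math.ceil(out_ch / atomic_k) * atomic_k
    return (in_ch * out_ch) / (padded_c * padded_k)

# Depthwise conv emulated as groups: each group has 1 input and 1 output channel.
depthwise_small = mac_utilization(1, 1, atomic_c=8, atomic_k=8)
# Pointwise 1x1 conv, assumed 64 input channels and 16 output channels.
pointwise_small = mac_utilization(64, 16, atomic_c=8, atomic_k=8)
pointwise_large = mac_utilization(64, 16, atomic_c=64, atomic_k=32)

print(depthwise_small, pointwise_small, pointwise_large)
```

Under this model a per-channel group occupies one lane out of `atomic_c * atomic_k`, while the pointwise step can fill the smaller array completely. Real utilization also depends on memory bandwidth and scheduling, which this sketch ignores.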

ghost commented 5 years ago

@nookfoo @shgangchen @mmaciag @zhzhouweiscut On a ZCU102 board, testing nv_small:

= Run NN/NN_L0_1_small_fbuf
creating new runtime context...
Emulator starting
submitting tasks...
[ 2210.896668] Enter:dla_read_network_config
[ 2210.903672] Exit:dla_read_network_config status=0
[ 2210.909643] Enter: dla_initiate_processors
[ 2210.915004] Enter: dla_submit_operation
[ 2210.920099] Prepare Convolution operation index 0 ROI 0 dep_count 1
[ 2210.927667] Enter: dla_prepare_operation
[ 2210.932896] processor:Convolution group:1, rdma_group:0 available
[ 2210.940346] Enter: dla_read_config
[ 2210.945125] Exit: dla_read_config
[ 2210.949831] Exit: dla_prepare_operation status=0
[ 2210.955870] Enter: dla_program_operation
[ 2210.961236] Program Convolution operation index 0 ROI 0 Group[1]
[ 2210.968764] no desc get due to index==-1
[ 2210.974179] no desc get due to index==-1
[ 2210.979577] no desc get due to index==-1
[ 2210.984954] no desc get due to index==-1
[ 2210.990313] no desc get due to index==-1
[ 2210.995644] Enter: dla_op_programmed
[ 2211.000630] Update dependency operation index 2 ROI 0 DEP_COUNT=2
[ 2211.008178] Update dependency operation index 1 ROI 0 DEP_COUNT=1
[ 2211.015723] enable SDP in dla_update_dependency as depdency are resolved
[ 2211.023905] Enter: dla_enable_operation
[ 2211.029220] exit dla_enable_operation without actual enable due to processor hasn't been programmed
[ 2211.039839] Exit: dla_enable_operation status=0
[ 2211.045950] Exit: dla_op_programmed
[ 2211.050997] Exit: dla_program_operation status=0
[ 2211.057176] Exit: dla_submit_operation
[ 2211.062484] Enter: dla_dequeue_operation
[ 2211.067953] Dequeue op from Convolution processor, index=2 ROI=0
[ 2211.075536] Enter: dla_submit_operation
[ 2211.080938] Prepare Convolution operation index 2 ROI 0 dep_count 1
[ 2211.088792] Enter: dla_prepare_operation
[ 2211.094292] processor:Convolution group:0, rdma_group:0 available
[ 2211.101991] Enter: dla_read_config
[ 2211.106995] Exit: dla_read_config
[ 2211.111900] Exit: dla_prepare_operation status=0
[ 2211.118105] Enter: dla_program_operation
[ 2211.123600] Program Convolution operation index 2 ROI 0 Group[0]
[ 2211.131226] no desc get due to index==-1
[ 2211.136737] no desc get due to index==-1
[ 2211.142218] no desc get due to index==-1
[ 2211.147668] no desc get due to index==-1
[ 2211.153083] no desc get due to index==-1
[ 2211.158471] Enter: dla_op_programmed
[ 2211.163499] Update dependency operation index 6 ROI 0 DEP_COUNT=3
[ 2211.171072] Update dependency operation index 3 ROI 0 DEP_COUNT=2
[ 2211.178616] Exit: dla_op_programmed
[ 2211.183552] Exit: dla_program_operation status=0
[ 2211.189625] Exit: dla_submit_operation
[ 2211.194807] Exit: dla_dequeue_operation
[ 2211.200064] Enter: dla_submit_operation
[ 2211.205327] Prepare SDP operation index 1 ROI 0 dep_count 0
[ 2211.212329] Enter: dla_prepare_operation
[ 2211.217670] processor:SDP group:0, rdma_group:1 available
[ 2211.224484] Enter: dla_read_config
[ 2211.229290] Exit: dla_read_config
[ 2211.233999] Exit: dla_prepare_operation status=0
[ 2211.240023] Enter: dla_program_operation
[ 2211.245346] Program SDP operation index 1 ROI 0 Group[0]
[ 2211.252076] no desc get due to index==-1
[ 2211.257384] no desc get due to index==-1
[ 2211.262694] no desc get due to index==-1
[ 2211.268009] no desc get due to index==-1
[ 2211.273323] Enter: dla_op_programmed
[ 2211.278305] Update dependency operation index 3 ROI 0 DEP_COUNT=1
[ 2211.285843] enable SDP in dla_update_dependency as depdency are resolved
[ 2211.293989] Enter: dla_enable_operation
[ 2211.299287] exit dla_enable_operation without actual enable due to processor hasn't been programmed
[ 2211.309866] Exit: dla_enable_operation status=0
[ 2211.315901] Exit: dla_op_programmed
[ 2211.320850] Exit: dla_program_operation status=0
[ 2211.326893] Enter: dla_enable_operation
[ 2211.332148] Enable SDP operation index 1 ROI 0
[ 2211.338022] Enter: dla_op_enabled
[ 2211.342757] Update dependency operation index 0 ROI 0 DEP_COUNT=1
[ 2211.350310] enable Convolution in dla_update_dependency as depdency are resolved
[ 2211.359193] Enter: dla_enable_operation
[ 2211.364502] Enable Convolution operation index 0 ROI 0
[ 2211.371107] Enter: dla_op_enabled
[ 2211.375865] Exit: dla_op_enabled
[ 2211.380508] Exit: dla_enable_operation status=0
[ 2211.386447] Exit: dla_op_enabled
[ 2211.391083] Exit: dla_enable_operation status=0
[ 2211.397023] Exit: dla_submit_operation
[ 2211.402167] Enter: dla_dequeue_operation
[ 2211.407468] Dequeue op from SDP processor, index=3 ROI=0
[ 2211.414162] Enter: dla_submit_operation
[ 2211.419400] Prepare SDP operation index 3 ROI 0 dep_count 0
[ 2211.426425] Enter: dla_prepare_operation
[ 2211.431824] processor:SDP group:1, rdma_group:0 available
[ 2211.438743] Enter: dla_read_config
[ 2211.443667] Exit: dla_read_config
[ 2211.448470] Exit: dla_prepare_operation status=0
[ 2211.454603] Enter: dla_program_operation
[ 2211.460027] Program SDP operation index 3 ROI 0 Group[1]
[ 2211.466866] no desc get due to index==-1
[ 2211.472307] no desc get due to index==-1
[ 2211.477739] no desc get due to index==-1
[ 2211.483155] no desc get due to index==-1
[ 2211.488544] Enter: dla_op_programmed
[ 2211.493573] Update dependency operation index 7 ROI 0 DEP_COUNT=2
[ 2211.501145] Exit: dla_op_programmed
[ 2211.506089] Exit: dla_program_operation status=0
[ 2211.512179] Enter: dla_enable_operation
[ 2211.517476] Enable SDP operation index 3 ROI 0
[ 2211.523382] Enter: dla_op_enabled
[ 2211.528145] Update dependency operation index 2 ROI 0 DEP_COUNT=1
[ 2211.535711] enable Convolution in dla_update_dependency as depdency are resolved
[ 2211.544641] Enter: dla_enable_operation
[ 2211.550034] Enable Convolution operation index 2 ROI 0
[ 2211.556769] Enter: dla_op_enabled
[ 2211.561671] Exit: dla_op_enabled
[ 2211.566466] Exit: dla_enable_operation status=0
[ 2211.572550] Exit: dla_op_enabled
[ 2211.577296] Exit: dla_enable_operation status=0
[ 2211.583307] Exit: dla_submit_operation
[ 2211.588484] Exit: dla_dequeue_operation
[ 2211.593743] Enter: dla_submit_operation
[ 2211.598996] Prepare PDP operation index 5 ROI 0 dep_count 1
[ 2211.606013] Enter: dla_prepare_operation
[ 2211.611383] processor:PDP group:1, rdma_group:1 available
[ 2211.618250] Enter: dla_read_config
[ 2211.623101] Exit: dla_read_config
[ 2211.627832] Exit: dla_prepare_operation status=0
[ 2211.633842] Enter: dla_program_operation
[ 2211.639146] Program PDP operation index 5 ROI 0 Group[1]
[ 2211.645852] group id 1 rdma id 1
[ 2211.650493] no desc get due to index==-1
[ 2211.655813] no desc get due to index==-1
[ 2211.661119] no desc get due to index==-1
[ 2211.666410] no desc get due to index==-1
[ 2211.671689] no desc get due to index==-1
[ 2211.676943] Enter: dla_op_programmed
[ 2211.681837] Update dependency operation index 11 ROI 0 DEP_COUNT=2
[ 2211.689363] Exit: dla_op_programmed
[ 2211.694182] Exit: dla_program_operation status=0
[ 2211.700123] Exit: dla_submit_operation
[ 2211.705173] Enter: dla_dequeue_operation
[ 2211.710387] Dequeue op from PDP processor, index=11 ROI=0
[ 2211.717109] Enter: dla_submit_operation
[ 2211.722266] Prepare PDP operation index 11 ROI 0 dep_count 1
[ 2211.729268] Enter: dla_prepare_operation
[ 2211.734523] processor:PDP group:0, rdma_group:0 available
[ 2211.741271] Enter: dla_read_config
[ 2211.746045] Exit: dla_read_config
[ 2211.750733] Exit: dla_prepare_operation status=0
[ 2211.756763] Enter: dla_program_operation
[ 2211.762096] Program PDP operation index 11 ROI 0 Group[0]
[ 2211.768910] group id 0 rdma id 0
[ 2211.773538] no desc get due to index==-1
[ 2211.778850] no desc get due to index==-1
[ 2211.784149] no desc get due to index==-1
[ 2211.789427] no desc get due to index==-1
[ 2211.794682] no desc get due to index==-1
[ 2211.799906] Enter: dla_op_programmed
[ 2211.804748] Update dependency operation index 22 ROI 0 DEP_COUNT=2
[ 2211.812207] Exit: dla_op_programmed
[ 2211.816961] Exit: dla_program_operation status=0
[ 2211.822846] Exit: dla_submit_operation
[ 2211.827860] Exit: dla_dequeue_operation
[ 2211.832958] Enter: dla_submit_operation
[ 2211.838041] Prepare CDP operation index 4 ROI 0 dep_count 2
[ 2211.844894] Enter: dla_prepare_operation
[ 2211.850106] processor:CDP group:1, rdma_group:1 available
[ 2211.856828] Enter: dla_read_config
[ 2211.861562] Exit: dla_read_config
[ 2211.866189] Exit: dla_prepare_operation status=0
[ 2211.872116] Enter: dla_program_operation
[ 2211.877340] Program CDP operation index 4 ROI 0 Group[1]
[ 2211.883957] Enter: dla_cdp_program
[ 2211.883960] Enter: processor_cdp_program
[ 2211.893984] Exit: processor_cdp_program
[ 2211.893985] Exit: dla_cdp_program
[ 2211.899117] no desc get due to index==-1
[ 2211.908890] no desc get due to index==-1
[ 2211.914053] no desc get due to index==-1
[ 2211.919231] no desc get due to index==-1
[ 2211.924404] no desc get due to index==-1
[ 2211.929580] Enter: dla_op_programmed
[ 2211.934423] Update dependency operation index 10 ROI 0 DEP_COUNT=3
[ 2211.941907] Exit: dla_op_programmed
[ 2211.946681] Exit: dla_program_operation status=0
[ 2211.952573] Exit: dla_submit_operation
[ 2211.957589] Enter: dla_dequeue_operation
[ 2211.962782] Dequeue op from CDP processor, index=10 ROI=0
[ 2211.969474] Enter: dla_submit_operation
[ 2211.974592] Prepare CDP operation index 10 ROI 0 dep_count 2
[ 2211.981545] Enter: dla_prepare_operation
[ 2211.986748] processor:CDP group:0, rdma_group:0 available
[ 2211.993436] Enter: dla_read_config
[ 2211.998119] Exit: dla_read_config
[ 2212.002687] Exit: dla_prepare_operation status=0
[ 2212.008579] Exit: dla_submit_operation
[ 2212.013591] Exit: dla_dequeue_operation
[ 2212.018685] Exit: dla_initiate_processors status=0
[ 2212.024728] Enter:dla_handle_events, processor:BDMA
[ 2212.030851] Exit:dla_handle_events, ret:0
[ 2212.036104] Enter:dla_handle_events, processor:Convolution
[ 2212.042876] Exit:dla_handle_events, ret:0
[ 2212.048188] Enter:dla_handle_events, processor:SDP
[ 2212.054317] Exit:dla_handle_events, ret:0
[ 2212.059688] Enter:dla_handle_events, processor:PDP
[ 2212.065855] Exit:dla_handle_events, ret:0
[ 2212.071217] Enter:dla_handle_events, processor:CDP
[ 2212.077357] Exit:dla_handle_events, ret:0
[ 2212.082717] Enter:dla_handle_events, processor:RUBIK
[ 2212.089054] Exit:dla_handle_events, ret:0
[ 2212.094454] Enter:dla_handle_events, processor:BDMA
[ 2212.100733] Exit:dla_handle_events, ret:0
[ 2212.106130] Enter:dla_handle_events, processor:Convolution
[ 2212.113015] Exit:dla_handle_events, ret:0
[ 2212.118409] Enter:dla_handle_events, processor:SDP
[ 2212.124584] Exit:dla_handle_events, ret:0
[ 2212.129985] Enter:dla_handle_events, processor:PDP
[ 2212.136181] Exit:dla_handle_events, ret:0
[ 2212.141594] Enter:dla_handle_events, processor:CDP
[ 2212.147788] Exit:dla_handle_events, ret:0
[ 2212.153183] Enter:dla_handle_events, processor:RUBIK
[ 2212.159561] Exit:dla_handle_events, ret:0

Execution hangs here and does not continue.

What could be causing this? Has anyone else run into the same problem, and how can it be fixed?

embedeepLHY commented 5 years ago

If you are looking for a TPU design with DW CONV, you can try FREE-TPU here. (https://github.com/embedeep/Free-TPU)

zhouweiscut commented 5 years ago

Actually, I want a hardware design solution for depthwise conv, but there is no hardware reference for Free-TPU. Could you provide one?

embedeepLHY commented 5 years ago

As far as I know, there is NO direct way to implement depthwise conv in NVDLA. Only "group" emulation, as in the original Caffe, is available. We DO NOT plan to release the source code of FREE-TPU for now. If we do, you will be the first to know!
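For readers unfamiliar with the "group" emulation mentioned above, here is a minimal pure-Python sketch of the idea. It is not NVDLA or Caffe code. All function names are hypothetical, and the convolution is a plain stride-1, no-padding cross-correlation. It shows a depthwise conv computed two ways: as one single-channel convolution per group (group count equal to channel count), and as one dense convolution whose weights are zero everywhere except where the kernel index matches the channel index.

```python
def conv2d(x, w):
    """Dense conv, stride 1, no padding. x: [C][H][W], w: [K][C][R][S]."""
    C, H, W = len(x), len(x[0]), len(x[0][0])
    K, R, S = len(w), len(w[0][0]), len(w[0][0][0])
    out = [[[0] * (W - S + 1) for _ in range(H - R + 1)] for _ in range(K)]
    for k in range(K):
        for c in range(C):
            for i in range(H - R + 1):
                for j in range(W - S + 1):
                    out[k][i][j] += sum(
                        x[c][i + r][j + s] * w[k][c][r][s]
                        for r in range(R) for s in range(S))
    return out

def depthwise_as_groups(x, dw):
    """Depthwise conv as C independent 1-in/1-out convolutions. dw: [C][R][S]."""
    out = []
    for c in range(len(x)):
        out.extend(conv2d([x[c]], [[dw[c]]]))
    return out

def depthwise_as_dense(x, dw):
    """Same result from one dense conv with zero weights wherever k != c."""
    C, R, S = len(dw), len(dw[0]), len(dw[0][0])
    zero = [[0] * S for _ in range(R)]
    w = [[dw[c] if c == k else zero for c in range(C)] for k in range(C)]
    return conv2d(x, w)

# Two 3x3 input channels, one 2x2 kernel per channel.
x = [[[1, 2, 3], [4, 5, 6], [7, 8, 9]],
     [[1, 0, 1], [0, 1, 0], [1, 0, 1]]]
dw = [[[1, 0], [0, 1]],
      [[1, 1], [1, 1]]]

grouped = depthwise_as_groups(x, dw)
dense = depthwise_as_dense(x, dw)
print(grouped)  # [[[6, 8], [12, 14]], [[2, 2], [2, 2]]]
```

Both forms give the same output, but neither suits a MAC array built for dense convolution: the grouped form runs many tiny convolutions, and the dense form spends C*C kernel slots on only C non-zero kernels.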

zhouweiscut commented 5 years ago

OK, thank you for your hint.