ceccocats / tkDNN

Deep neural network library and toolkit to do high performance inference on NVIDIA Jetson platforms
GNU General Public License v2.0

Bug: incorrect output dimensions in yolo layer #60

Open mrhosseini opened 4 years ago

mrhosseini commented 4 years ago

In the yolov3_tiny network, the first yolo layer has dimensions 13 x 13 x 255 and the second has dimensions 26 x 26 x 255. However, if we use a network configuration in which, unlike yolov3_tiny, the first yolo layer is larger than the second one (like this), this line:

checkCuda( cudaMemcpy(predictions, dstData, output_dim.tot()*sizeof(dnnType), cudaMemcpyDeviceToHost));

will fail with the error:

Cuda failure: invalid argument

The problem is that the layer's output_dim differs from the dimensions of dstData. Debugging shows that the output_dim of the first yolo layer equals that of the second one and vice versa. (In other words, the dimensions of dstData belong to the other layer.)

I'm not sure whether other parameters of the layer have the same problem.
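To illustrate the mismatch described above, here is a minimal sketch of a size check that could run before the copy. It is not tkDNN code: `Dim` is a hypothetical stand-in for the library's dimension type (only `tot()` is modeled), and `copy_fits` is an illustrative helper name. The 13 x 13 x 255 and 26 x 26 x 255 shapes are the ones from the yolov3_tiny description.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical stand-in for the dimension type; only tot() is modeled.
struct Dim {
    int n, c, h, w;

    // Total number of elements described by this shape.
    std::size_t tot() const {
        assert(n >= 0 && c >= 0 && h >= 0 && w >= 0);
        return static_cast<std::size_t>(n) * c * h * w;
    }
};

// Illustrative helper (an assumption, not part of tkDNN): true when copying
// dim.tot() elements stays inside a buffer allocated for allocated_elems.
bool copy_fits(const Dim& dim, std::size_t allocated_elems) {
    return dim.tot() <= allocated_elems;
}
```

With a buffer allocated for the 13 x 13 x 255 layer (43095 elements), `copy_fits` returns false for a layer whose output_dim is 26 x 26 x 255 (172380 elements), which is the kind of swapped size the report describes.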

mrhosseini commented 4 years ago

After more debugging, I found that the order of the yolo layers in the pluginFactory of NetworkRT after deserialization (here) is different from the order in NetworkRT::buffersRT, which is based on engineRT->getBindingDimensions() (here).

mrhosseini commented 4 years ago

Changing this line to the following solves the problem, but I am not sure it works in all cases.

rt_out[i] = (dnnType *) netRT->buffersRT[netRT->pluginFactory->n_yolos - i] + netRT->buffersDIM[netRT->pluginFactory->n_yolos - i].tot() * bi;
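The index arithmetic in that line can be sketched in isolation. The helper names below are hypothetical, and the sketch assumes a binding list laid out as one input at index 0 followed by n_yolos outputs, with the yolo layers stored in exactly the reverse order of those outputs.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Binding index used for yolo layer i by the workaround: n_yolos - i.
// Assumes bindings are [input, output 0, ..., output n_yolos - 1].
std::size_t reversed_binding_index(std::size_t n_yolos, std::size_t i) {
    assert(i < n_yolos);
    return n_yolos - i;  // i = 0 maps to the last output binding
}

// All binding indices, in yolo-layer order.
std::vector<std::size_t> reversed_binding_indices(std::size_t n_yolos) {
    std::vector<std::size_t> indices;
    indices.reserve(n_yolos);
    for (std::size_t i = 0; i < n_yolos; ++i) {
        indices.push_back(reversed_binding_index(n_yolos, i));
    }
    return indices;
}
```

For two yolo layers this gives binding 2 for layer 0 and binding 1 for layer 1. The mapping is only right when the two orders are exact mirrors of each other, which is why the fix may not hold for every network.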

ceccocats commented 4 years ago

Hi, the order of buffersRt should always be: Input, Output 0, Output 1, etc.

If this really happens, I need to check it.

mrhosseini commented 4 years ago

The order of buffersRt should always be: Input, Output 0, Output 1, etc.

What happens is that the order of the yolo layers in NetworkRT::pluginFactory is different from the order in buffersRT. Example for 2 yolo layers:

Input
Output 0 [26 x 26 x 18]
Output 1 [13 x 13 x 18]

in buffersRT, but in pluginFactory:

Yolo 0 [13 x 13 x 18]
Yolo 1 [26 x 26 x 18]
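
One way to avoid depending on either order would be to pair each yolo layer with the output that has the same shape. The following is only a sketch under stated assumptions: `Shape` and `match_by_shape` are hypothetical names, not tkDNN or TensorRT API, and the returned indices count outputs only (the input binding is excluded).

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical shape record: height, width, channels.
struct Shape {
    int h, w, c;
    bool operator==(const Shape& o) const {
        return h == o.h && w == o.w && c == o.c;
    }
};

// For each yolo layer, return the index of the first unused output with an
// identical shape, or -1 if none matches.
std::vector<int> match_by_shape(const std::vector<Shape>& yolo_shapes,
                                const std::vector<Shape>& output_shapes) {
    // Each yolo layer needs its own output.
    assert(yolo_shapes.size() <= output_shapes.size());
    std::vector<int> mapping(yolo_shapes.size(), -1);
    std::vector<bool> used(output_shapes.size(), false);
    for (std::size_t y = 0; y < yolo_shapes.size(); ++y) {
        for (std::size_t o = 0; o < output_shapes.size(); ++o) {
            if (!used[o] && yolo_shapes[y] == output_shapes[o]) {
                mapping[y] = static_cast<int>(o);
                used[o] = true;
                break;
            }
        }
    }
    return mapping;
}
```

With the shapes listed above, Yolo 0 [13 x 13 x 18] maps to Output 1 and Yolo 1 [26 x 26 x 18] maps to Output 0. Shape matching is ambiguous when two outputs share the same dimensions, so it is not a general replacement for a consistent ordering.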