chenzhi1992 / TensorRT-SSD

Use TensorRT API to implement Caffe-SSD, SSD(channel pruning), Mobilenet-SSD

about Forward_DetectionOutputLayer function #26

Open angiend opened 6 years ago

angiend commented 6 years ago

Hi, first of all, thanks for your code samples, but I don't understand how to use "Forward_DetectionOutputLayer". I saw your comment "I removed the 'detection_out' layer, and calculated the final output through the mbox_loc, mbox_prior and mbox_conf layer output", but the "mbox_loc" plugin layer is created with "createConcatPlugin", so how do I get that layer's outputs? I think the detection output should be rewritten as a plugin, e.g. "class DetectionOut : public IPlugin", so that in enqueue(int batchSize, const void* const* inputs, void** outputs, void* workspace, cudaStream_t stream) the mbox_loc tensor arrives as one of the inputs. What do you think? @chenzhi1992
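
For context, the custom-plugin route described above would look roughly like the skeleton below. This is a hedged sketch against the TensorRT 3/4 IPlugin interface; the assumed input order {mbox_loc, mbox_conf, mbox_priorbox}, the keepTopK value and the omitted decode/NMS kernels are all placeholders, not the repo's actual implementation.

```cpp
#include "NvInfer.h"
#include <cuda_runtime_api.h>

// Sketch of a custom DetectionOut plugin whose enqueue() receives the already
// computed mbox_loc / mbox_conf / mbox_priorbox tensors as its inputs.
// Only the skeleton is shown; the decode + NMS kernels are omitted.
class DetectionOut : public nvinfer1::IPlugin
{
public:
    int getNbOutputs() const override { return 1; }

    nvinfer1::Dims getOutputDimensions(int, const nvinfer1::Dims*, int) override
    {
        return nvinfer1::DimsCHW(1, mKeepTopK, 7); // keepTopK detections x 7 values
    }

    void configure(const nvinfer1::Dims* inputDims, int nbInputs,
                   const nvinfer1::Dims*, int, int) override
    {
        // inputDims[0..2] are mbox_loc, mbox_conf, mbox_priorbox (assumed order)
    }

    int enqueue(int batchSize, const void* const* inputs, void** outputs,
                void* /*workspace*/, cudaStream_t stream) override
    {
        const float* loc   = static_cast<const float*>(inputs[0]);
        const float* conf  = static_cast<const float*>(inputs[1]);
        const float* prior = static_cast<const float*>(inputs[2]);
        // ... launch decode + NMS kernels on `stream`, write results to outputs[0] ...
        return 0;
    }

    // initialize(), terminate(), getWorkspaceSize(), getSerializationSize()
    // and serialize() omitted for brevity.

private:
    int mKeepTopK{200}; // placeholder
};
```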

chenzhi1992 commented 6 years ago

Hi, you can use the TensorRT 3.0 API (createSSDDetectionOutputPlugin) directly; you can find it in NvInferPlugin.h.
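
A minimal sketch of that route, assuming the legacy plugin API from NvInferPlugin.h (TensorRT 3/4) and placeholder values copied from a typical 21-class VOC deploy.prototxt; both the field set and the numbers need to be checked against your own model and TensorRT version:

```cpp
#include "NvInferPlugin.h"

// Hedged sketch: create the built-in SSD DetectionOutput plugin instead of a
// hand-written detection_out layer. Return this plugin from your
// PluginFactory::createPlugin() when the parser reaches the "detection_out"
// layer (the layer name is an assumption taken from the SSD prototxt).
nvinfer1::plugin::INvPlugin* makeDetectionOutputPlugin()
{
    nvinfer1::plugin::DetectionOutputParameters params{};
    params.shareLocation           = true;
    params.varianceEncodedInTarget = false;
    params.backgroundLabelId       = 0;
    params.numClasses              = 21;     // 20 VOC classes + background
    params.topK                    = 400;
    params.keepTopK                = 200;
    params.confidenceThreshold     = 0.01f;
    params.nmsThreshold            = 0.45f;
    params.codeType                = nvinfer1::plugin::CodeTypeSSD::CENTER_SIZE;
    return nvinfer1::plugin::createSSDDetectionOutputPlugin(params);
}
```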

chandrakantkhandelwal commented 6 years ago

@chenzhi1992 I tried the code and was able to run it on an x86_64 machine, both with a model trained on the VOC dataset and with a model trained on a custom dataset (fewer than 21 classes). However, on a TX1 I can run the pretrained SSD-VOC model but not the model trained on the custom dataset (classes different from the VOC classes); the output is all zeros. It would be great if you could provide some insight. Thanks!

paghdv commented 6 years ago

Hi @chenzhi1992, have you tried using it with a number of classes different from 21? I'm working with one class plus background (2 in total), but all I get is:

virtual void nvinfer1::plugin::DetectionOutput::configure(const nvinfer1::Dims*, int, const nvinfer1::Dims*, int, int): Assertion 'numPriors*numLocClasses*4 == inputDims[param.inputOrder[0]].d[0]' failed.

when calling ICudaEngine* engine = builder->buildCudaEngine(*network);. Do you have any idea why this might be happening?

chenzhi1992 commented 6 years ago

The dimension of mbox_priorbox is wrong; please check it again. @paghdv

paghdv commented 6 years ago

I have exactly the same dimensions as in caffe: (1,2,122656,1) (NCHW). Is it supposed to be something else? @chenzhi1992

myih commented 6 years ago

@paghdv @chenzhi1992 Did you solve this problem? I have the same error: Assertion 'numPriors*numLocClasses*4 == inputDims[param.inputOrder[0]].d[0]' failed. I've already run TensorRT-SSD successfully and am now trying to run Mobilenet-SSD with TensorRT 3. Thank you.

paghdv commented 6 years ago

@myih I ended up just implementing detection_out myself. Mobilenet-SSD runs, but you have to implement the depthwise convolution layer yourself to see any speedup; otherwise it falls back to TensorRT's default (loop-through-groups) grouped convolution, which is very slow.
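
To illustrate what such a layer boils down to, here is a naive sketch of a depthwise 3x3 kernel one might call from a plugin's enqueue(). It assumes batch size 1, float32, NCHW layout, and one thread per output element; it is meant only to show the idea, not the tuned kernel actually used.

```cpp
// Naive depthwise 3x3 convolution (groups == channels), NCHW, batch size 1.
// One thread per output element; illustrative only, not performance-tuned.
__global__ void depthwiseConv3x3(const float* input, const float* weight,
                                 const float* bias, float* output,
                                 int channels, int height, int width,
                                 int stride, int pad)
{
    const int outH = (height + 2 * pad - 3) / stride + 1;
    const int outW = (width  + 2 * pad - 3) / stride + 1;
    const int idx  = blockIdx.x * blockDim.x + threadIdx.x;
    if (idx >= channels * outH * outW) return;

    const int ow = idx % outW;
    const int oh = (idx / outW) % outH;
    const int c  = idx / (outW * outH);

    float sum = bias ? bias[c] : 0.0f;
    for (int kh = 0; kh < 3; ++kh)
    {
        for (int kw = 0; kw < 3; ++kw)
        {
            const int ih = oh * stride - pad + kh;
            const int iw = ow * stride - pad + kw;
            if (ih >= 0 && ih < height && iw >= 0 && iw < width)
                sum += input[(c * height + ih) * width + iw]
                     * weight[(c * 3 + kh) * 3 + kw]; // one 3x3 filter per channel
        }
    }
    output[(c * outH + oh) * outW + ow] = sum;
}
```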

myih commented 6 years ago

@paghdv Thanks for the fast reply. I'm able to use createSSDDetectionOutputPlugin to run VGG-SSD, so maybe it's not the detection layer's problem? Which versions of TensorRT and cuDNN are you using? I wrote some depthwise convolution code myself because, like you said, the group convolution was slow, but I saw someone say that the latest cuDNN 7 patch supports depthwise convolution. Thank you.

myih commented 6 years ago

@paghdv I solved the error by updating to TensorRT 4 and specifying inputOrder = {0, 1, 2}, but the detection output is incorrect. Were you able to get correct output when using group convolution rather than your own depthwise convolution? Thank you!

zbw4034 commented 6 years ago

@myih Hi! I am using TensorRT 4.0 and I hit the same assertion:

virtual void nvinfer1::plugin::DetectionOutput::configure(const nvinfer1::Dims*, int, const nvinfer1::Dims*, int, int): Assertion 'numPriors*numLocClasses*4 == inputDims[param.inputOrder[0]].d[0]' failed

How did you solve it? Could you show me more details?

paghdv commented 6 years ago

Yes, I do get the correct result with grouped convolutions.


myih commented 6 years ago

@paghdv Thanks for the reply.

@zbw4034 Which model and weights are you trying to run? I solved this problem by setting inputOrder = {0, 1, 2} in createSSDDetectionOutputPlugin; also check that you don't have incorrect or duplicated layer names.

paghdv commented 6 years ago

@myih did you implement your own softmax?

myih commented 6 years ago

@paghdv I use this implementation: Teoge/tensorrt-ssd-easy. It works for VGG-SSD, and I checked the CUDA code; it seems legit.

myih commented 6 years ago

Damn, I totally forgot that Weiliu's and chuanqi305's models use different image loaders (i.e. different input preprocessing)... Now it runs on a 1080 Ti at ~150 fps (without the depthwise convolution implemented; is that normal?)
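
For anyone else tripped up by this, the two models expect different input preprocessing. The values in the sketch below are assumptions taken from the public prototxts of weiliu89's VGG-SSD and chuanqi305's MobileNet-SSD (the mean/scale in transform_param); verify them against the model you actually trained.

```cpp
#include <opencv2/opencv.hpp>

// Hedged sketch of the two preprocessing conventions (values are assumptions
// from the public prototxts; check your own transform_param).
cv::Mat preprocess(const cv::Mat& bgr, bool mobilenetSsd)
{
    cv::Mat resized, blob;
    cv::resize(bgr, resized, cv::Size(300, 300));
    resized.convertTo(blob, CV_32FC3);

    if (mobilenetSsd)
    {
        // chuanqi305/MobileNet-SSD: (pixel - 127.5) * 0.007843
        blob = (blob - cv::Scalar(127.5, 127.5, 127.5)) * 0.007843;
    }
    else
    {
        // weiliu89 VGG-SSD: subtract per-channel BGR mean, no scaling
        blob -= cv::Scalar(104.0, 117.0, 123.0);
    }
    return blob; // still HWC; convert to CHW before copying into the TensorRT input buffer
}
```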

Ghustwb commented 5 years ago


What does "setting inputOrder = {0, 1, 2} in createSSDDetectionOutputPlugin" mean? Could you describe it in detail, please?

myih commented 5 years ago

@Ghustwb In TensorRT 4 the API allows you to specify the order/layout of the plugin's inputs; TensorRT 3 does not. You can find the description of the API here.
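
To make that concrete, here is a minimal sketch assuming the plugin's three inputs are wired up in the order {mbox_loc, mbox_conf, mbox_priorbox}; inputOrder is a field of the TensorRT 4 DetectionOutputParameters struct, and the indices must match whatever order your network actually feeds the tensors in.

```cpp
// Hedged sketch (TensorRT 4 legacy plugin API): inputOrder tells the built-in
// DetectionOutput plugin which of its three inputs is the location tensor,
// which is the confidence tensor and which is the prior-box tensor.
nvinfer1::plugin::DetectionOutputParameters params{};
// ... numClasses, thresholds, codeType, etc. as in the prototxt ...

// Assumption: the plugin receives {mbox_loc, mbox_conf, mbox_priorbox}.
params.inputOrder[0] = 0; // index of the mbox_loc input
params.inputOrder[1] = 1; // index of the mbox_conf input
params.inputOrder[2] = 2; // index of the mbox_priorbox input

nvinfer1::plugin::INvPlugin* detectionOut =
    nvinfer1::plugin::createSSDDetectionOutputPlugin(params);
```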

Ghustwb commented 5 years ago


Thanks for your reply. I tried TensorRT 4.0 with inputOrder = {0, 1, 2} set in createSSDDetectionOutputPlugin, but the assertion error still exists:

virtual void nvinfer1::plugin::DetectionOutput::configure(const nvinfer1::Dims*, int, const nvinfer1::Dims*, int, int): Assertion 'numPriors*numLocClasses*4 == inputDims[param.inputOrder[0]].d[0]' failed

myih commented 5 years ago

@Ghustwb Maybe try working with another repo, like this one. And check the plugin layer names carefully: when I worked on MobileNet-SSD I was sure I had them right for several days before I found a typo...

Ghustwb commented 5 years ago


Thank you for your help, MobileNet-SSD now runs successfully, but the detection result is incorrect. I printed the detection result; each object is 7 values, for example 0 3 0.75 0.24 0.35 0.25 0.35:

3 ---> class index
0.75 ---> confidence
0.24 ---> xmin
0.35 ---> ymin
0.25 ---> xmax
0.35 ---> ymax

The class index and confidence seem correct, but the box values are obviously wrong.

Can you help me? Thanks
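
A note on the layout above: the standard SSD DetectionOutput produces 7 floats per detection, [image_id, label, confidence, xmin, ymin, xmax, ymax], with the corners normalized to [0, 1]. Below is a minimal, hedged sketch of walking that buffer on the host; keepTopK and the buffer pointer are assumptions that depend on how the plugin was configured.

```cpp
#include <cstdio>

// Hedged sketch: read the SSD DetectionOutput buffer on the host, assuming
// keepTopK detections of 7 floats each in the layout
// [image_id, label, confidence, xmin, ymin, xmax, ymax] with normalized boxes.
void printDetections(const float* det, int keepTopK, int imgW, int imgH,
                     float confThreshold = 0.5f)
{
    for (int i = 0; i < keepTopK; ++i)
    {
        const float* d = det + i * 7;
        if (d[2] < confThreshold) continue;        // d[2] = confidence
        std::printf("class %d  conf %.2f  box [%.0f, %.0f, %.0f, %.0f]\n",
                    static_cast<int>(d[1]), d[2],
                    d[3] * imgW, d[4] * imgH,      // xmin, ymin back in pixels
                    d[5] * imgW, d[6] * imgH);     // xmax, ymax back in pixels
    }
}
```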