linghu8812 / tensorrt_inference

When converting to onnx, can the output be divided into information about classes and information about boxes? #46

Open leeyunhome opened 3 years ago

leeyunhome commented 3 years ago

Hello,

Thanks to you, I have good material to study. [image] When converting to ONNX, can the output be split into class information and box information, in the form shown on the left? If so, could you tell me how?

Thank you.

leeyunhome commented 3 years ago

Hello, @linghu8812

[image] The 107 seems to be my number of classes plus 5 (102 + 5). Where does the 25200 in the middle come from?

Thank you.

linghu8812 commented 3 years ago

((640 / 8) * (640 / 8) + (640 / 16) * (640 / 16) + (640 / 32) * (640 / 32)) * 3
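
The arithmetic in this reply can be checked with a short sketch. It assumes a square input, one grid per stride, and a fixed number of anchors per grid cell; the function name `num_predictions` and its parameters are illustrative and not taken from the repository.

```python
def num_predictions(input_size, strides=(8, 16, 32), anchors_per_cell=3):
    """Count the rows in the concatenated detection output.

    Assumes a square input whose side is divisible by every stride.
    """
    total_cells = 0
    for stride in strides:
        grid = input_size // stride      # grid width (= height) at this stride
        total_cells += grid * grid       # one cell per grid position
    return total_cells * anchors_per_cell


# 640 x 640 input: grids of 80x80, 40x40 and 20x20, 3 anchors per cell
print(num_predictions(640))  # 25200
```

With the same strides, `num_predictions(896, anchors_per_cell=4)` gives 65856, so a row count like that would be consistent with four anchors per cell, though this is only an inference from the arithmetic.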

leeyunhome commented 3 years ago

Hello,

[image]

By changing the parameters of the torch.onnx.export function, I got classes and boxes as follows.

How can the dimensions of classes and boxes be explained?

Also, when loading this ONNX model in TensorRT for inference, can the feature maps such as 660 and 969 be ignored, or do I have to handle them separately?

Thank you.

leeyunhome commented 3 years ago

((640 / 8) * (640 / 8) + (640 / 16) * (640 / 16) + (640 / 32) * (640 / 32)) * 3

Hello,

I don't understand this part. Could you explain it in words? I don't understand how an input tensor of shape (1, 3, 640, 640) produces an output of shape (1, 25200, 107).

Thank you.

leeyunhome commented 3 years ago

((640 / 8) * (640 / 8) + (640 / 16) * (640 / 16) + (640 / 32) * (640 / 32)) * 3

I still haven't understood this explanation. Could you explain it?

linghu8812 commented 3 years ago

@leeyunhome The network outputs three feature maps; 640 / 8 is the width (or height) of one of those output feature maps.

leeyunhome commented 3 years ago

Hello

Aren't 8, 16, and 32 the stride sizes?

I still don't fully understand.

Could you please let me know if there is any material to read to understand this? Thank you.

sakulh commented 3 years ago

Hi, I have a similar problem with Scaled-YOLOv4. I run the model as a TensorRT plan in Triton Inference Server, and the model output has shape [1, 65856, 85].

platform: "tensorrt_plan"
max_batch_size: 1
input {
  name: "images"
  data_type: TYPE_FP32
  dims: 3
  dims: 896
  dims: 896
}
output {
  name: "output"
  data_type: TYPE_FP32
  dims: 65856
  dims: 85
}
default_model_filename: "model.plan"

How can I get the boxes and classes? Thank you.
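
One way to read such an output, sketched here with NumPy. It assumes each of the 85 values in a row is laid out YOLOv5-style as `[cx, cy, w, h, objectness, 80 class scores]`; the thread does not confirm this layout, and the box values may or may not already be decoded to pixel coordinates. Under that assumption, boxes and classes are slices of the last axis. `split_predictions` and the 0.25 threshold are hypothetical, chosen only for illustration.

```python
import numpy as np


def split_predictions(output, conf_threshold=0.25):
    """Split a (1, N, 5 + num_classes) array into boxes, scores and class ids.

    Assumed row layout (not confirmed by the thread):
    [cx, cy, w, h, objectness, class_0, ..., class_k].
    """
    preds = output[0]                       # drop the batch axis -> (N, 5 + C)
    boxes = preds[:, :4]                    # (N, 4) box parameters
    objectness = preds[:, 4]                # (N,)
    class_scores = preds[:, 5:]             # (N, C)

    class_ids = class_scores.argmax(axis=1)          # best class per row
    scores = objectness * class_scores.max(axis=1)   # combined confidence

    keep = scores > conf_threshold          # discard low-confidence rows
    return boxes[keep], scores[keep], class_ids[keep]


# Dummy data with the shape reported above; real values come from the model.
dummy = np.random.rand(1, 65856, 85).astype(np.float32)
boxes, scores, class_ids = split_predictions(dummy)
print(boxes.shape[1], scores.shape == class_ids.shape)
```

The rows that survive the threshold would normally still need non-maximum suppression before being used as final detections.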

bobbilichandu commented 3 years ago

@leeyunhome Yes, 8, 16, and 32 are strides. Please go through the paper rather than asking here; issues should mostly relate to the code that is implemented. If you have doubts about the logic behind the implementation, I suggest reading the papers or blog posts on Scaled-YOLOv4.

In ((640 / 8) * (640 / 8) + (640 / 16) * (640 / 16) + (640 / 32) * (640 / 32)) * 3, each stride divides the input into a grid of cells, and you get 3 outputs (3 anchor boxes) for each cell in that grid. You can watch Aladdin Persson's YOLOv1 video to understand what a grid cell is and read the YOLOv2 paper to understand what anchor boxes are. Finally, if you have doubts about how all these outputs are concatenated, you definitely have to read the Scaled-YOLOv4 paper.
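
To make the concatenation concrete, the sketch below maps a row index of the (1, 25200, 107) output back to its stride, anchor, and grid cell. It assumes the levels are concatenated in the order of strides 8, 16, 32, and that rows within a level are ordered anchor first, then grid row, then grid column, as in YOLOv5-style export code. The thread does not confirm this ordering, and `locate_row` is a hypothetical helper.

```python
def locate_row(row, input_size=640, strides=(8, 16, 32), anchors_per_cell=3):
    """Return (stride, anchor, grid_y, grid_x) for a row of the concatenated output.

    Assumed ordering (not confirmed by the thread): levels in stride order,
    and within a level anchor-major, then grid row, then grid column.
    """
    if row < 0:
        raise IndexError("row must be non-negative")
    offset = 0
    for stride in strides:
        grid = input_size // stride                   # grid side at this stride
        level_rows = anchors_per_cell * grid * grid   # rows contributed by this level
        if row < offset + level_rows:
            local = row - offset
            anchor, cell = divmod(local, grid * grid)
            grid_y, grid_x = divmod(cell, grid)
            return stride, anchor, grid_y, grid_x
        offset += level_rows
    raise IndexError(f"row {row} is outside the {offset} available rows")


print(locate_row(0))      # (8, 0, 0, 0)
print(locate_row(19200))  # (16, 0, 0, 0)
print(locate_row(25199))  # (32, 2, 19, 19)
```

Under these assumptions the first 19200 rows come from the 80x80 grid, the next 4800 from the 40x40 grid, and the last 1200 from the 20x20 grid.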