Zzh-tju / CIoU

Distance-IoU Loss: Faster and Better Learning for Bounding Box Regression (AAAI 2020)
GNU General Public License v3.0

Request for a basic documentation for NMS inputs #4

Closed kemaloksuz closed 4 years ago

kemaloksuz commented 4 years ago

Hello,

Thanks for the code of your approach. I am interested in class-specific versions of Cluster NMS and am planning to adapt your code into my detection pipeline, so I need some very basic documentation for the inputs of the NMS functions. Excluding the hyperparameters, which seem obvious, the inputs are: 1) boxes, 2) masks, 3) scores.

Firstly, I am working on the detection domain, so I think I can safely ignore masks. Am I correct?

Secondly, can you please provide the types, sizes and a short description of the inputs boxes and scores?

Many thanks.

Kemal

Zzh-tju commented 4 years ago

This repo is based on YOLACT. The input boxes has size [n,4] and scores has size [80,n]. With top k = 200, the boxes become [80,200,4], or [80,m,4] where m<200. The IoU matrix is then calculated using the jaccard function.
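To make the shapes concrete, here is a minimal sketch in plain Python, with nested lists standing in for tensors. The helper names `topk_per_class` and `iou_matrix` are hypothetical and are not functions from this repo, and the tiny sizes (2 classes, k = 2) are chosen only for readability.

```python
def iou(a, b):
    """IoU of two boxes given as (x1, y1, x2, y2)."""
    iw = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    ih = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = iw * ih
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0


def topk_per_class(boxes, scores, k):
    """boxes: [n][4], scores: [num_classes][n].

    Returns boxes of shape [num_classes][m][4] and scores of shape
    [num_classes][m], where m = min(k, n), sorted by score per class.
    """
    out_boxes, out_scores = [], []
    for cls_scores in scores:
        order = sorted(range(len(cls_scores)),
                       key=lambda i: cls_scores[i], reverse=True)[:k]
        out_boxes.append([boxes[i] for i in order])
        out_scores.append([cls_scores[i] for i in order])
    return out_boxes, out_scores


def iou_matrix(class_boxes):
    """Pairwise IoU of one class's boxes: [m][4] -> [m][m]."""
    return [[iou(a, b) for b in class_boxes] for a in class_boxes]


# Illustrative data: n = 3 boxes, 2 classes.
boxes = [
    (0.10, 0.10, 0.50, 0.50),
    (0.12, 0.10, 0.52, 0.50),
    (0.60, 0.60, 0.90, 0.90),
]
scores = [
    [0.9, 0.8, 0.1],  # class 0
    [0.2, 0.1, 0.7],  # class 1
]

top_boxes, top_scores = topk_per_class(boxes, scores, k=2)
mat = iou_matrix(top_boxes[0])  # [2][2] IoU matrix for class 0
```

Because every class keeps the same number of boxes, the per-class results stack into one regular [num_classes, k, 4] array, which is what allows a batched IoU computation in a tensor library.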

You can ignore the mask stuff.

As you can see, in the above method the number of boxes is equal among all the classes. It is controlled by top k.

Another approach lets the number of boxes differ across classes. The inputs boxes and scores are the same, but we use a score threshold (e.g. >=0.01) to filter out most low-score detection boxes. As a result, the number of remaining boxes may differ from class to class.

Then put all the boxes together and sort them by score in descending order. For example, suppose there are 102 boxes and 10 are left after score thresholding: AABCCBCAAB, where each letter represents a box and identical letters belong to the same class.
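The thresholding and sorting step could be sketched roughly as follows, again in plain Python rather than tensor code. The function name `threshold_and_sort` and the sample scores are made up for illustration; only the 0.01 threshold comes from the description above.

```python
def threshold_and_sort(scores, score_thresh=0.01):
    """scores: [num_classes][n].

    Returns (score, class_index, box_index) tuples for every score that
    passes the threshold, sorted by score in descending order.
    """
    kept = [
        (s, cls, i)
        for cls, cls_scores in enumerate(scores)
        for i, s in enumerate(cls_scores)
        if s >= score_thresh
    ]
    kept.sort(key=lambda t: t[0], reverse=True)
    return kept


# Illustrative data: 3 classes (A, B, C), 3 candidate boxes.
scores = [
    [0.90, 0.005, 0.30],   # class A
    [0.002, 0.60, 0.001],  # class B
    [0.40, 0.003, 0.004],  # class C
]

kept = threshold_and_sort(scores)
# Class letter of each surviving detection, in score order.
labels = "".join("ABC"[cls] for _, cls, _ in kept)
```

Here class A keeps two detections while B and C keep one each, so the per-class counts are unequal and the result is a single flat, score-sorted list instead of a regular [num_classes, k] array.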

Then add an offset to each box by using torch.arange(0,3), since there are 3 classes in this example. If a box belongs to A, the offset is 0; if it belongs to C, the offset is 2.

As you know, the coordinates (x1,y1,x2,y2) of all the boxes lie in the interval (0,1). After adding the offset, a box belonging to the 61st class has its coordinates in (60,61). When the IoU matrix is then calculated, the IoU of boxes belonging to different classes is 0, while the IoU of boxes within the same class is unchanged, because IoU is invariant to translation. For this method, you can refer to our other repo https://github.com/Zzh-tju/DIoU-SSD-pytorch/blob/master/utils/detection/detection.py
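A small sketch of the offset trick, assuming normalized coordinates and 0-based class indices. This is not the code from the linked repo, and `add_class_offset` is a hypothetical helper used only to show the effect on IoU.

```python
def iou(a, b):
    """IoU of two boxes given as (x1, y1, x2, y2)."""
    iw = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    ih = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = iw * ih
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0


def add_class_offset(box, cls):
    """Shift all four coordinates by the 0-based class index."""
    return tuple(c + cls for c in box)


# Two heavily overlapping boxes with coordinates in (0, 1).
box_a = (0.10, 0.10, 0.50, 0.50)
box_b = (0.12, 0.10, 0.52, 0.50)

# Same class: both boxes receive the same shift, so IoU is unchanged.
same_plain = iou(box_a, box_b)
same_shifted = iou(add_class_offset(box_a, 2), add_class_offset(box_b, 2))

# Different classes: the boxes land in disjoint unit squares, so IoU is 0.
cross = iou(add_class_offset(box_a, 0), add_class_offset(box_b, 2))
```

With the offsets applied, a single IoU matrix over all boxes behaves like a block-diagonal matrix of per-class IoU matrices, so one NMS pass can serve every class without cross-class suppression.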

kemaloksuz commented 4 years ago

OK thanks for your detailed explanation, I believe this is more than enough.