Zzh-tju / CIoU

Distance-IoU Loss: Faster and Better Learning for Bounding Box Regression (AAAI 2020)
GNU General Public License v3.0

Cluster-NMS #13

Open buttercutter opened 3 years ago

buttercutter commented 3 years ago

I am trying to understand Cluster-NMS operations.

The mathematical proof seems a bit complicated to follow and comprehend.

  1. Why does C1 not change value? In other words, why is C1 == X?

  2. How is b1 obtained?

  3. Why is it Cn = E x X instead of Cn = E x Cn-1 ?

Zzh-tju commented 3 years ago
  1. Matrix C changes at every iteration until vector b stops changing.

  2. Vector b is obtained by taking the column-wise maximum of matrix C and then binarizing it. So b=(b1,b2,...,bn) is a 0/1 vector, where 1 denotes preservation and 0 denotes suppression.

  3. Vector b holds the suppression results of NMS at a given iteration. Left-multiplying X by the diagonal matrix E built from b is equivalent to a row transformation on X: it zeroes the rows of the currently suppressed boxes, so that they no longer have any effect on the other boxes. (Note that X is the original IoU matrix.)

Finally, once vector b stops changing, we get exactly the same result as original NMS.
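The three points above can be sketched in NumPy. This is only an illustrative sketch of the iteration as described in this thread, not the repository's code: the function name `cluster_nms`, the 0.5 threshold, the all-ones starting vector b, and the example IoU values are all assumptions made for the example. The boxes are assumed to be sorted by descending score already.

```python
import numpy as np

def cluster_nms(iou, threshold=0.5):
    """Return a boolean keep-mask; `iou` is the (n, n) pairwise IoU matrix
    of boxes sorted by descending score (hypothetical helper)."""
    X = np.triu(iou, k=1)            # X[i, j] is kept only for i < j (higher-scored box i)
    n = X.shape[0]
    b = np.ones(n)                   # assumption: start with every box preserved
    for _ in range(n):               # at most n iterations are needed
        C = b[:, None] * X           # same as diag(b) @ X: zero rows of suppressed boxes
        b_new = (C.max(axis=0) <= threshold).astype(float)  # column-wise max, then binarize
        if np.array_equal(b_new, b): # b unchanged, so C would not change either
            break
        b = b_new
    return b.astype(bool)

# Made-up IoU values for five boxes (symmetric, zero diagonal).
iou = np.zeros((5, 5))
iou[0, 1] = iou[1, 0] = 0.7
iou[1, 2] = iou[2, 1] = 0.6
iou[0, 3] = iou[3, 0] = 0.1
iou[3, 4] = iou[4, 3] = 0.8

keep = cluster_nms(iou)
```

With an all-ones starting b, the first product is just X. On this made-up matrix the first iteration gives b=[1 0 0 1 0]. Box 2 is suppressed only by box 1, which is itself suppressed, so the second iteration restores it and b settles at [1 0 1 1 0]. Each iteration multiplies the original X, not the previous C, because a row zeroed in C could never be restored if the corresponding box were later un-suppressed.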

buttercutter commented 3 years ago

This will ignore those current suppressed boxes so that they will not have any effects on the other boxes.

How exactly does left multiplying diagonal matrix E achieve this ?

Zzh-tju commented 3 years ago

For example, let b=[1 0 0 1 0].

In our paper, the matrix is

E=
1 0 0 0 0
0 0 0 0 0
0 0 0 0 0
0 0 0 1 0
0 0 0 0 0

and we then compute E×X.

In practice, we use

E=
1 1 1 1 1
0 0 0 0 0
0 0 0 0 0
1 1 1 1 1
0 0 0 0 0

and then multiply it element-wise with the upper triangular IoU matrix X.
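A quick NumPy check that the two forms give the same result for b=[1 0 0 1 0]. The random upper triangular matrix is only a stand-in for a real IoU matrix, and the variable names are made up for this example.

```python
import numpy as np

b = np.array([1.0, 0.0, 0.0, 1.0, 0.0])

# Stand-in for the upper triangular IoU matrix X (random values, not real IoUs).
rng = np.random.default_rng(0)
X = np.triu(rng.random((5, 5)), k=1)

E_paper = np.diag(b)                         # diagonal matrix, as in the paper
E_practice = np.tile(b[:, None], (1, 5))     # b repeated across every column

left = E_paper @ X          # matrix multiplication: row i of X is scaled by b[i]
right = E_practice * X      # element-wise multiplication: also scales row i by b[i]

same = np.allclose(left, right)
```

Both products scale row i of X by b[i], so the rows of suppressed boxes become zero in either case. The "extra" 1s in the second matrix are just b[i] copied along row i so that the shapes match for element-wise multiplication.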

buttercutter commented 3 years ago

Why are there extra 1s in the matrix used in practice?

And how do all those iterations converge to the original NMS result?

Zzh-tju commented 3 years ago

Left-multiplying a matrix by a diagonal matrix is equivalent to a row transformation (a standard result in linear algebra). So in practice I replace it with element-wise multiplication, which is simpler and faster than matrix multiplication. As for why the result of Cluster-NMS is equal to that of original NMS, a simple case is provided here: https://github.com/Zzh-tju/CIoU#description-of-cluster-nms-and-its-usage

For the mathematical details, kindly refer to our paper.
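As an empirical sanity check (not a proof), the iteration can be compared with a plain sequential NMS on random matrices. This is a sketch under assumptions: `greedy_nms` and `cluster_nms` are hypothetical helper names, the 0.5 threshold is arbitrary, and the random symmetric matrices only stand in for real IoU matrices of score-sorted boxes.

```python
import numpy as np

def greedy_nms(iou, threshold=0.5):
    """Sequential NMS on score-sorted boxes (hypothetical reference helper)."""
    n = iou.shape[0]
    keep = np.zeros(n, dtype=bool)
    for j in range(n):
        # Box j survives unless an already kept, higher-scored box overlaps it too much.
        keep[j] = not np.any(keep[:j] & (iou[:j, j] > threshold))
    return keep

def cluster_nms(iou, threshold=0.5):
    """Matrix iteration as described in this thread (illustrative sketch)."""
    X = np.triu(iou, k=1)
    b = np.ones(X.shape[0])                      # assumption: all boxes kept at the start
    for _ in range(X.shape[0]):
        b_new = ((b[:, None] * X).max(axis=0) <= threshold).astype(float)
        if np.array_equal(b_new, b):
            break
        b = b_new
    return b.astype(bool)

rng = np.random.default_rng(0)
all_equal = True
for _ in range(200):
    n = int(rng.integers(1, 12))
    upper = np.triu(rng.random((n, n)), k=1)
    iou = upper + upper.T                        # symmetric stand-in for an IoU matrix
    all_equal &= np.array_equal(greedy_nms(iou), cluster_nms(iou))
```

The reason the two agree: column 0 of the upper triangular X is all zeros, so the top-scored box is always kept. After one iteration the decision for the second box is final, after two the third, and so on, so the decisions for all n boxes are settled after at most n iterations.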

buttercutter commented 3 years ago

So in practice, I replace it with element-wise multiplication for simplicity. Because it's faster than matrix multiplication.

I may have missed something, but how is the matrix used in practice, with element-wise multiplication, equivalent to the matrix given in the paper?

Zzh-tju commented 3 years ago

https://github.com/Zzh-tju/CIoU/blob/master/layers/functions/detection.py#L154-L155