Open Remember2018 opened 5 years ago
@Remember2018 I'm curious about this too. The underlying algorithm seems first to merge quadrangles above the NMS threshold:
https://github.com/argman/EAST/blob/ab97939783901b7e22ff55e151964e159d1627b9/lanms/lanms.h#L220-L224
This has the effect of weighting the final polygon coordinates by the input score, e.g.,
https://github.com/argman/EAST/blob/ab97939783901b7e22ff55e151964e159d1627b9/lanms/lanms.h#L66
and then accumulating the total score of the merged polygons
https://github.com/argman/EAST/blob/ab97939783901b7e22ff55e151964e159d1627b9/lanms/lanms.h#L78
The subsequent getter calculates the inverse total score so that the weighted coordinates come out correctly:
https://github.com/argman/EAST/blob/ab97939783901b7e22ff55e151964e159d1627b9/lanms/lanms.h#L146-L147
But the resulting polygon does not have the score averaged in accordance with the number of polygons that are merged:
https://github.com/argman/EAST/blob/ab97939783901b7e22ff55e151964e159d1627b9/lanms/lanms.h#L157
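A minimal Python sketch of the merge behavior those lines describe (my own simplification with made-up numbers, not the actual C++ PolyMerger; real polygons carry eight coordinates, reduced here to a single point):

```python
def merge_weighted(boxes):
    """boxes: list of (x, y, score) triples standing in for quadrangles.
    Coordinates accumulate weighted by each input score; the merged
    score is the accumulated *sum*, not an average."""
    total_score = sum(s for _, _, s in boxes)
    # Dividing by total_score mirrors the inverse-total-score getter,
    # so the weighted coordinates come out correctly.
    x = sum(bx * s for bx, _, s in boxes) / total_score
    y = sum(by * s for _, by, s in boxes) / total_score
    return x, y, total_score

merged = merge_weighted([(0.0, 0.0, 0.8), (1.0, 1.0, 0.4)])
print(merged)  # coordinates pulled toward the higher-scoring box; score ~1.2
```

Note that the merged score (~1.2) has already left the (0, 1) range after a single merge, which is the conflation discussed below.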
One "solution" to this problem could be as simple as changing that line to
p.score = score / nr_polys;
"Solution" in the sense of retaining the intuitive notion of the score as a value in (0, 1). Otherwise what you're seeing is a conflation of the number of polygons merged and their scores.
The reason not to do that has more to do with semantic correctness than with the utility of the calculation. Only two polygons get merged at a time, so the score would be a repeated averaging of a new polygon with an existing averaged polygon. That may be correct (particularly for this algorithm), but it's not the same thing as accumulating a big set of merged polygons and then taking the average score. Put another way, if we have three scores a, b, and c, where a and b are merged first to get score d, and then that result is merged with c, we'd have
((a + b) / 2 + c) / 2 == a/4 + b/4 + c/2
which is of course not the same as (a + b + c) / 3.
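A quick numeric check of that arithmetic (scores made up for illustration):

```python
a, b, c = 0.9, 0.6, 0.3

# Pairwise incremental averaging: merge a with b, then merge that result with c.
incremental = ((a + b) / 2 + c) / 2   # equals a/4 + b/4 + c/2

# Averaging the whole set at once.
batch = (a + b + c) / 3

print(incremental, batch)  # 0.525 vs ~0.6, so the two schemes disagree
```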
This is all as described in the EAST paper §3.6; in particular, that V(a) = V(g) + V(p), which is what you're seeing.
The absence of score normalization in the PolyMerger class getter is exactly what makes this incremental merging to calculate new polygon coordinates work out to be more like (a + b + c)/3 than a/4 + b/4 + c/2. That strikes me as desirable.
However, I'm not sure what it means that this combined score V(a) is later used in the standard NMS. Again, what troubles me about it is that the score V(a) conflates the number of rectangles and their underlying scores.
I suppose a "right" thing to do would be to add a score normalization term (essentially nr_polys, as in the PolyMerger class) to the struct Polygon, and then apply a score normalization transformation to the polygons before running the standard NMS. (This would also have the beneficial side effect of returning intuitively meaningful scores.)
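As a sketch of that idea in Python rather than the C++ of lanms.h (the field names here are hypothetical, not the actual struct members):

```python
def normalize_scores(polys):
    """polys: list of dicts with 'score' (accumulated over merges) and
    'nr_polys' (how many rectangles were merged into the polygon).
    Returns copies whose scores are back in (0, 1), ready for standard NMS."""
    return [dict(p, score=p["score"] / p["nr_polys"]) for p in polys]

candidates = [
    {"score": 2.4, "nr_polys": 3},  # three merged boxes averaging 0.8
    {"score": 0.9, "nr_polys": 1},  # a box that was never merged
]
print(normalize_scores(candidates))  # scores ~0.8 and 0.9, both in (0, 1)
```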
Hi @weinman, Prof. Weinman, thanks a lot for your detailed analysis of this problem, especially the dynamic procedure of merging boxes!
I'm working on deploying my own fix to this issue; I'll push it to a new branch in my own fork when it's done.
I pushed an updated version of lanms.h to this fork. It produces the average score of the merged rectangles, rather than the summed scores, so the range remains in [0, 1].
Hi Prof. Weinman, thanks a lot for your commit and the great improvement!
Hi, has anybody checked this line?
https://github.com/argman/EAST/blob/ab97939783901b7e22ff55e151964e159d1627b9/eval.py#L98
I found that the box scores before and after lanms fall in different ranges. For example, if the scores before lanms are [0.81, 0.90, 0.82, ...], the scores after lanms are [10.0, 14.5, 9.8, ...], and so on.
Is this reasonable? Thanks.