Find a better way to get tighter localization bboxes. The basic ideas are:
Never lose a Caffe above-threshold detection. This implies that a simple average, like we have done up to now, won't work.
Pixels that have multiple above-threshold detections across sliding windows should have a higher score.
Pixels on edges and in corners don't get visited that often, so an above-threshold detection there should count for more.
Each logo appearance should be analyzed independently. Thus, if a frame has two logos detected at a particular scale, the bbox of one should NOT be influenced by the bbox of the other. Similarly, if the frame has the same logo detected at different scales, the bbox made at one scale should not influence the bbox made at another.
Rather than write a formula that satisfies all conditions at once, we used multiple filters (sigmoid in our case) to approximate what we wanted. The filters need to be tuned along with any scale/stride changes.
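
The ideas above can be illustrated with a minimal sketch. It is not the actual filter chain, only one way a single sigmoid filter could be applied, under these assumptions: the function name localize, its parameters (threshold, midpoint, steepness, keep) and the window format (x0, y0, x1, y1) are all hypothetical, and the function is called once per logo and per scale so that bboxes stay independent. Separating two appearances of the same logo at the same scale (for example with connected-component labeling) is not shown.

```python
import numpy as np


def sigmoid(x, midpoint, steepness):
    """Logistic filter; midpoint and steepness are the knobs to retune per scale/stride."""
    return 1.0 / (1.0 + np.exp(-steepness * (x - midpoint)))


def localize(frame_shape, windows, scores,
             threshold=0.5, midpoint=0.5, steepness=10.0, keep=0.5):
    """Return one bbox (x0, y0, x1, y1) for one logo at one scale, or None.

    windows: (x0, y0, x1, y1) sliding-window boxes, x1/y1 exclusive (assumed format).
    scores:  one classifier score per window.
    """
    hits = np.zeros(frame_shape, dtype=float)    # above-threshold windows per pixel
    visits = np.zeros(frame_shape, dtype=float)  # all windows per pixel
    for (x0, y0, x1, y1), score in zip(windows, scores):
        visits[y0:y1, x0:x1] += 1
        if score >= threshold:
            hits[y0:y1, x0:x1] += 1

    # Dividing by the visit count makes a detection count for more on edges
    # and corners, where few windows ever cover a pixel.
    ratio = np.divide(hits, visits, out=np.zeros_like(hits), where=visits > 0)

    # Pixels hit by a larger share of their windows get a higher score.
    heat = sigmoid(ratio, midpoint, steepness)
    heat[hits == 0] = 0.0
    mask = heat >= keep

    # Never lose a detection: if the filter removed everything, fall back to
    # the union of all above-threshold windows.
    if not mask.any():
        mask = hits > 0
    if not mask.any():
        return None

    ys, xs = np.nonzero(mask)
    return (int(xs.min()), int(ys.min()), int(xs.max()) + 1, int(ys.max()) + 1)


def grid_windows(width, height, size, stride):
    """Hypothetical helper: square sliding windows on a regular grid."""
    return [(x, y, x + size, y + size)
            for y in range(0, height - size + 1, stride)
            for x in range(0, width - size + 1, stride)]
```

With an 8x8 frame, 4x4 windows and a stride of 2, a single above-threshold window in the corner is tightened to the pixels that only that window visits, while a single above-threshold window in the middle falls below the filter everywhere and is kept at its full extent by the fallback.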