How is the bbox obtained from the attention map? Does it predict the width and height of the target object?
And for the locations obtained from the bounding box predictions, is that a small CornerNet or an anchor-based net?
And is the network that processes the cropped sub-image a CornerNet?