xingyizhou / CenterNet

Object detection, 3D detection, and pose estimation using center point detection:
MIT License

What's the output features shape for regression? #196

Open becauseofAI opened 5 years ago

becauseofAI commented 5 years ago

Which of the following tensor shapes is correct for the output features? Take batch size N, input size 512, output stride R = 4, and C = 80 classes (COCO) as an example: is it (N, 128, 128, 80, 2, 2) or (N, 128, 128, 80 + 2 + 2)?

becauseofAI commented 5 years ago

Centers: (N, 128, 128, 80)
W and H of centers: (N, 128, 128, 2)
Offsets of centers: (N, 128, 128, 2)
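To make the arithmetic behind these shapes explicit, here is a minimal sketch in plain Python. The helper name `head_shapes` and the channels-last layout are assumptions that follow the notation used in this issue, not the repository's actual code; a given framework may store the channel axis elsewhere.

```python
def head_shapes(batch, input_size, stride, num_classes):
    """Return the shape of each output head (hypothetical helper).

    Layout is (batch, height, width, channels), matching the notation
    used in this issue rather than any particular framework.
    """
    out = input_size // stride  # 512 // 4 == 128
    return {
        "heatmap": (batch, out, out, num_classes),  # one channel per class
        "size": (batch, out, out, 2),               # width and height
        "offset": (batch, out, out, 2),             # sub-pixel center offset
    }


shapes = head_shapes(batch=2, input_size=512, stride=4, num_classes=80)

# If the three heads were stacked along the channel axis, the channel
# counts would add rather than form extra axes: 80 + 2 + 2.
total_channels = sum(shape[-1] for shape in shapes.values())
stacked_shape = shapes["heatmap"][:-1] + (total_channels,)

for name, shape in shapes.items():
    print(name, shape)
print("stacked", stacked_shape)
```

Under these assumptions the heads are three separate tensors, and stacking them gives a single channel axis of length 80 + 2 + 2 rather than a six-dimensional tensor.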

So can only one box be predicted when the centers of objects overlap, whether the objects belong to the same category or to different ones?

For example, suppose a cat and an elephant share the same center but differ greatly in size. Can that shared center predict only one category, either the elephant or the cat, and not both at the same time?
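The cat and elephant case can be traced with a toy decoder. This is only a sketch under the shapes listed in this thread: it assumes the size and offset heads are shared across classes (2 channels each), skips the local-maximum peak extraction a real decoder would need, and uses the hypothetical names `decode`, `CAT`, and `ELEPHANT`. It is not the repository's decoding code.

```python
def decode(heatmap, size, offset, threshold=0.5):
    """Toy decoder: heatmap[y][x][c], size[y][x] = (w, h), offset[y][x] = (dx, dy).

    Every (cell, class) pair whose score passes the threshold yields one
    detection; width, height, and offset are read from the shared heads.
    """
    detections = []
    for y, row in enumerate(heatmap):
        for x, scores in enumerate(row):
            for c, score in enumerate(scores):
                if score >= threshold:
                    w, h = size[y][x]
                    dx, dy = offset[y][x]
                    detections.append((c, x + dx, y + dy, w, h, score))
    return detections


H = W = 4
CAT, ELEPHANT = 0, 1  # hypothetical class indices

heatmap = [[[0.0, 0.0] for _ in range(W)] for _ in range(H)]
size = [[(0.0, 0.0) for _ in range(W)] for _ in range(H)]
offset = [[(0.0, 0.0) for _ in range(W)] for _ in range(H)]

# Both classes peak at the same cell (x=2, y=1).
heatmap[1][2][CAT] = 0.9
heatmap[1][2][ELEPHANT] = 0.8
# The size head holds a single (w, h) pair per cell, whatever the class.
size[1][2] = (3.0, 2.0)
offset[1][2] = (0.25, 0.5)

detections = decode(heatmap, size, offset)
for det in detections:
    print(det)
```

In this sketch both categories are detected, because each has its own heatmap channel, but the two detections carry the same width and height, since the cell stores only one size. Whether the actual model behaves this way depends on its decoding and on whether its size head is class-specific.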

https://github.com/see--/keras-centernet/issues/10