see-- / keras-centernet

A Keras implementation of CenterNet with pre-trained model (unofficial)
MIT License
337 stars, 84 forks

What's the output features shape for regression? #10

Closed: becauseofAI closed this issue 5 years ago

becauseofAI commented 5 years ago

Which of the following tensor shapes is correct for the output features? Taking batch=N, input=512, R=4, C=80 (COCO) as an example: is it (N,128,128,80,2,2) or (N,128,128,80+2+2)?

https://github.com/xingyizhou/CenterNet/issues/196

see-- commented 5 years ago

Both are wrong. Read the code or the paper. You have 3 outputs. Centers are (N,128,128,80).

becauseofAI commented 5 years ago

@see-- So the centers are (N,128,128,80), the width and height of the centers are (N,128,128,2), and the offsets of the centers are (N,128,128,2).

So only one target box can be predicted when the centers of two objects overlap, whether they belong to the same category or to different ones?

For example, a cat and an elephant share the same center but differ greatly in size. In that case, can the model predict only one of them, either the elephant or the cat, and not both at the same time?
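The three shapes listed in this comment follow from simple arithmetic on the example values (input 512, R=4, C=80). A minimal sketch, assuming a channels-last layout; the function name `centernet_output_shapes` and the dictionary keys are made up for illustration and are not taken from the repository:

```python
def centernet_output_shapes(batch, input_size=512, stride=4, num_classes=80):
    """Return the shapes of the three output maps (channels-last assumed)."""
    side = input_size // stride  # 512 // 4 = 128
    return {
        "centers": (batch, side, side, num_classes),  # one heatmap channel per class
        "sizes": (batch, side, side, 2),              # box width and height
        "offsets": (batch, side, side, 2),            # offset of each center
    }


shapes = centernet_output_shapes(batch=1)
for name, shape in shapes.items():
    print(name, shape)
```

Note that the three maps are separate outputs, so neither a single 6-D tensor nor a single map with 80+2+2 channels matches what is described here.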

see-- commented 5 years ago

The paper has some great sections answering your questions. In short: you are right. But the important part is that it rarely happens. You have far fewer collisions/lost boxes with CenterNet than with any other approach.
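The collision can be illustrated with a toy target encoder built only from the shapes discussed in this thread. This is a sketch under stated assumptions, not the repository's training code: `encode_targets`, the class indices, and the box values are hypothetical, each center is written as a single peak rather than a smoothed blob, and sizes are stored in output-map units.

```python
import numpy as np


def encode_targets(boxes, input_size=512, stride=4, num_classes=80):
    """Toy encoder. boxes: list of (class_id, cx, cy, w, h) in input pixels."""
    side = input_size // stride
    centers = np.zeros((side, side, num_classes), dtype=np.float32)
    sizes = np.zeros((side, side, 2), dtype=np.float32)
    for class_id, cx, cy, w, h in boxes:
        col, row = int(cx // stride), int(cy // stride)
        centers[row, col, class_id] = 1.0           # separate channel per class
        sizes[row, col] = (w / stride, h / stride)  # shared by all classes: last write wins
    return centers, sizes


CAT, ELEPHANT = 15, 20  # hypothetical class indices
centers, sizes = encode_targets([
    (ELEPHANT, 256, 256, 400, 300),
    (CAT, 256, 256, 60, 40),  # same center, very different size
])
row = col = 256 // 4
print(centers[row, col, CAT], centers[row, col, ELEPHANT])
print(sizes[row, col])
```

In this sketch both class peaks survive because they sit in different heatmap channels, but the size map has only two channels per cell, so the elephant's width and height are overwritten by the cat's. Two objects of the same class at the same cell would also share a single peak.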