What's the output features shape for regression?

see-- / keras-centernet

A Keras implementation of CenterNet with pre-trained model (unofficial)

MIT License


What's the output features shape for regression? #10

Closed: becauseofAI closed this issue 5 years ago

becauseofAI commented 5 years ago

Which of the following output tensor shapes is correct? Take batch size N, input size 512, output stride R = 4, and C = 80 classes (COCO) as an example: is it (N, 128, 128, 80, 2, 2) or (N, 128, 128, 80 + 2 + 2)?

https://github.com/xingyizhou/CenterNet/issues/196

see-- commented 5 years ago

Both are wrong. Read the code or the paper. You have 3 outputs. Centers are (N, 128, 128, 80).

becauseofAI commented 5 years ago

@see-- Centers are (N, 128, 128, 80), the width and height of the centers are (N, 128, 128, 2), and the offsets of the centers are (N, 128, 128, 2).
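As a rough sketch of the three shapes listed above, the snippet below derives them from the batch size, input size, output stride, and class count. The helper `head_shapes` and the key names `heatmap`, `size`, and `offset` are hypothetical and chosen only for illustration; they are not taken from the keras-centernet code.

```python
import numpy as np


def head_shapes(batch, input_size=512, stride=4, num_classes=80):
    """Return the shapes of the three output maps (hypothetical helper)."""
    out = input_size // stride  # 512 // 4 = 128
    return {
        "heatmap": (batch, out, out, num_classes),  # one channel per class
        "size": (batch, out, out, 2),               # width and height
        "offset": (batch, out, out, 2),             # sub-cell center offset
    }


shapes = head_shapes(batch=8)

# Allocate placeholder arrays just to show the three separate outputs.
outputs = {name: np.zeros(shape, dtype=np.float32) for name, shape in shapes.items()}
for name, arr in outputs.items():
    print(name, arr.shape)
```

The point is that the model returns three separate tensors rather than one tensor with extra axes or one concatenated channel axis.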

So when the centers of two objects overlap, whether they belong to the same category or to different ones, only one box can be predicted?

For example, a cat and an elephant share the same center, but their sizes differ greatly. With overlapping centers, can the model predict only one of them, either the elephant or the cat, and not both at the same time?

see-- commented 5 years ago

The paper has some great sections answering your questions. In short: you are right, but the important part is that it rarely happens. You get far fewer collisions and lost boxes with CenterNet than with any other approach.
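To illustrate the collision case discussed above, here is a minimal sketch, assuming the size map has 2 channels shared by all classes (as the (N, 128, 128, 2) shape suggests) and that a box is assigned to the output cell containing its center. The box coordinates, the `center_cell` helper, and the choice to store sizes in input pixels are made up for illustration and are not taken from the repository.

```python
import numpy as np

STRIDE = 4
OUT = 512 // STRIDE  # 128 x 128 output grid


def center_cell(box, stride=STRIDE):
    """Map a box (x1, y1, x2, y2) in input pixels to its output cell (col, row)."""
    x1, y1, x2, y2 = box
    cx, cy = (x1 + x2) / 2.0, (y1 + y2) / 2.0
    return int(cx // stride), int(cy // stride)


# Two hypothetical boxes with the same center but very different sizes.
cat = (240, 240, 272, 272)      # 32 x 32
elephant = (56, 106, 456, 406)  # 400 x 300

# The size target has only 2 channels per cell, shared by all classes.
size_target = np.zeros((OUT, OUT, 2), dtype=np.float32)
for box in (cat, elephant):
    col, row = center_cell(box)
    width, height = box[2] - box[0], box[3] - box[1]
    size_target[row, col] = (width, height)  # a later box overwrites an earlier one

print(center_cell(cat), center_cell(elephant))
print(size_target[center_cell(cat)[1], center_cell(cat)[0]])
```

Both boxes land in the same cell, so that cell can hold only one width and height pair, which is the lost-box case described in the reply.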