zczcwh / POTTER

The project is an official implementation of our paper "POTTER: Pooling Attention Transformer for Efficient Human Mesh Recovery".

Comparison with FastMetro #2

Closed moonsh closed 1 year ago

moonsh commented 1 year ago

Just curious. I found that the parameter count for FastMETRO in the table in your paper includes the CNN backbone parameters. However, your network's count excludes the detector parameters.

zczcwh commented 1 year ago

In FastMETRO, they use a CNN backbone to extract features and then apply their transformer architecture.

In our POTTER, we use POTTER_cls as the backbone (12M params), then add the HR stream part and the HMR head part. The total parameter count reported in Table 2 is 16M (12M from the backbone and 4M from the rest).
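To make the accounting concrete: the reported total is just the backbone's trainable parameters plus those of everything built on top. A minimal PyTorch sketch of that bookkeeping (the tiny `nn.Linear` modules below are hypothetical stand-ins, not the real POTTER components):

```python
import torch.nn as nn

def count_params(module: nn.Module) -> int:
    """Total number of trainable parameters in a module."""
    return sum(p.numel() for p in module.parameters() if p.requires_grad)

# Hypothetical stand-ins for the real backbone and the HR-stream/HMR head.
backbone = nn.Linear(10, 5)   # 10*5 weights + 5 biases = 55 params
head = nn.Linear(5, 2)        # 5*2 weights + 2 biases = 12 params
model = nn.Sequential(backbone, head)

# The reported total is simply backbone params + params of the rest.
total = count_params(model)
assert total == count_params(backbone) + count_params(head)
print(total)  # 67
```

In the paper's terms, the same sum over the real modules gives 12M (backbone) + 4M (rest) = 16M.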

moonsh commented 1 year ago

I see. What about the detector model? Faster R-CNN?

zczcwh commented 1 year ago

Yes, right now we use Faster R-CNN in our inference demo code.

moonsh commented 1 year ago

I got it. Thank you!