TemugeB / CDRnet

Lightweight Multi-View 3D Pose Estimation through Camera-Disentangled Representation - Tensorflow
MIT License
46 stars 8 forks

why downsample? #4

Open guker opened 1 year ago

guker commented 1 year ago

https://github.com/TemugeB/CDRnet/blob/c130864902083ae2d992d53b56ba15907b23d993/train_and_test.py#L109 Why is resized_mat@pms_v1 used here?

TemugeB commented 1 year ago

I haven't looked at this repository in a while, so I don't remember all the details. But the main reason is that the input image is 256x256 pixels while the predicted heatmaps are 64x64. When you downsample the image, the intrinsic matrix of the camera must be modified to match. n = -2 here actually means a 1/4 rescale. The intrinsic matrix can be modified by just multiplying the projection matrix from the left by the resizing matrix, which is what resized_mat@pms_v1 does. For more details, look here: link
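To make the rescaling concrete, here is a minimal sketch in NumPy rather than the repository's actual code. The camera values, the helper names `downscale_matrix` and `project`, and the assumption that the scale factor is 2**n are all hypothetical; the sketch also uses pure scaling and ignores any half-pixel offset convention.

```python
import numpy as np

def downscale_matrix(n: int) -> np.ndarray:
    """3x3 matrix scaling pixel coordinates by 2**n (assumed: n = -2 gives 1/4)."""
    s = 2.0 ** n
    return np.diag([s, s, 1.0])

def project(P: np.ndarray, X: np.ndarray) -> np.ndarray:
    """Project a 3D point with a 3x4 projection matrix and dehomogenize."""
    x = P @ np.append(X, 1.0)
    return x[:2] / x[2]

# Hypothetical camera for a 256x256 image (illustrative values only).
K = np.array([[500.0, 0.0, 128.0],
              [0.0, 500.0, 128.0],
              [0.0, 0.0, 1.0]])
Rt = np.hstack([np.eye(3), np.array([[0.0], [0.0], [5.0]])])
P = K @ Rt  # full-resolution projection matrix

# Left-multiplying by the resize matrix gives the projection matrix
# for the 64x64 heatmap grid.
P_small = downscale_matrix(-2) @ P

X = np.array([0.5, -0.25, 0.0])  # arbitrary 3D point in world coordinates
uv_full = project(P, X)          # pixel coordinates in the 256x256 image
uv_small = project(P_small, X)   # pixel coordinates in the 64x64 heatmap
```

Because the resize matrix only scales the first two rows of the projection matrix, the homogeneous depth term is unchanged, so the projected coordinates simply shrink by the same factor as the image.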

guker commented 1 year ago

I got it, thanks