Closed BostonLobster closed 3 years ago
Hi @BostonLobster , thanks for your question!
Yes, that is correct, the focal length and the principal point are different. If you check our code (e.g. the arange_pixels function), you will see that we assume the image plane to be in [-1, 1] with the center at 0. The format used by Choy et al. is [0, H-1] x [0, W-1] with the center point at H/2, W/2. (As a side note: we use the Choy et al. renderings only as input for the encoder, so we never need the camera intrinsics / extrinsics for them in our repo.)
@m-niemeyer Thanks for your reply! I understand the difference now. An additional question: how do I convert the format of Choy et al. to yours, where the image plane resides in [-1, 1]? I'm wondering whether I have to modify the camera intrinsics if I use the Choy et al. renderings for both input and supervision.
I would suggest doing either of the following:
S = [ [s, 0, -1],
[0, s, -1],
[0, 0, 1]]
where s = 2 / (H - 1); if H and W are different, you need two different values (one per axis), but this is not the case for Choy et al. (as the images are square).
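To make the scaling matrix concrete, here is a minimal NumPy sketch. It assumes S is left-multiplied onto homogeneous pixel coordinates (or onto a pixel-space intrinsic matrix). The helper name `pixel_to_normalized_matrix`, the image size, and the values in `K_pixel` are illustrative placeholders, not taken from the repo or from the Choy et al. data.

```python
import numpy as np

def pixel_to_normalized_matrix(H, W):
    """Build S, mapping pixel coords in [0, W-1] x [0, H-1] to [-1, 1] x [-1, 1]."""
    sx = 2.0 / (W - 1)  # scale along the x (width) axis
    sy = 2.0 / (H - 1)  # scale along the y (height) axis
    return np.array([[sx, 0.0, -1.0],
                     [0.0, sy, -1.0],
                     [0.0, 0.0, 1.0]])

# Illustrative square image size (assumption, not read from the dataset).
H = W = 137
S = pixel_to_normalized_matrix(H, W)

# The corner pixels land on the corners of the [-1, 1] image plane.
top_left = S @ np.array([0.0, 0.0, 1.0])
bottom_right = S @ np.array([W - 1.0, H - 1.0, 1.0])

# Hypothetical pixel-space intrinsics: focal length f and a principal
# point assumed to sit at the center of the [0, W-1] range.
f = 150.0
K_pixel = np.array([[f, 0.0, (W - 1) / 2.0],
                    [0.0, f, (H - 1) / 2.0],
                    [0.0, 0.0, 1.0]])

# Intrinsics expressed in the normalized [-1, 1] convention.
K_normalized = S @ K_pixel
```

Under these assumptions, the principal point of `K_normalized` is at 0 and its focal length is the pixel focal length scaled by s, which is why the two conventions report different focal lengths for the same camera.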
I'll try your suggestions! Many thanks!!
I downloaded the "ShapeNet for 2.5D supervised models" dataset and found that there are two cameras.npz files: one in the obj_ID folder, another in the img_choy2016 folder. In the paper, you wrote "While we use the renderings from Choy et al. [13] as input, we additionally render 24 images of resolution 256² with depth maps and object masks per object which we use for supervision." So I guess one cameras.npz is for your renderings and the other is for Choy's. But the focal lengths stored in the two cameras.npz files are different: the value in yours does not match the value in Choy's.
I think the focal length should be the same, because you only changed the camera pose for the additional renderings, right?