facebookresearch / ava-256

Train universal codec avatars

about camera parameters #11

Open bbakpil opened 2 months ago

bbakpil commented 2 months ago

Hello, thanks for your wonderful work.

I've got some questions about the dataset.

Which file includes the extrinsic parameters (W2C), and the intrinsic parameters?

I found the head pose folder and the camera_calibration.json file.

(1) For the intrinsics, "K" in camera_calibration.json looks like the intrinsic parameters, but the principal points look different. (2) For the extrinsics, the .txt files in the head pose folder look like them, but there is also a "T" in camera_calibration.json.

Can you tell me which are the extrinsic/intrinsic camera parameters?

Thank you in advance!

una-dinosauria commented 2 months ago

The best thing to do is probably look at the camera parsing code and reverse-engineer that: https://github.com/facebookresearch/ava-256/blob/c0bcc353f681e6eb7796eaa9771a8630ddc7b5bf/utils.py#L142-L175

extrin and intrin will come out of that dictionary.

bbakpil commented 2 months ago

Hello, thanks for your comment.

You mean that the head_pose and camera_calibration.json are not the extrinsic and intrinsic parameters? So if I run that code, I can extract the intrin and extrin? (It looks like it just returns the intrin and extrin dictionary. I'm not sure what "reverse-engineer" means.)

una-dinosauria commented 2 months ago

Oh sorry!

Both are needed for avatar modelling.

Cheers,

bbakpil commented 2 months ago

Thanks for your kind reply!

(1) So the "K"s in camera_calibration.json are the intrinsic params? (2) I've downloaded the 4TB version, so the image resolution is 1024x667. When I checked the intrinsic params "K" in camera_calibration.json, the principal point values are larger than the image resolution. Maybe "K" is based on the 32TB version? Do I need to normalize them to use them at 1024x667? (3) As far as I know, extrinsic parameters are used to transform coordinates from world coordinates to camera coordinates (e.g., from frontal view to camera view). So, where is the head pose used? The values seem like typical extrinsic parameters, so I'm a bit confused.

Thank you again for your kind response. Please reply when you have time!

una-dinosauria commented 2 months ago

(1) So the "K"s in camera_calibration.json are the intrinsic params?

Yes

(2) I've downloaded the 4TB version, so the image resolution is 1024x667. When I checked the intrinsic params "K" in camera_calibration.json, the principal point values are larger than the image resolution. Maybe "K" is based on the 32TB version? Do I need to normalize them to use them at 1024x667?

Yes, the parameters correspond to the original images, which are 4 times larger (we call this the downsample factor). To compensate for this transform, you have to divide the focal length and the principal point by this value. See https://github.com/facebookresearch/ava-256/blob/c0bcc353f681e6eb7796eaa9771a8630ddc7b5bf/data/ava_dataset.py#L241-L242
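As a rough illustration of the rescaling described above, here is a minimal, dependency-free sketch. The function name scale_intrinsics and the example matrix values are hypothetical and are not taken from the dataset or the repository; the linked lines in data/ava_dataset.py remain the reference for what the codebase actually does.

```python
def scale_intrinsics(K, downsample):
    """Rescale a 3x3 intrinsic matrix for images shrunk by `downsample`.

    The first two rows (focal lengths, skew, principal point) are divided
    by the factor; the last row [0, 0, 1] is left unchanged.
    """
    if downsample <= 0:
        raise ValueError("downsample must be positive")
    return [
        [value / downsample for value in K[0]],
        [value / downsample for value in K[1]],
        list(K[2]),
    ]


# Hypothetical full-resolution intrinsics (illustrative values only).
K_full = [
    [8000.0, 0.0, 2048.0],
    [0.0, 8000.0, 1334.0],
    [0.0, 0.0, 1.0],
]

# Downsample factor of 4, as stated in the reply above.
K_small = scale_intrinsics(K_full, 4)
print(K_small)
```

After rescaling, the principal point should fall inside the bounds of the downsampled image, which is a quick sanity check on the factor.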

(3) As far as I know, extrinsic parameters are used to transform coordinates from world coordinates to camera coordinates (e.g., from frontal view to camera view). So, where is the head pose used?

The head pose is used to initialize the pose of the guide mesh (and thus, the volumetric primitives) used for reconstruction in our codebase. You may not need this information, depending on what modelling method you use.
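For readers unsure how the extrinsics and intrinsics fit together, the world-to-camera convention mentioned in the question can be sketched as a plain pinhole projection. This is only an assumption-laden sketch: it assumes the extrinsic matrix is a 3x4 [R | t] world-to-camera transform and ignores lens distortion, and the function name world_to_pixel and all numeric values are hypothetical. The actual layout of the extrin and intrin entries should be checked against the parsing code in utils.py.

```python
def world_to_pixel(point_world, extrin, K):
    """Project a 3D world point to pixel coordinates with a pinhole model.

    extrin: 3x4 matrix [R | t], ASSUMED to map world to camera coordinates.
    K:      3x3 intrinsic matrix, already rescaled to the image resolution.
    """
    # Camera-space point: x_cam = R @ x_world + t
    x_cam = [
        sum(extrin[row][col] * point_world[col] for col in range(3)) + extrin[row][3]
        for row in range(3)
    ]
    if x_cam[2] <= 0:
        raise ValueError("point is behind the camera")

    # Perspective divide, then apply focal lengths, skew and principal point.
    x_norm = x_cam[0] / x_cam[2]
    y_norm = x_cam[1] / x_cam[2]
    u = K[0][0] * x_norm + K[0][1] * y_norm + K[0][2]
    v = K[1][1] * y_norm + K[1][2]
    return u, v


# Hypothetical camera: identity rotation, 2 units in front of the origin.
extrin = [
    [1.0, 0.0, 0.0, 0.0],
    [0.0, 1.0, 0.0, 0.0],
    [0.0, 0.0, 1.0, 2.0],
]
K = [
    [2000.0, 0.0, 512.0],
    [0.0, 2000.0, 333.5],
    [0.0, 0.0, 1.0],
]
print(world_to_pixel([0.1, -0.2, 0.0], extrin, K))
```

The head pose does not appear in this projection; per the reply above, it is used separately to place the guide mesh.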

bbakpil commented 2 months ago

That really helps. Thank you so much!

walsvid commented 2 months ago

Hi @una-dinosauria. Thanks for your great work. I'm curious about camera distortion. The dataset includes camera parameters with distortion in camera_calibration.json, but it seems that these parameters are not being used when the dataset is read in ava-256/data/ava_dataset.py. I wonder if the images have already been undistorted.

una-dinosauria commented 2 months ago

@walsvid You are right, the images have already been undistorted beforehand.

walsvid commented 2 months ago

@walsvid You are right, the images have already been undistorted beforehand.

Thank you, @una-dinosauria, for your reply and clarification.