POSTECH-CVLab / SCNeRF

[ICCV21] Self-Calibrating Neural Radiance Fields
MIT License

How to get more detail about the camera pose? #4

Closed xufengfan96 closed 2 years ago

xufengfan96 commented 2 years ago

Hello author,

Thank you for sharing your code. I want to capture the corresponding camera poses by taking images in a circle; for example, I take a picture every 10 degrees. After I train the network, I find that the results in logs are some images and some .tar files. Can I get information about the camera pose, such as a 4x4 matrix?

Looking forward to your reply.

jeongyw12382 commented 2 years ago

Thanks for taking an interest in our work.

The estimated camera poses are stored in the ".tar" file, which is a checkpoint file in the standard torch format.

If you take a look at the key "camera_model", you will find all the estimated camera parameters. To be precise, you can extract them with the script below.

torch.load([path to tar file])["camera_model"]

I would guess you might have difficulty loading the estimated camera poses since they are divided into two parameters: "extrinsics_noise" and "extrinsics_init". As described in our main paper, we use a different camera pose parameterization, the 6D rotation representation. Thus, I recommend referring to the code here.

To extract them simply, here is the easiest way to check the camera poses.

import torch
pretrained = torch.load([path to pretrained])
camera_model = pretrained["camera_model"]
rot_6D_trans_3D = camera_model["extrinsics_initial"] + camera_model["extrinsics_noise"] * [your noise scale]
rotations = ortho2rotation(rot_6D_trans_3D[:, :6])  # rotation matrices
translations = rot_6D_trans_3D[:, -3:]              # translation vectors [edited]

[path to pretrained] is the path to the .tar file you are planning to open; [your noise scale] is the extrinsic noise scale you used during training.
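For reference, the 6D rotation representation mentioned above can be turned into a 3x3 rotation matrix via Gram-Schmidt orthogonalization, following Zhou et al., "On the Continuity of Rotation Representations in Neural Networks". Below is a minimal sketch, assuming a row-wise stacking convention; the repo's own `ortho2rotation` may order the basis vectors differently:

```python
import torch


def rot6d_to_matrix(d6: torch.Tensor) -> torch.Tensor:
    """Convert a batch of 6D rotation parameters (B, 6) into rotation
    matrices (B, 3, 3) via Gram-Schmidt orthogonalization."""
    a1, a2 = d6[:, :3], d6[:, 3:6]
    b1 = torch.nn.functional.normalize(a1, dim=-1)
    # Remove the component of a2 along b1, then normalize.
    b2 = torch.nn.functional.normalize(
        a2 - (b1 * a2).sum(-1, keepdim=True) * b1, dim=-1
    )
    # Third basis vector completes the right-handed frame.
    b3 = torch.linalg.cross(b1, b2)
    return torch.stack((b1, b2, b3), dim=-2)


# Example: the 6D encoding of the identity rotation.
d6 = torch.tensor([[1.0, 0.0, 0.0, 0.0, 1.0, 0.0]])
R = rot6d_to_matrix(d6)
```

Any (non-degenerate) 6D vector maps to a valid rotation this way, which is why the representation is convenient for gradient-based pose optimization.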

If this reply is not sufficient, let me know what problem you are running into.

Again, thanks for taking an interest in our work.

jeongyw12382 commented 2 years ago

If you have any difficulty extracting the camera poses, please reopen the issue and I will provide a more detailed description.

xufengfan96 commented 2 years ago

Thank you very much for your reply. It helps me a lot.

xufengfan96 commented 2 years ago

Hello author,

In this way, I can get the 4x4 matrices for the camera poses. I notice in your code that the translation vectors consist of the first three numbers of rot_6D_trans_3D, but then the translation vector overlaps with part of the rotation representation. So is that a mistake? Maybe the translation vector should be the last three numbers of rot_6D_trans_3D. (The shape of rot_6D_trans_3D is batch x 9.)

Another question: I notice in your paper that you initialize the rotation matrices, the translation vectors, and the focal lengths to the identity matrix, the zero vector, and the height and width of the captured images, respectively. Your code also has a file named load_blender.py. If I want to estimate the camera poses of my own images captured with an RGB-D camera, can I choose the 'blender' dataset_type? (I think the Blender dataset is closer to RGB-D camera capture.)

jeongyw12382 commented 2 years ago

Oh, sorry for the late reply. As you pointed out, the last three elements in the ckpt file are the translation vectors. Sorry for the mistake above; I've just fixed it.
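With the rotation and translation separated correctly, packing them into the usual 4x4 homogeneous camera pose is straightforward. A minimal sketch (the helper name here is illustrative, not from the repo):

```python
import torch


def make_pose_4x4(R: torch.Tensor, t: torch.Tensor) -> torch.Tensor:
    """Pack a (3, 3) rotation and a (3,) translation into a (4, 4)
    homogeneous pose matrix [[R, t], [0, 0, 0, 1]]."""
    pose = torch.eye(4, dtype=R.dtype)
    pose[:3, :3] = R
    pose[:3, 3] = t
    return pose


# Example: identity rotation with a simple translation.
pose = make_pose_4x4(torch.eye(3), torch.tensor([1.0, 2.0, 3.0]))
```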

The reason we include the 'blender' dataset is that the very first version of our paper included noise-injection ablation studies. For clearer paper writing, we removed those experiments; in them, we added synthetic noise to the camera parameters to verify that our model recovers from it. You can still find a few traces of those experiments in this code. If you want to use the scenes in the Blender dataset, you can, but I cannot guarantee that this code path still works perfectly.

xufengfan96 commented 2 years ago

I got it. Another question: I find the data includes a poses_bounds.npy and a simplices.npy. I read the files; they contain matrices (the shapes are n x 17 for poses_bounds and n x 3 for simplices). I used COLMAP to process my data and got the results in the 'sparse' folder and database.db, but I don't know the meaning of the two .npy files or how to generate them. In addition, I have sent you an email about this question; if you answer here, please ignore the email.

Finally, thank you for your reply.

jeongyw12382 commented 2 years ago

For the file named "poses_bounds.npy", I strongly recommend referring to the code below, which performs the process of extracting bounded poses. It might help you.

https://github.com/weiyithu/NerfingMVS/blob/8c8f96244146b929a1495caf2719c090b48ac082/utils/pose_utils.py#L72
https://github.com/Fyusion/LLFF/blob/master/llff/poses/pose_utils.py
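As background, poses_bounds.npy follows the LLFF convention: each row is 17 numbers, a flattened 3x5 matrix (a 3x4 camera-to-world pose with an appended [height, width, focal] column) followed by the near and far depth bounds of that view. A minimal parsing sketch (the function name is illustrative, not from either repo):

```python
import numpy as np


def parse_poses_bounds(arr: np.ndarray):
    """Split an (N, 17) poses_bounds array (LLFF convention) into
    (N, 3, 4) camera-to-world poses, (N, 3) [H, W, focal] intrinsics,
    and (N, 2) near/far depth bounds."""
    poses_hwf = arr[:, :15].reshape(-1, 3, 5)
    poses = poses_hwf[:, :, :4]    # 3x4 camera-to-world matrices
    hwf = poses_hwf[:, :, 4]       # height, width, focal per view
    bounds = arr[:, 15:17]         # near/far depth bounds per view
    return poses, hwf, bounds


# Synthetic one-view example; in practice use np.load("poses_bounds.npy").
arr = np.arange(17, dtype=np.float64)[None, :]
poses, hwf, bounds = parse_poses_bounds(arr)
```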

I'm not really sure what you are targeting, so I'll assume you are trying to render your custom scenes with our code. Based on my understanding, our code does not use simplices. Could you check whether the file "simplices.npy" is actually necessary for your pipeline?

jeongyw12382 commented 2 years ago

The article here (https://github-wiki-see.page/m/pelednoam/scatter_3d_to_surface/wiki/Preprocessing) could help you understand what "simplices.npy" is. I hope this link helps.

xufengfan96 commented 2 years ago

Thank you for your reply. It works now. I'm looking forward to the final result.

jeongyw12382 commented 2 years ago

Glad to hear the good news! :+1: