liuyuan-pal / NeRO

[SIGGRAPH2023] NeRO: Neural Geometry and BRDF Reconstruction of Reflective Objects from Multiview Images

NeRF_blender datasets #17

Closed · Riga27527 closed this issue 1 year ago

Riga27527 commented 1 year ago

Thank you for your impressive work! Have you tested on NeRF-style synthetic datasets, for example the ShinyBlender dataset from Ref-NeRF? By the way, I'm also curious how your work performs on diffuse-like objects. Any clues would be greatly appreciated!

liuyuan-pal commented 1 year ago

Hi, I haven't tested it on the NeRF synthetic dataset, but I have tried it on the ShinyBlender dataset. It works quite well on watertight objects, whether diffuse-like or reflective, but has difficulty with non-watertight objects, because NeRO is based on an SDF and an SDF struggles to reconstruct objects without closed shapes. As you can see, it is able to reconstruct non-reflective surfaces and their BRDFs, like the board in the kettle example in https://github.com/liuyuan-pal/NeRO/blob/main/assets/custom-result.png.


Riga27527 commented 1 year ago

Thank you for your wonderful reply! I still have some small implementation questions. Since NeRF_blender's near and far are 2 and 6 respectively, how much do I need to adjust 'self.scale_factor' to load these datasets? In addition, should the condition used for 'inner_mask' be changed in functions such as 'compute_occ_loss'?

liuyuan-pal commented 1 year ago

Hi, we only need to ensure that the object lies inside the unit sphere centered at the origin. For the ShinyBlender dataset, I use a scale factor of 0.7 for the helmet and the toaster. You can determine this factor from the rendered depth maps.
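Concretely, one way to estimate that factor (a minimal sketch, not NeRO's code): back-project the foreground pixels of each depth map to world coordinates with the corresponding K and world-to-camera [R|t], take the maximum distance from the origin over all views, and pick a scale factor a bit below the reciprocal of that distance. The iterate_views loader below is a hypothetical placeholder for however the depth maps and cameras are read.

import numpy as np

def unproject_depth_to_world(depth, K, pose_w2c):
    """Back-project a depth map to world-space points.

    depth: (H, W) depth along the camera z-axis (0 = invalid/background).
    K: (3, 3) intrinsics; pose_w2c: (3, 4) [R|t] with x_cam = R @ x_world + t.
    """
    v, u = np.nonzero(depth > 0)                  # valid foreground pixels
    z = depth[v, u]
    x = (u - K[0, 2]) / K[0, 0] * z               # camera-space coordinates
    y = (v - K[1, 2]) / K[1, 1] * z
    pts_cam = np.stack([x, y, z], axis=-1)
    R, t = pose_w2c[:, :3], pose_w2c[:, 3]
    return (pts_cam - t) @ R                      # x_world = R^T (x_cam - t)

max_dist = 0.0
for depth, K, pose in iterate_views():            # hypothetical loader over all views
    pts_world = unproject_depth_to_world(depth, K, pose)
    if len(pts_world) > 0:
        max_dist = max(max_dist, np.linalg.norm(pts_world, axis=-1).max())
scale_factor = 0.9 / max_dist                     # margin so the scaled object stays inside the unit sphere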

xdobetter commented 1 year ago

Thanks for your great work. Could you give me some advice on how to train on custom synthetic data (e.g., the NeRF synthetic data or the ShinyBlender dataset)?

Riga27527 commented 1 year ago

You only need to replace the corresponding camera intrinsic matrix (i.e., the function get_K) and pose matrix (i.e., the function get_pose) with the parameters of the NeRF synthetic dataset (you can refer to NeRF's code), and adjust the scale_factor (e.g., 0.5) so that the scene lies within the unit sphere.
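For concreteness, a hedged sketch of what such replacements could look like for the Blender/NeRF JSON format. Only the get_K / get_pose names come from the NeRO repository; the class, constructor arguments, and file handling below are my own assumptions, not the authors' code.

import json, os
import numpy as np

class BlenderDatabase:
    def __init__(self, root, split='train', scale_factor=0.5, img_wh=(800, 800)):
        self.scale_factor = scale_factor
        self.w, self.h = img_wh
        with open(os.path.join(root, f'transforms_{split}.json')) as f:
            meta = json.load(f)
        # standard NeRF-Blender focal length from the horizontal field of view
        self.fx = 0.5 * self.w / np.tan(0.5 * meta['camera_angle_x'])
        self.frames = meta['frames']

    def get_K(self, img_id):
        # pinhole intrinsics; principal point assumed at the image center
        return np.asarray([[self.fx, 0, self.w / 2],
                           [0, self.fx, self.h / 2],
                           [0, 0, 1]], np.float32)

    def get_pose(self, img_id):
        # Blender/NeRF stores an OpenGL camera-to-world matrix;
        # flip the y/z axes to OpenCV, invert to world-to-camera, then scale the scene
        c2w = np.asarray(self.frames[int(img_id)]['transform_matrix'], np.float32)
        c2w = c2w @ np.diag([1, -1, -1, 1]).astype(np.float32)   # OpenGL -> OpenCV
        w2c = np.linalg.inv(c2w)
        R, t = w2c[:3, :3], w2c[:3, 3] * self.scale_factor       # shrink into the unit sphere
        return np.concatenate([R, t[:, None]], axis=1)           # 3x4 [R|t]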

g3956 commented 1 year ago

Hi Team,

Thanks for sharing the wonderful work and amazing codebase! I tried to load the ShinyBlender dataset following NeRF's code, replacing the corresponding camera intrinsic matrix and camera pose and adjusting the scale_factor to 0.5 so the object stays within the unit sphere, as follows:

[screenshot of the modified data-loading code]

but the result is like this:

[screenshot of the incorrect rendering result]

It seems that the pose is not correct. Any comments would be greatly appreciated!

Thanks

liuyuan-pal commented 1 year ago

Hi, we adopt the world-to-camera pose here: pose = [R; t] with x_cam = R @ x_world + t. Note that the pose inversion means we convert the provided camera-to-world pose into the world-to-camera pose.

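In code form, a minimal illustration of that relation (not the repository's loader):

import numpy as np

# an example camera-to-world matrix as provided by a Blender-style dataset
c2w = np.eye(4, dtype=np.float32)
c2w[:3, 3] = [0.0, 0.0, 4.0]          # camera placed 4 units from the origin

w2c = np.linalg.inv(c2w)              # the world-to-camera pose NeRO expects
R, t = w2c[:3, :3], w2c[:3, 3]

x_world = np.array([0.1, -0.2, 0.3], np.float32)
x_cam = R @ x_world + t               # x_cam = R @ x_world + t, as stated above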

g3956 commented 1 year ago

Thanks for your help. It works now.

cmh1027 commented 9 months ago

@liuyuan-pal Does the camera data in the Glossy dataset contain a w2c matrix or a c2w matrix? I'm struggling to reformat the Glossy dataset into a Blender-like format.

from glob import glob
import os
import numpy as np
import json
import pickle

def compute_fovx(K):
    fx = K[0, 0]   # focal length in pixels (x-direction)
    cx = K[0, 2]   # principal point; assume it sits at the image center
    width = 2 * cx  # so the image width is twice the principal point offset
    fovx = 2 * np.arctan(width / (2 * fx))  # horizontal field of view in radians
    return fovx

def convert_pose_to_extrinsic(w2c):
    c2w = np.linalg.inv(np.array(w2c))
    c2w = c2w @ np.diag([1,-1,-1,1]).astype(np.float32) # OpenCV -> OpenGL
    return c2w

for scene in sorted(glob("*")):
    if not os.path.isdir(scene): continue
    meta_train = {"camera_angle_x": None, "frames": []}
    meta_test = {"camera_angle_x": None, "frames": []}
    for idx, camera_path in enumerate(sorted(glob(os.path.join(scene, "*-camera.pkl")))):
        camera = pickle.load(open(camera_path, "rb")) # Extrinsic (w2c) OpenCV
        w2c, K = camera
        if idx == 0:
            fovx = compute_fovx(K)
            meta_train["camera_angle_x"] = fovx
            meta_test["camera_angle_x"] = fovx
        meta = meta_train if idx != 12 else meta_test

        image_path = "./" + os.path.basename(camera_path).replace("-camera.pkl", "-blend")
        w2c = w2c.tolist() + [[0., 0., 0., 1.]]
        meta['frames'].append(
            {
                "file_path": image_path,
                "transform_matrix": convert_pose_to_extrinsic(w2c).tolist() # Camera pose (c2w)
            },
        )
    json.dump(meta_train, open(os.path.join(scene, "transforms_train.json"), "w"))
    json.dump(meta_test, open(os.path.join(scene, "transforms_val.json"), "w"))
    json.dump(meta_test, open(os.path.join(scene, "transforms_test.json"), "w"))

This is my code, but when I train on the Blender-formatted Glossy dataset using the nerf-syn branch, the quality isn't good (I added an alpha mask extracted from the depth map to each image).
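One quick check worth trying (a sketch, not a confirmed fix): verify the convention of the pickled matrices by projecting the world origin through them, assuming the object is roughly centered at the origin as NeRO expects. If the pkl really stores an OpenCV w2c, the origin should project with positive depth near the principal point in every view. The scene folder name below is a hypothetical example.

from glob import glob
import os
import pickle
import numpy as np

scene = "bell"                                        # hypothetical scene folder; any Glossy scene works
for camera_path in sorted(glob(os.path.join(scene, "*-camera.pkl")))[:3]:
    w2c, K = pickle.load(open(camera_path, "rb"))
    w2c, K = np.asarray(w2c), np.asarray(K)
    R, t = w2c[:3, :3], w2c[:3, 3]
    x_cam = R @ np.zeros(3) + t                       # world origin in camera coordinates
    uv = K @ x_cam
    # for a genuine OpenCV w2c, depth is positive and the pixel lands near
    # (K[0,2], K[1,2]) in every view; a c2w matrix would need inverting first
    print(camera_path, "depth:", x_cam[2], "pixel:", (uv[:2] / uv[2]).round(1))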