Kai-46 / IRON

Inverse rendering by optimizing neural SDF and materials from photometric images
BSD 2-Clause "Simplified" License

How to train/test on my own dataset #15

Open · changfali opened this issue 2 years ago

changfali commented 2 years ago

Hi, I want to try IRON on my own datasets, but I don't know how to get cam_dict_norm.json. Could you share that part of the code with me? Thanks!
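For context, cam_dict_norm.json maps each image name to its camera parameters. A sketch of the shape of one entry (the "W2C" field is confirmed by the normalization script later in this thread; "K" and "img_size" follow the nerf++ camera-dict convention and are an assumption here, with illustrative values):

```python
# illustrative entry of cam_dict_norm.json; "K"/"img_size" are assumed from
# the nerf++ camera dict convention, "W2C" matches the script further below
cam_dict = {
    "0000.png": {
        "K": [...],              # flattened 4x4 intrinsics matrix
        "W2C": [...],            # flattened 4x4 world-to-camera matrix
        "img_size": [640, 480],  # [width, height] in pixels
    },
}
```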

changfali commented 2 years ago

Hi, I used colmap to get cam_dict_norm.json. The results in exp_iron_stage1/ours/validations_fine/ are good (after 30000 iters): [screenshot]

but the results in exp_iron_stage1/ours/normals are not good (after 30000 iters): [screenshot]

Could it be because I didn't do anything to satisfy the assumption in the readme: "Note we also assume the objects are inside the unit sphere."?

changfali commented 2 years ago

Hi Kai, I used the "run_colmap.py" script from Nerf++ to produce cam_dict_norm.json, but the result is bad even for your own dataset. I ran "run_colmap.py" on all images (train and test) of "Xmen", then ran "camera_visualizer/visualize_cameras.py" from Nerf++ to visualize the cameras. The first picture shows the cameras recovered by "run_colmap.py" (from xmen/posed_images/kai_cameras_normalized.json), and the second shows the cameras in the original "Xmen" dataset: [screenshots] You can see the difference. After that I split the images in "xmen/mvs/image" (the output of "run_colmap.py") into train and test myself and used them to train IRON. But after 15000 iters the normal.png result was still the same as at the beginning: [screenshot] and the exported mesh has 0 vertices and 0 faces: [screenshot] while with the original data the result is already good after 2500 iters: [screenshot]

Do you know which part could be the cause? This problem has troubled me for a long time, and any help would be a GREAT HELP for me!! THANKS!! @Kai-46

Kai-46 commented 2 years ago

Hi @changfali , I'm sorry that you have spent a lot of time on this; camera conventions are always a pain when you work in this domain. It takes time to learn.

To help you, I'd like to ask a few questions in order to make sure we are on the same page:

1) Did you happen to notice that colmap SfM outputs camera parameters as well as a sparse point cloud (https://github.com/Kai-46/nerfplusplus/blob/ebf2f3e75fd6c5dfc8c9d0b533800daaf17bd95f/colmap_runner/extract_sfm.py#L107), and that you can visualize the point cloud together with the cameras?

2) Did you happen to understand how camera normalization works (together with point cloud normalization, and why the two need to be done together)?

3) Did you happen to understand the technique of debugging poses with epipolar geometry visualization?

Totally fine if you haven't thought about these questions, but knowing where you stand would help me answer based on your familiarity with this area.

Best, Kai
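On question 3, the check behind that technique is cheap to script: matched pixels x1 and x2 in two overlapping images must satisfy the epipolar constraint x2^T F x1 = 0, where F is built from the two poses and intrinsics; consistently large residuals over many matches mean the poses (or the camera convention) are wrong. A minimal sketch, assuming 3x3 intrinsics K1/K2 and 4x4 world-to-camera matrices W2C1/W2C2 in the convention used in this thread, with the pixel matches supplied by you:

```python
import numpy as np

def skew(t):
    # cross-product matrix such that skew(t) @ v == np.cross(t, v)
    return np.array([[0.0, -t[2], t[1]],
                     [t[2], 0.0, -t[0]],
                     [-t[1], t[0], 0.0]])

def fundamental_matrix(K1, W2C1, K2, W2C2):
    # relative pose taking camera-1 coordinates to camera-2 coordinates
    R1, t1 = W2C1[:3, :3], W2C1[:3, 3]
    R2, t2 = W2C2[:3, :3], W2C2[:3, 3]
    R = R2 @ R1.T
    t = t2 - R @ t1
    E = skew(t) @ R                                    # essential matrix
    return np.linalg.inv(K2).T @ E @ np.linalg.inv(K1)

def epipolar_residual(F, x1, x2):
    # algebraic residual |x2^T F x1| for a pixel match (x1 in image 1,
    # x2 in image 2); near zero when the two poses are consistent with
    # the correspondence
    x1h = np.array([x1[0], x1[1], 1.0])
    x2h = np.array([x2[0], x2[1], 1.0])
    return abs(x2h @ F @ x1h)
```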

Michaelwhite34 commented 2 years ago

Also struggling to get cam_dict_norm.json

changfali commented 2 years ago

Hi @Kai-46, thanks for the reply! I know that colmap SfM outputs camera parameters as well as a sparse point cloud, and that we can visualize the point cloud together with the cameras, but this link (https://github.com/Kai-46/nerfplusplus/blob/ebf2f3e75fd6c5dfc8c9d0b533800daaf17bd95f/colmap_runner/extract_sfm.py#L107) is not for the point cloud but for the triangle mesh. So I visualized the cameras and the sparse point cloud without normalization. Right view: [screenshot] Top view: [screenshot] For the cameras, the blue ones are the output of colmap without normalization (./xmen/posed_images/kai_cameras.json) and the green ones are from the dataset you provided, with normalization. The sparse point cloud is from ./xmen/posed_images/kai_points.ply.

And here is the inspect_epipolar_geometry.py result: [screenshot]

Is this a bug in colmap or in the script "run_colmap.py"?

I see that you shared the complete output of the scripts on an example dataset here: https://github.com/Kai-46/nerfplusplus/issues/16, but that link no longer works. Could you please share the complete scripts for producing cam_dict_norm.json from our own data?

Thanks a lot!

best, changfa

Kai-46 commented 2 years ago

Hi, I can try to share a script later this week. But if you'd like to figure things out yourself, here are the steps, starting from the unnormalized cameras and sparse point cloud from colmap: 1) remove the extreme outliers in the sparse point cloud using meshlab; 2) compute the oriented bounding box of the sparse point cloud using open3d; 3) compute a translation vector and a scale scalar that move the center of the bounding box to the origin and shrink the diagonal of the bounding box to something smaller than 2, say 1.75; 4) apply the translation vector and scale scalar to normalize both the cameras and the point cloud. After this normalization, the point cloud (and hence the object) will be inside the unit sphere.
changfali commented 2 years ago

Hi, here are some other results: the cameras (xmen/posed_images/kai_cameras_normalized.json) and the colmap MVS mesh (xmen/mvs/meshed_trim_3.ply, normalized using https://github.com/Kai-46/nerfplusplus/blob/ebf2f3e75fd6c5dfc8c9d0b533800daaf17bd95f/colmap_runner/extract_sfm.py#L107): [screenshots] You can see that the face part is not right, and that the cameras are all in front of the scene.

Here are the inspect_epipolar_geometry.py results: [screenshots]

The version of my colmap is 3.8, built from source following https://colmap.github.io/install.html on Ubuntu 20.04.

changfali commented 2 years ago
> Hi, I can try to share a script later this week. But if you'd like to figure things out yourself, here are the steps [...]

Hi @Kai-46, did you finish the script? I have no idea which step of the camera conventions could be going wrong.

Kai-46 commented 2 years ago

Here it is. I hope the script is self-explanatory. Please keep in mind that the normalization must be applied to the point cloud and the cameras at the same time (it might be worth thinking about why that is from your side); and when you inspect the epipolar geometry, pick pairs of images that overlap in content rather than a front image and a back image (it might be worth thinking about the logic there too).

```python
import numpy as np
import json
import copy
import open3d as o3d

def normalize_cam_dict(
    in_cam_dict_file, out_cam_dict_file, target_radius, in_geometry_file, out_geometry_file
):
    # estimate a translate and scale that centers the objects from the sparse point cloud
    #! note if your sparse point cloud contains extreme outliers, you should remove the outliers in meshlab first,
    #!    otherwise, the estimated bounding box is going to be too big.
    pcd = o3d.io.read_point_cloud(in_geometry_file)
    box = pcd.get_oriented_bounding_box()
    box_corners = np.asarray(box.get_box_points())  # [8, 3]
    box_center = np.mean(box_corners, axis=0, keepdims=True)  # [1, 3]
    dist = np.linalg.norm(box_corners - box_center, axis=1, keepdims=True)  # [8, 1]
    diagonal = np.max(dist) * 2.
    translate = -box_center.reshape((3, 1))
    scale = target_radius / (diagonal / 2.)

    # apply translate and scale to the sparse point cloud
    tf_translate = np.eye(4)
    tf_translate[:3, 3:4] = translate
    tf_scale = np.eye(4)
    tf_scale[:3, :3] *= scale
    tf = np.matmul(tf_scale, tf_translate)

    pcd_norm = pcd.transform(tf)
    o3d.io.write_point_cloud(out_geometry_file, pcd_norm)

    # apply translate and scale to the cameras
    with open(in_cam_dict_file) as fp:
        in_cam_dict = json.load(fp)

    def transform_pose(W2C, translate, scale):
        C2W = np.linalg.inv(W2C)
        cam_center = C2W[:3, 3]
        cam_center = (cam_center + translate) * scale
        C2W[:3, 3] = cam_center
        return np.linalg.inv(C2W)

    out_cam_dict = copy.deepcopy(in_cam_dict)
    for img_name in out_cam_dict:
        W2C = np.array(out_cam_dict[img_name]["W2C"]).reshape((4, 4))
        W2C = transform_pose(W2C, translate, scale)
        assert np.isclose(np.linalg.det(W2C[:3, :3]), 1.0)
        out_cam_dict[img_name]["W2C"] = list(W2C.flatten())

    with open(out_cam_dict_file, "w") as fp:
        json.dump(out_cam_dict, fp, indent=2, sort_keys=True)

if __name__ == "__main__":
    in_cam_dict_file = ""
    out_cam_dict_file = ""
    in_geometry_file = ""
    out_geometry_file = ""
    normalize_cam_dict(in_cam_dict_file, out_cam_dict_file, target_radius=0.8, in_geometry_file=in_geometry_file, out_geometry_file=out_geometry_file)
```
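Why must the normalization be applied to the cameras and the point cloud together? A similarity transform of the world, x' = s(x + t), preserves every image projection only if each camera center is moved by the same transform while its rotation stays fixed. A self-contained numerical check of that claim (a sketch with synthetic data, separate from the script above):

```python
import numpy as np

rng = np.random.default_rng(0)
R, _ = np.linalg.qr(rng.normal(size=(3, 3)))  # orthonormal camera rotation
if np.linalg.det(R) < 0:
    R[:, 0] *= -1.0                           # ensure det(R) = +1
c = np.array([0.5, -1.0, 2.0])                # camera center in world coords
K = np.array([[500.0, 0.0, 320.0],
              [0.0, 500.0, 240.0],
              [0.0, 0.0, 1.0]])               # pinhole intrinsics

def project(K, R, c, x):
    # pixel coordinates of world point x for a camera with W2C = [R | -Rc]
    pix = K @ (R @ (x - c))
    return pix[:2] / pix[2]

x = np.array([0.2, 0.1, 8.0])                 # a test world point
translate, scale = np.array([1.0, 2.0, -0.5]), 0.3

x_n = scale * (x + translate)                 # normalize the point ...
c_n = scale * (c + translate)                 # ... and the camera center jointly

print(project(K, R, c, x))                    # same pixel before ...
print(project(K, R, c_n, x_n))                # ... and after normalization
```

Normalizing only one of the two would change R @ (x - c) by more than a scalar factor and move every reprojection.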
changfali commented 2 years ago

Hi, thanks for the script!! In my understanding, the normalization puts both the cameras and the point cloud inside a unit sphere (is that right?). This is the result of the normalization: [screenshots] But as you can see, the positions of the cameras are not right: all the cameras are in front of the object. That is why I picked a front image and a back image for inspecting the epipolar geometry: images with no overlap in content should not share keypoints, but in my case they do. Is this a mismatch from SIFT? Can you test the Xmen data on your side? Again, the version of my colmap is 3.8, built from source following https://colmap.github.io/install.html on Ubuntu 20.04.

Kai-46 commented 2 years ago

Please allow me to ask a simple question: how did you manage to reconstruct the back of X-men (as shown in your point cloud) if all your cameras are in the front?

Btw, the script I shared only normalizes the point cloud such that the object is inside the unit sphere; there is no guarantee that the normalized cameras are also inside the unit sphere.
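A quick way to check this on your own data (a sketch, assuming the normalized JSON written by the script above, which stores each camera's flattened 4x4 W2C matrix): print every camera center's distance from the origin. The centers can legitimately lie outside the unit sphere even though the points are inside it.

```python
import json
import numpy as np

with open("cam_dict_norm.json") as fp:      # output of normalize_cam_dict
    cam_dict = json.load(fp)

for img_name, cam in cam_dict.items():
    W2C = np.array(cam["W2C"]).reshape((4, 4))
    center = np.linalg.inv(W2C)[:3, 3]      # camera center in world coordinates
    print(img_name, np.linalg.norm(center))
```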

Bingrong89 commented 2 years ago

[screenshot] Ignore the white camera cones; I am too lazy to delete them from the code, so I set them to white. This is what I got from running the script Kai provided. 1) Delete the points that don't 'belong' to the model; the normalization calculation is based on the bounding box formed by the entire point cloud's furthest points. 2) I set the scale numerator to 0.5 to fit the point cloud inside the sphere.

Do the results look reasonable?

Kai-46 commented 2 years ago

This one looks more reasonable!

Btw, one side tip in case you don't know: you can visualize the bounding box in meshlab easily through "Render -> Show box corners". Showing the box also makes it easier to manually remove the extreme outliers in meshlab.
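If you prefer a programmatic first pass over manual cropping, open3d's statistical outlier removal can do a rough cleanup before computing the bounding box (a sketch, offered as a substitute for the meshlab step; nb_neighbors and std_ratio will need tuning per scene):

```python
import open3d as o3d

pcd = o3d.io.read_point_cloud("kai_points.ply")
# drop points whose mean distance to their 20 nearest neighbors is more than
# 2 standard deviations above the average over the whole cloud
pcd_clean, kept_indices = pcd.remove_statistical_outlier(nb_neighbors=20, std_ratio=2.0)
o3d.io.write_point_cloud("kai_points_clean.ply", pcd_clean)
```

Extreme, isolated outliers may still survive this filter, so it pays to eyeball the result in meshlab anyway.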

Bingrong89 commented 2 years ago

Glad to hear that! So maybe changfali can get the results he's looking for if he starts by cropping away all the noisy points in his point cloud.

Michaelwhite34 commented 2 years ago

> Glad to hear that! So maybe changfali can get the results he's looking for if he starts by cropping away all the noisy points in his point cloud.

Please allow me to ask a dumb question: if I run the colmap gui myself and get the cameras, images, and points3D files in both bin and txt format, what should be the input for in_cam_dict_file = "" in the script?

Bingrong89 commented 2 years ago

> Please allow me to ask a dumb question: if I run the colmap gui myself and get the cameras, images, and points3D files in both bin and txt format, what should be the input for in_cam_dict_file = "" in the script?

You have to transform it into a json file with the coordinate convention they are using. That can be done using part of the function extract_all_to_dir from https://github.com/Kai-46/nerfplusplus/blob/master/colmap_runner/extract_sfm.py; it seems to be lines 97 to 99. Just a reminder, this gives unnormalized values. I hope this helps~
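Putting that together, a minimal conversion sketch (assuming it is run next to nerfplusplus/colmap_runner so that extract_sfm's read_model and parse_camera_dict helpers are importable; the paths are placeholders):

```python
import json

# helpers from nerfplusplus/colmap_runner/extract_sfm.py
from extract_sfm import read_model, parse_camera_dict

sparse_dir = "./sparse/0"                 # colmap model: cameras/images/points3D
camera_dict_file = "./kai_cameras.json"   # output: unnormalized camera dict

# read the colmap model (use ".txt" instead if you exported the text format)
cameras, images, points3D = read_model(sparse_dir, ".bin")
camera_dict = parse_camera_dict(cameras, images)

with open(camera_dict_file, "w") as fp:
    json.dump(camera_dict, fp, indent=2, sort_keys=True)
```

The result still needs to be run through normalize_cam_dict above, together with the sparse point cloud.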

Michaelwhite34 commented 2 years ago

> You have to transform it into a json file with the coordinate convention they are using. [...]

I tried to run this:

```python
sparse_dir = 'E:\deep_learning_stuff\New_Folder\ngp5\instant-ngp\colmap_sparse\0'
cameras, images, points3D = read_model(sparse_dir, ext)
camera_dict = parse_camera_dict(cameras, images)
with open(camera_dict_file, 'w') as fp:
    json.dump(camera_dict, fp, indent=2, sort_keys=True)
```

Sorry, I am not a code guy; can you write a complete script with input and output? Thanks - -

Michaelwhite34 commented 2 years ago

> Here it is. I hope the script is self-explanatory. Please keep in mind that the normalization must be applied to the point cloud and cameras at the same time [...]

> [the normalize_cam_dict script was quoted here in full; the version quoted at the time computed diagonal = np.max(dist) and translate = -box_center, and its __main__ block ended with normalize_cam_dict(in_cam_dict_file, out_cam_dict_file, target_radius=0.8), without the in_geometry_file/out_geometry_file arguments]

Hi, can you provide a script to convert the colmap gui output to the camera convention used here?

Michaelwhite34 commented 2 years ago

After running run_colmap.py I get kai_cameras.json, kai_cameras_normalized.json, kai_keypoints.json, and kai_points.ply. Then I open the ply file in meshlab, remove the points that don't belong to the model, and save it over the original ply. Then I run the script provided by Kai and get the following error:

```
  File "camera.py", line 63, in <module>
    normalize_cam_dict(in_cam_dict_file, out_cam_dict_file, target_radius=0.8)
TypeError: normalize_cam_dict() missing 2 required positional arguments: 'in_geometry_file' and 'out_geometry_file'
```

Bingrong89 commented 2 years ago

> TypeError: normalize_cam_dict() missing 2 required positional arguments: 'in_geometry_file' and 'out_geometry_file'

in_geometry_file is the .ply file you edited; out_geometry_file is the name under which to save the output point cloud file.

Michaelwhite34 commented 2 years ago

> in_geometry_file is the .ply file you edited; out_geometry_file is the name under which to save the output point cloud file.

I understand, but I don't know what to do... I modified the last few lines:

```python
in_geometry_file = "/home/michael/rd12_out/posed_images/kai_points.ply"
out_geometry_file = "/home/michael/rd12_out/posed_images/out.ply"
in_cam_dict_file = "/home/michael/rd12_out/posed_images/kai_cameras.json"
out_cam_dict_file = "/home/michael/rd12_out/posed_images/normalized.json"
normalize_cam_dict(in_cam_dict_file, out_cam_dict_file, in_geometry_file, out_geometry_file, target_radius=0.8)
```

And it gives this error:

```
Traceback (most recent call last):
  File "camera.py", line 63, in <module>
    normalize_cam_dict(in_cam_dict_file, out_cam_dict_file, in_geometry_file, out_geometry_file, target_radius=0.8)
TypeError: normalize_cam_dict() got multiple values for argument 'target_radius'
```

Michaelwhite34 commented 2 years ago

I tried to remove "target_radius" from the last line and replace target_radius with 0.8, but then:

```
  File "camera.py", line 64, in <module>
    normalize_cam_dict(in_cam_dict_file, out_cam_dict_file, in_geometry_file, out_geometry_file)
TypeError: normalize_cam_dict() missing 1 required positional argument: 'out_geometry_file'
```

It must be some stupid mistake - -

Bingrong89 commented 2 years ago

Run the function like this:

```python
normalize_cam_dict(in_cam_dict_file, out_cam_dict_file, 0.8, in_geometry_file, out_geometry_file)
```

Michaelwhite34 commented 2 years ago

> Run the function like this: normalize_cam_dict(in_cam_dict_file, out_cam_dict_file, 0.8, in_geometry_file, out_geometry_file)

```
Reading PLY: [========================================] 100%
Traceback (most recent call last):
  File "camera.py", line 64, in <module>
    normalize_cam_dict(in_cam_dict_file, out_cam_dict_file, 0.8, in_geometry_file, out_geometry_file)
  File "camera.py", line 18, in normalize_cam_dict
    box = pcd.get_oriented_bounding_box()
AttributeError: 'open3d.open3d.geometry.PointCloud' object has no attribute 'get_oriented_bounding_box'
```

Seems like something is wrong with open3d; I installed it with pip install open3d-python.

Michaelwhite34 commented 2 years ago

Now I installed open3d with pip install open3d, and it gives:

```
Traceback (most recent call last):
  File "camera.py", line 64, in <module>
    normalize_cam_dict(in_cam_dict_file, out_cam_dict_file, 0.8, in_geometry_file, out_geometry_file)
  File "camera.py", line 29, in normalize_cam_dict
    tf_translate[:3, 3:4] = translate
ValueError: could not broadcast input array from shape (1,3) into shape (3,1)
```

I have tried installing open3d in many ways; they all lead to the same error.
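That broadcast error comes from the originally posted version of the script, where translate = -box_center keeps the (1, 3) row shape returned by np.mean; the corrected listing earlier in the thread already reshapes it into a column before writing it into the transform:

```python
# box_center has shape (1, 3); the translation slot tf_translate[:3, 3:4]
# expects a (3, 1) column, hence the reshape
translate = -box_center.reshape((3, 1))
```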

Michaelwhite34 commented 2 years ago

> Run the function like this: normalize_cam_dict(in_cam_dict_file, out_cam_dict_file, 0.8, in_geometry_file, out_geometry_file)

Sorry to bother you again, but do you know how to solve the issue above?

Kai-46 commented 1 year ago

Hi @Michaelwhite34, sorry that there were some bugs in the code I shared earlier. I just created a self-contained camera normalization demo with examples included; you can simply use the normalize_then_visualize_camera.py to convert COLMAP outputs to JSON files: https://www.icloud.com/iclouddrive/00f6DcB-NJNHH16r3o5BVcovg#camera_demo

Michaelwhite34 commented 8 months ago

> I just created a self-contained camera normalization demo with examples included; you can simply use the normalize_then_visualize_camera.py to convert COLMAP outputs to JSON files [...]

Can the scripts run on Windows? First I ran the colmap gui to get a folder sparse/0 with 3 bin files and 1 ini file. Then running normalize_then_visualize_cameras.py gives "Please manually crop your sparse point cloud in meshlab to remove outliers!". I know I should have run run_colmap.py first instead of using the colmap gui, but that errors on Windows, and I deleted my ubuntu a long time ago.
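One way to get a croppable point cloud out of a colmap gui sparse/0 model is colmap's own converter; a sketch of the call (check colmap model_converter --help on your version, since flags can differ):

```
colmap model_converter --input_path sparse/0 --output_path kai_points.ply --output_type PLY
```

The resulting .ply can then be opened and cropped in meshlab as described above.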

Michaelwhite34 commented 8 months ago

```
(open3d) D:\nerfplusplus\colmap_runner>python run_colmap.py
Running sift matching...

Running cmd: D:/COLMAP-3.9.1-windows-cuda/COLMAP-3.9.1-windows-cuda/bin/colmap feature_extractor --database_path D:\New5\sfm\database.db --image_path D:/New_Folder4/ --ImageReader.single_camera 1 --ImageReader.camera_model SIMPLE_RADIAL --SiftExtraction.max_image_size 5000 --SiftExtraction.estimate_affine_shape 0 --SiftExtraction.domain_size_pooling 1 --SiftExtraction.use_gpu 1 --SiftExtraction.max_num_features 16384 --SiftExtraction.gpu_index -1
Traceback (most recent call last):
  File "run_colmap.py", line 167, in <module>
    main(img_dir, out_dir, run_mvs=run_mvs)
  File "run_colmap.py", line 128, in main
    run_sift_matching(img_dir, db_file, remove_exist=False)
  File "run_colmap.py", line 39, in run_sift_matching
    bash_run(cmd)
  File "run_colmap.py", line 15, in bash_run
    subprocess.check_call(['/bin/bash', '-c', cmd])
  File "D:\anaconda3\envs\open3d\lib\subprocess.py", line 359, in check_call
    retcode = call(*popenargs, **kwargs)
  File "D:\anaconda3\envs\open3d\lib\subprocess.py", line 340, in call
    with Popen(*popenargs, **kwargs) as p:
  File "D:\anaconda3\envs\open3d\lib\subprocess.py", line 858, in __init__
    self._execute_child(args, executable, preexec_fn, close_fds,
  File "D:\anaconda3\envs\open3d\lib\subprocess.py", line 1327, in _execute_child
    hp, ht, pid, tid = _winapi.CreateProcess(executable, args,
FileNotFoundError: [WinError 2] The system cannot find the file specified
```
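The FileNotFoundError at the bottom is Windows failing to locate /bin/bash, which run_colmap.py's bash_run uses to launch every colmap command (see the frame at line 15 of the traceback). A hedged workaround, not the repo's code: let subprocess use the platform's default shell instead.

```python
import subprocess

def bash_run(cmd):
    # original: subprocess.check_call(['/bin/bash', '-c', cmd]), which fails on
    # Windows because there is no /bin/bash; shell=True picks cmd.exe there
    subprocess.check_call(cmd, shell=True)
```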

Michaelwhite34 commented 8 months ago

Finally I figured it out, but there is still an error in the mesh generation phase:

```
global_step: 50000
loss.item(): 0.17388814687728882
img_loss.item(): 0.17244312167167664
img_l2_loss.item(): 0.08620146661996841
img_ssim_loss.item(): 0.08624166250228882
eik_loss.item(): 0.0014450259041041136
roughrange_loss.item(): 0.0
color_network_dict["point_light_network"].get_light().item(): 1.240997076034546
100%| ... | 50001/50001 [2:19:22<00:00, 5.98it/s]
ic| f"Exporting mesh and materials to: {export_out_dir}": ('Exporting mesh and materials to: ./exp_iron_stage2/rabbit/mesh_and_materials_50000')
ic| 'Exporting mesh and uv...'
Traceback (most recent call last):
  File "render_surface.py", line 549, in <module>
    export_mesh_and_materials(export_out_dir, sdf_network, color_network_dict)
  File "render_surface.py", line 325, in export_mesh_and_materials
    export_mesh(sdf_fn, os.path.join(export_out_dir, "mesh.obj"))
  File "/workspace/iron/models/export_mesh.py", line 73, in export_mesh
    areas = np.array([c.area for c in components], dtype=np.float)
  File "/root/anaconda3/envs/iron/lib/python3.8/site-packages/numpy/__init__.py", line 305, in __getattr__
    raise AttributeError(__former_attrs__[attr])
AttributeError: module 'numpy' has no attribute 'float'. np.float was a deprecated alias for the builtin float. To avoid this error in existing code, use float by itself. Doing this will not modify any behavior and is safe. If you specifically wanted the numpy scalar type, use np.float64 here. The aliases was originally deprecated in NumPy 1.20; for more details and guidance see the original release note at: https://numpy.org/devdocs/release/1.20.0-notes.html#deprecations
```
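The fix for that last error is the one numpy's message suggests: in models/export_mesh.py (line 73 in the traceback), replace the removed np.float alias, e.g.:

```python
# np.float was removed in NumPy 1.24; the builtin float (or np.float64) is the
# drop-in replacement for the deprecated alias
areas = np.array([c.area for c in components], dtype=np.float64)
```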