autonomousvision / sdfstudio

A Unified Framework for Surface Reconstruction
Apache License 2.0

Providing your own data. #2

Open LownyCGI opened 1 year ago

LownyCGI commented 1 year ago

Guys, I don't see in your documentation how I can provide my own dataset.

Is it the same as nerfstudio's ns-process-data? I see you have mask data in your dataset. How do I apply that mask for training?

pablovela5620 commented 1 year ago

Looks like they built a new data parser that isn't compatible with ns-process-data: https://github.com/autonomousvision/sdfstudio/blob/master/docs/sdfstudio-data.md#Customize-your-own-dataset

You'd probably need to write a conversion script, similar to the ones they provide, to use the already implemented colmap/polycam methods from ns-process-data.

pablovela5620 commented 1 year ago

Speaking of which, I would love to see something like https://github.com/autonomousvision/sdfstudio/blob/master/scripts/datasets/process_scannet_to_sdfstudio.py for the great tools nerfstudio already provides via the colmap/record3d/polycam interfaces. It looks like the data just needs to be converted to a meta_data.json file, which is very similar to the transforms.json file provided by nerfstudio.
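For reference, here is a rough sketch of what meta_data.json contains, pieced together from the conversion script posted later in this thread (the matrix values below are made-up placeholders, not real calibration):

import json

# hypothetical placeholder matrices; a real conversion script fills these per frame
eye4 = [[1.0 if i == j else 0.0 for j in range(4)] for i in range(4)]
K = [[500.0, 0.0, 192.0], [0.0, 500.0, 192.0], [0.0, 0.0, 1.0]]  # made-up 3x3 intrinsics

meta = {
    "camera_model": "OPENCV",
    "height": 384,
    "width": 384,
    "has_mono_prior": True,
    "pairs": None,
    "worldtogt": eye4,  # 4x4 matrix mapping the normalized scene back to the original world
    "scene_box": {"aabb": [[-1, -1, -1], [1, 1, 1]], "near": 0.05,
                  "far": 2.5, "radius": 1.0, "collider_type": "box"},
    "frames": [{
        "rgb_path": "000000_rgb.png",
        "camtoworld": eye4,  # 4x4 camera-to-world pose (OpenCV convention)
        "intrinsics": K,
        "mono_depth_path": "000000_depth.npy",
        "mono_normal_path": "000000_normal.npy",
    }],
}
print(json.dumps(meta, indent=4))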

niujinshuchong commented 1 year ago

@LownyCGI @pablovela5620 Currently the foreground mask is only used in the heritage dataset. We plan to add mask support (including masks for pixel sampling used in training and foreground masks used in the loss computation) to the sdfstudio data parser soon.
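For context, here is a generic sketch of the two mask uses mentioned above (my own illustration in plain PyTorch, not sdfstudio's actual implementation):

import torch

# mask: float tensor of shape (N,), 1 for foreground pixels, 0 for background

def sample_masked_pixels(mask, num_rays):
    # restrict ray/pixel sampling to foreground pixels
    fg_idx = torch.nonzero(mask > 0.5).squeeze(-1)
    choice = torch.randint(0, fg_idx.shape[0], (num_rays,))
    return fg_idx[choice]

def fg_mask_loss(accumulation, mask, eps=1e-3):
    # supervise the accumulated opacity along each ray with the foreground mask
    acc = accumulation.clamp(eps, 1.0 - eps)
    return torch.nn.functional.binary_cross_entropy(acc, mask)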

All other data formats supported in nerfstudio should work in sdfstudio naturally, but this hasn't been tested yet. For example, to use the blender dataset, you could run something like:

ns-train neus-facto --pipeline.model.near-plane 2.0 --pipeline.model.far-plane 6.0 --pipeline.model.overwrite-near-far-plane True blender-data --data data/blender/lego/

Note that you may also need to change some default configs to make it work, though.

pablovela5620 commented 1 year ago

@niujinshuchong So I've been trying to get things working with the output from the colmap tooling that nerfstudio provides (ns-process-data images, etc.), and I've been facing the following issues.

First, I couldn't get things working by directly using the outputs from ns-process-data.

It seems like there's a difference between the coordinate systems used by DTU (and all the other sdfstudio datasets) and the original nerfstudio datasets (which use Blender/OpenGL conventions).
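For reference, the OpenGL/Blender-to-COLMAP/OpenCV conversion is just a sign flip of the camera y and z axes; a minimal sketch (the same one-liner used in the scripts below):

import numpy as np

c2w = np.eye(4)      # camera-to-world pose in OpenGL/Blender convention
c2w[0:3, 1:3] *= -1  # flip the y and z axes -> COLMAP/OpenCV convention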

Here's what I mean. When I train on scan65 of the DTU dataset with the command shown in the readme and view things in the web viewer, it looks like this: [screenshot: sdfstudio-dtu]

Here, on the other hand, is what I get when I use the output from ns-process-data images in the original nerfstudio repo (it's got a bunch of floaters, but the output cameras seem to be oriented correctly): [screenshot: nerfstudio-dtu]

Because I want to use geometric priors for supervision, I wrote this script (based on the process_scannet_to_sdfstudio.py script), but I got results that don't really make sense. Here's the script:

import argparse
import glob
import json
import os
from pathlib import Path

import cv2
import numpy as np
import PIL
from PIL import Image
from torchvision import transforms

def main():
    parser = argparse.ArgumentParser(description="preprocess scannet dataset to sdfstudio dataset")

    parser.add_argument("--data", dest="input_path", help="path to scannet scene")
    parser.set_defaults(im_name="NONE")

    parser.add_argument("--output-dir", dest="output_path", help="path to output")
    parser.set_defaults(store_name="NONE")

    args = parser.parse_args()

    output_path = Path(args.output_path)  # "data/custom/scannet_scene0050_00"
    input_path = Path(args.input_path)  # "/home/yuzh/Projects/datasets/scannet/scene0050_00"

    output_path.mkdir(parents=True, exist_ok=True)

    # load transformation json with images/intrinsics/extrinsics
    camera_parameters_path = input_path / "transforms.json"
    camera_parameters = json.load(open(camera_parameters_path))

    # extract intrinsic parameters
    cx = camera_parameters["cx"]
    cy = camera_parameters["cy"]
    fl_x = camera_parameters["fl_x"]
    fl_y = camera_parameters["fl_y"]

    camera_intrinsic = np.array([[fl_x, 0, cx], [0, fl_y, cy], [0, 0, 1]])

    # load poses
    poses = []
    image_paths = []
    # only load images with corresponding pose info
    # currently in random order??, probably need to sort
    for camera in camera_parameters["frames"]:
        # OpenGL/Blender convention, needs to change to COLMAP/OpenCV convention
        # https://docs.nerf.studio/en/latest/quickstart/data_conventions.html
        ## IGNORED for now
        c2w = np.array(camera["transform_matrix"]).reshape(4, 4)
        c2w[0:3, 1:3] *= -1

        img_path = input_path / camera["file_path"]
        assert img_path.exists()
        image_paths.append(img_path)
        poses.append(c2w)

    poses = np.array(poses)

    # deal with invalid poses
    valid_poses = np.isfinite(poses).all(axis=2).all(axis=1)
    min_vertices = poses[:, :3, 3][valid_poses].min(axis=0)
    max_vertices = poses[:, :3, 3][valid_poses].max(axis=0)

    center = (min_vertices + max_vertices) / 2.0
    scale = 2.0 / (np.max(max_vertices - min_vertices) + 3.0)

    # we should normalize pose to unit cube
    poses[:, :3, 3] -= center
    poses[:, :3, 3] *= scale

    # inverse normalization
    scale_mat = np.eye(4).astype(np.float32)
    scale_mat[:3, 3] -= center
    scale_mat[:3] *= scale
    scale_mat = np.linalg.inv(scale_mat)

    # copy image
    sample_img = cv2.imread(str(image_paths[0]))
    H, W, _ = sample_img.shape  # 1080 x 1920

    # get smallest side to generate square crop
    target_crop = min(H, W)

    target_size = 384
    trans_totensor = transforms.Compose(
        [
            transforms.CenterCrop(target_crop),
            transforms.Resize(target_size, interpolation=PIL.Image.BILINEAR),
        ]
    )

    # center crop by min_dim
    offset_x = (W - target_crop) * 0.5
    offset_y = (H - target_crop) * 0.5

    camera_intrinsic[0, 2] -= offset_x
    camera_intrinsic[1, 2] -= offset_y
    # resize from min_dim x min_dim -> to 384 x 384
    resize_factor = target_size / target_crop
    camera_intrinsic[:2, :] *= resize_factor

    K = camera_intrinsic

    frames = []
    out_index = 0
    for idx, (valid, pose, image_path) in enumerate(zip(valid_poses, poses, image_paths)):
        if not valid:
            continue

        target_image = output_path / f"{out_index:06d}_rgb.png"
        img = Image.open(image_path)
        img_tensor = trans_totensor(img)
        img_tensor.save(target_image)

        rgb_path = str(target_image.relative_to(output_path))
        frame = {
            "rgb_path": rgb_path,
            "camtoworld": pose.tolist(),
            "intrinsics": K.tolist(),
            "mono_depth_path": rgb_path.replace("_rgb.png", "_depth.npy"),
            "mono_normal_path": rgb_path.replace("_rgb.png", "_normal.npy"),
        }

        frames.append(frame)
        out_index += 1

    # scene bbox for the scannet scene
    scene_box = {
        "aabb": [[-1, -1, -1], [1, 1, 1]],
        "near": 0.05,
        "far": 2.5,
        "radius": 1.0,
        "collider_type": "box",
    }

    # meta data
    output_data = {
        "camera_model": "OPENCV",
        "height": target_size,
        "width": target_size,
        "has_mono_prior": True,
        "pairs": None,
        "worldtogt": scale_mat.tolist(),
        "scene_box": scene_box,
    }

    output_data["frames"] = frames

    # save as json
    with open(output_path / "meta_data.json", "w", encoding="utf-8") as f:
        json.dump(output_data, f, indent=4)

if __name__ == "__main__":
    main()

Lastly, here it is after processing the colmap output from nerfstudio through the above script: [screenshot: sdfstudio-colmap-dtu]. I can provide the dataset/outputs if that would be helpful. Let me know. This repo is extremely helpful, but one of the things that made nerfstudio so great is being able to use custom videos/images, and I hope to contribute this conversion script so the same can be done with SDFStudio as well.

pablovela5620 commented 1 year ago

Here's the render showing the colmap -> sdfstudio output from the above script: [screenshot: dtu-rendering-colmap]

niujinshuchong commented 1 year ago

@pablovela5620 Thanks for sharing your results.

For your first video, you can add --auto-orient True at the end of your training command to align the up vector with the viewer's.

How did you get your second video? Which method and config did you use for it?

Yes, sharing your colmap output and the nerfstudio dataset converted from colmap would be very helpful. I think the normalization may not be suitable here, because the values you used are what we heuristically chose for the ScanNet dataset, where all cameras are normalized to lie inside the room.

pablovela5620 commented 1 year ago

For the second video, I used the output from ns-process-data images --data data/sdfstudio-demo-data/dtu-scan65/image/ with the original nerfstudio repo and ns-train nerfacto. I made no modifications at all to the outputs from colmap. (I tried running it on sdfstudio but got the following error.)

  File "/home/pablo/miniconda3/envs/sdfstudio/bin/ns-train", line 8, in <module>
    sys.exit(entrypoint())
  File "/home/pablo/0Dev/repos/sdfstudio/scripts/train.py", line 248, in entrypoint
    main(
  File "/home/pablo/0Dev/repos/sdfstudio/scripts/train.py", line 234, in main
    launch(
  File "/home/pablo/0Dev/repos/sdfstudio/scripts/train.py", line 173, in launch
    main_func(local_rank=0, world_size=world_size, config=config)
  File "/home/pablo/0Dev/repos/sdfstudio/scripts/train.py", line 87, in train_loop
    trainer.setup()
  File "/home/pablo/0Dev/repos/sdfstudio/nerfstudio/engine/trainer.py", line 115, in setup
    self.pipeline = self.config.pipeline.setup(
  File "/home/pablo/0Dev/repos/sdfstudio/nerfstudio/configs/base_config.py", line 66, in setup
    return self._target(self, **kwargs)
  File "/home/pablo/0Dev/repos/sdfstudio/nerfstudio/pipelines/base_pipeline.py", line 224, in __init__
    self.datamanager: VanillaDataManager = config.datamanager.setup(
  File "/home/pablo/0Dev/repos/sdfstudio/nerfstudio/configs/base_config.py", line 66, in setup
    return self._target(self, **kwargs)
  File "/home/pablo/0Dev/repos/sdfstudio/nerfstudio/data/datamanagers/base_datamanager.py", line 321, in __init__
    self.train_dataset = self.create_train_dataset()
  File "/home/pablo/0Dev/repos/sdfstudio/nerfstudio/data/datamanagers/base_datamanager.py", line 328, in create_train_dataset
    dataparser_outputs=self.dataparser.get_dataparser_outputs(split="train"),
  File "/home/pablo/0Dev/repos/sdfstudio/nerfstudio/data/dataparsers/base_dataparser.py", line 127, in get_dataparser_outputs
    dataparser_outputs = self._generate_dataparser_outputs(split)
  File "/home/pablo/0Dev/repos/sdfstudio/nerfstudio/data/dataparsers/nerfstudio_dataparser.py", line 193, in _generate_dataparser_outputs
    scale_factor /= torch.max(torch.abs(poses[:, :3, 3]))
TypeError: tuple indices must be integers or slices, not tuple

Heres a link to the dataset I used. https://drive.google.com/drive/folders/1tFvZQXohO4oXj1y8uJGdjqbIVZkF4zeK?usp=sharing

I just used the included colmap install to generate the poses for the provided demo DTU dataset by running ns-process-data images --data data/sdfstudio-demo-data/dtu-scan65/image/ --output-dir $CHOSEN-DIR. scan65 is the output from this, and scan65-processed is the output from parsing that data with the Python script I shared above. Thanks for the help on this!

One last thing that I noticed is that https://github.com/autonomousvision/sdfstudio/blob/e52c37fe0840f1f3d339c79132dfd2adbd11b6f8/nerfstudio/data/dataparsers/sdfstudio_dataparser.py#L189

assumes that the input data has not already been converted to the nerfstudio data format (Blender/OpenGL), whereas when using ns-process-data this is already done, which would cause some issues (I tried changing it back, but that didn't seem to fix the issue).

-- EDIT -- I managed to figure out what the above error is: for some reason this line is missing the transformation output, and the poses end up merged together into a list. You can see how it's handled in the original nerfstudio repo. I'm guessing it has to do with the modified DataManager? Anyway, with that fixed, running ns-train nerfacto nerfstudio-data --data $COLMAP-DATA works just fine, but switching to ns-train neus-facto --pipeline.model.sdf-field.inside-outside False sdfstudio-data --data $COLMAP-DATA produces garbage results. This feels like something is wrong with the format of the poses or how they're read. More info on, or consistency between, the different datasets/managers/parsers would be appreciated.

niujinshuchong commented 1 year ago

@pablovela5620 Thanks for sharing the data.

After normalization with

center = (min_vertices + max_vertices) / 2.0
scale = 2.0 / (np.max(max_vertices - min_vertices) + 3.0)

the object is not centered at the origin, and only a very small part of the unit cube is occupied by the object. I think multi-res grids are sensitive to initialization in this case, because we initialize the SDF as a sphere centered at the origin, and the feature grids can overfit to color very quickly even when the geometry is bad.
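Roughly speaking (a sketch for intuition, not the exact code in the repo), the geometric initialization makes the SDF start out as a sphere:

import numpy as np

def initial_sdf(x, bias=0.5):
    # approximately ||x|| - bias for object-centric scenes
    # (the sign is flipped when inside-outside is True for indoor scenes)
    return np.linalg.norm(x, axis=-1) - bias

So if the object ends up far from the origin after normalization, it starts in the "empty" region of the initial sphere, which matches the sensitivity described above.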

I tried with a pure MLP

--pipeline.model.sdf-field.inside-outside False --pipeline.model.sdf-field.use-grid-feature False

and it can get reasonable results.

pablovela5620 commented 1 year ago

This was super helpful! I'll make a PR for this processing script then. I do have a question: shouldn't the normalization center the object? Or is this an issue with COLMAP, or with the way DTU was collected? I'll do some more testing with some of the datasets nerfstudio provides to see if I can get them working with the hashgrid. Thanks again for the help and for clearing this up. I tested with the pure MLP and it seemed to work!

I'm noticing the normalization centers the cameras at the origin

[screenshot]

and because it's captured only from a front-facing viewpoint (and the object is far away), it leads to the state you're describing. I'm guessing a scene where we capture the object from the front, sides, and back would not cause this problem.

I did have one last question: I was under the assumption that the hashgrid encoding was the fastest option, but now, trying with the pure MLP, it's around 10 ms. Is there something I'm missing here, or was that a faulty assumption?

niujinshuchong commented 1 year ago

@pablovela5620 Hi, ideally we want to center the object, but the camera center is not aligned with the object center in the DTU dataset. Maybe a better way is to compute the scene center from the sparse point cloud produced by colmap.
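A minimal sketch of that idea (hypothetical code; points would be the Nx3 sparse point cloud exported by colmap):

import numpy as np

def center_and_scale_from_points(poses, points, target_radius=1.0, margin=1.1):
    # center on the sparse points rather than on the camera bounding box
    center = points.mean(axis=0)
    radius = np.percentile(np.linalg.norm(points - center, axis=-1), 95)
    scale = target_radius / (radius * margin)
    poses = poses.copy()
    poses[:, :3, 3] = (poses[:, :3, 3] - center) * scale
    return poses, center, scale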

Yes, the hashgrid converges faster, but it still needs an MLP to predict SDF and color. The pure MLP case here is faster per iteration because we skip the grid features, but it converges much more slowly and needs many more iterations.

pablovela5620 commented 1 year ago

Okay, so I'm still having trouble getting my own data to work. Here's what I'm trying and the results I'm getting. I've included the data processed for nerfstudio (not the version processed by the above script, as it's large and taking too long to upload) in the hope of getting some help.

https://drive.google.com/drive/folders/1HPm0U2wL9PtWovlV_pFe-P9_WJWnbLTT?usp=sharing

The weird bit is that the processed data works just fine in the original MonoSDF repo.

  1. I created a dataset using the Polycam app and converted it to the nerfstudio-data format (very similar to the output from colmap) with the following command: ns-process-data polycam --data $PATH_TO_ZIP --output-dir $PATH_TO_OUTPUT. This generates the data which I use to train:

    • a nerfacto model
    • a neus-facto model with geometric priors
    • a monosdf model with geometric priors
    • the original monosdf model with geometric priors
  2. Here's the script that generates the data needed for the original monosdf repo (it's almost exactly the same as the one I posted in https://github.com/autonomousvision/sdfstudio/issues/2#issuecomment-1358541961, except that instead of a .json it writes a cameras.npz file):

import numpy as np
import cv2
import os
import json
import argparse
import PIL
from PIL import Image
from torchvision import transforms
import matplotlib.pyplot as plt
from pathlib import Path

POLYCAM = True
target_crop = 720
target_size = 384
trans_totensor = transforms.Compose([
    transforms.CenterCrop(target_crop),
    transforms.Resize(target_size, interpolation=PIL.Image.BILINEAR),
])

def argument_parser():
    parser = argparse.ArgumentParser(description='Visualize output for depth or surface normals')

    parser.add_argument('--data', help="path to processed colmap data from nerfstudio")
    parser.set_defaults(im_name='NONE')

    parser.add_argument('--output-dir', help="path to where output image should be stored")
    parser.set_defaults(store_name='NONE')

    args = parser.parse_args()
    return args

def convert():
    args = argument_parser()
    data_root = Path(args.data)
    out_path_prefix = Path(args.output_dir)
    assert data_root.exists()

    # load intrinsic/extrinsics
    camera_parameters_path = data_root / "transforms.json"
    camera_parameters = json.load(open(camera_parameters_path))

    if not POLYCAM:
        cx = camera_parameters['cx']
        cy = camera_parameters['cy']
        fl_x = camera_parameters['fl_x']
        fl_y = camera_parameters['fl_y']
    else:
        cx = 0
        cy = 0
        fl_x = 0
        fl_y = 0

    camera_parameters = camera_parameters['frames']
    num_frames = len(camera_parameters)

    poses = []
    image_paths = []
    for camera in camera_parameters:
        if POLYCAM:
            # average frames into single intrinsic
            cx += camera['cx']
            cy += camera['cy']
            fl_x += camera['fl_x']
            fl_y += camera['fl_y']

        # OpenGL/Blender convention, needs to change to COLMAP/OpenCV convention
        # https://docs.nerf.studio/en/latest/quickstart/data_conventions.html
        ## IGNORED for now
        c2w = np.array(camera["transform_matrix"]).reshape(4, 4)
        c2w[0:3, 1:3] *= -1

        img_path = data_root / camera["file_path"]
        assert img_path.exists()
        image_paths.append(img_path)
        poses.append(c2w)

    if POLYCAM:
        # intrinsics
        cx /= num_frames
        cy /= num_frames
        fl_x /= num_frames
        fl_y /= num_frames

    camera_intrinsic = np.array(
        [[fl_x, 0, cx],
        [0, fl_y, cy],
        [0, 0, 1]]
    )

    poses = np.stack(poses)

    # deal with invalid poses
    valid_poses = np.isfinite(poses).all(axis=2).all(axis=1)
    min_vertices = poses[:, :3, 3][valid_poses].min(axis=0)
    max_vertices = poses[:, :3, 3][valid_poses].max(axis=0)

    center = (min_vertices + max_vertices) / 2.
    scale = 2. / (np.max(max_vertices - min_vertices) + 3.)

    # normalize to the unit cube (scale_mat is inverted below)
    scale_mat = np.eye(4).astype(np.float32)
    scale_mat[:3, 3] = -center
    scale_mat[:3] *= scale
    scale_mat = np.linalg.inv(scale_mat)

    # copy image
    out_index = 0
    cameras = {}
    pcds = []

    # H, W = 738, 994
    H, W, _ = cv2.imread(str(image_paths[0])).shape
    # center crop by 738
    offset_x = (W - target_crop) * 0.5
    offset_y = (H - target_crop) * 0.5
    camera_intrinsic[0, 2] -= offset_x
    camera_intrinsic[1, 2] -= offset_y
    # resize, from 738x738 -> 384x384
    resize_factor = target_size / target_crop
    camera_intrinsic[:2, :] *= resize_factor

    K = np.eye(4)
    K[:3, :3] = camera_intrinsic

    for idx, (valid, pose, image_path) in enumerate(zip(valid_poses, poses, image_paths)):
        print(idx, valid)
        if not valid : continue

        target_image = out_path_prefix / "image" / f"{out_index:06d}.png"
        target_image.parent.mkdir(parents=True, exist_ok=True)
        img = Image.open(image_path)
        img_tensor = trans_totensor(img)
        img_tensor.save(target_image)

        # masks aren't used
        mask = (np.ones((target_size, target_size, 3)) * 255.).astype(np.uint8)

        target_image = str(out_path_prefix / "mask" / f"{out_index:03d}.png")
        cv2.imwrite(target_image, mask)

        # save pose
        pcds.append(pose[:3, 3])
        pose = K @ np.linalg.inv(pose)

        cameras["scale_mat_%d"%(out_index)] = scale_mat
        cameras["world_mat_%d"%(out_index)] = pose

        out_index += 1

    np.savez(str(out_path_prefix / "cameras.npz"), **cameras)
    os.system(f'python preprocess/extract_monocular_cues.py --img_path {out_path_prefix / "image"} --output_path {out_path_prefix} --task normal')
    os.system(f'python preprocess/extract_monocular_cues.py --img_path {out_path_prefix / "image"} --output_path {out_path_prefix} --task depth')

if __name__ == "__main__":
    convert()

Given this, I get fairly good reconstruction results, as you can see here: [screenshot: example_monosdf_output]

  1. Here is the output from nerfacto after training on the polycam data (you can see the cameras all fit within the bounding box): [screenshot: polycam-nerfacto]

  2. Here is the output from using the example command for monosdf with the processed data

    ns-train monosdf --pipeline.model.sdf-field.use-grid-feature True --pipeline.model.sdf-field.hidden-dim 256 --pipeline.model.sdf-field.num-layers 2 --pipeline.model.sdf-field.num-layers-color 2 --pipeline.model.sdf-field.use-appearance-embedding True --pipeline.model.sdf-field.geometric-init True --pipeline.model.sdf-field.inside-outside True  --pipeline.model.sdf-field.bias 0.8 --pipeline.model.sdf-field.beta-init 0.1 --pipeline.datamanager.train-num-images-to-sample-from 1 --pipeline.datamanager.train-num-times-to-repeat-images 0 --trainer.steps-per-eval-image 5000 --pipeline.model.background-model none --vis wandb --experiment-name polycam-conference-room --pipeline.model.mono-depth-loss-mult 0.001 --pipeline.model.mono-normal-loss-mult 0.01 --pipeline.datamanager.train-num-rays-per-batch 2048 --machine.num-gpus 1 sdfstudio-data --data $PATH_TO_DATA --include_mono_prior True --skip_every_for_val_split 30

    Here's what it looks like in the viewer, and here are the outputs from wandb. As you can see, there's something wrong.

[screenshot]

You can also see in the viewer that the scene still fits in the bounding box (but seems to have shrunk relative to what ns-process-data provides). I'm not sure if that's because of the transform that the auto_orient_and_center_poses function applies.

I don't know if there's some bug, or something I'm missing.

niujinshuchong commented 1 year ago

@pablovela5620 Hi, thanks for sharing your data and results.

I ran your data with the same command you provided and got something like this (only 5K iterations). It doesn't look as completely failed as in your screenshot (200K iterations in your case). Or does it look good at the beginning and only fail towards the end?

[screenshot]

The training loss is:

[screenshots]

The script I modified to generate the data is

import argparse
import glob
import json
import os
from pathlib import Path

import cv2
import numpy as np
import PIL
from PIL import Image
from torchvision import transforms

POLYCAM = True

def main():
    parser = argparse.ArgumentParser(description="preprocess scannet dataset to sdfstudio dataset")

    parser.add_argument("--data", dest="input_path", help="path to scannet scene")
    parser.set_defaults(im_name="NONE")

    parser.add_argument("--output-dir", dest="output_path", help="path to output")
    parser.set_defaults(store_name="NONE")

    args = parser.parse_args()

    output_path = Path(args.output_path)  # "data/custom/scannet_scene0050_00"
    input_path = Path(args.input_path)  # "/home/yuzh/Projects/datasets/scannet/scene0050_00"

    output_path.mkdir(parents=True, exist_ok=True)

    # load transformation json with images/intrinsics/extrinsics
    camera_parameters_path = input_path / "transforms.json"
    camera_parameters = json.load(open(camera_parameters_path))

    # extract intrinsic parameters
    if not POLYCAM:
        cx = camera_parameters["cx"]
        cy = camera_parameters["cy"]
        fl_x = camera_parameters["fl_x"]
        fl_y = camera_parameters["fl_y"]
    else:
        cx = 0
        cy = 0
        fl_x = 0
        fl_y = 0

    camera_parameters = camera_parameters["frames"]
    num_frames = len(camera_parameters)

    camera_intrinsic = np.array([[fl_x, 0, cx], [0, fl_y, cy], [0, 0, 1]])

    # load poses
    poses = []
    image_paths = []
    # only load images with corresponding pose info
    # currently in random order??, probably need to sort
    for camera in camera_parameters:
        if POLYCAM:
            # average frames into single intrinsic
            cx += camera["cx"]
            cy += camera["cy"]
            fl_x += camera["fl_x"]
            fl_y += camera["fl_y"]

        # OpenGL/Blender convention, needs to change to COLMAP/OpenCV convention
        # https://docs.nerf.studio/en/latest/quickstart/data_conventions.html
        ## IGNORED for now
        c2w = np.array(camera["transform_matrix"]).reshape(4, 4)
        c2w[0:3, 1:3] *= -1

        img_path = input_path / camera["file_path"]
        assert img_path.exists()
        image_paths.append(img_path)
        poses.append(c2w)

    poses = np.array(poses)

    if POLYCAM:
        # intrinsics
        cx /= num_frames
        cy /= num_frames
        fl_x /= num_frames
        fl_y /= num_frames

    camera_intrinsic = np.array([[fl_x, 0, cx], [0, fl_y, cy], [0, 0, 1]])

    # deal with invalid poses
    valid_poses = np.isfinite(poses).all(axis=2).all(axis=1)
    min_vertices = poses[:, :3, 3][valid_poses].min(axis=0)
    max_vertices = poses[:, :3, 3][valid_poses].max(axis=0)

    center = (min_vertices + max_vertices) / 2.0
    scale = 2.0 / (np.max(max_vertices - min_vertices) + 3.0)

    # we should normalize pose to unit cube
    poses[:, :3, 3] -= center
    poses[:, :3, 3] *= scale

    # inverse normalization
    scale_mat = np.eye(4).astype(np.float32)
    scale_mat[:3, 3] -= center
    scale_mat[:3] *= scale
    scale_mat = np.linalg.inv(scale_mat)

    # copy image
    sample_img = cv2.imread(str(image_paths[0]))
    H, W, _ = sample_img.shape  # 1080 x 1920

    # get smallest side to generate square crop
    target_crop = min(H, W)

    target_size = 384
    trans_totensor = transforms.Compose(
        [
            transforms.CenterCrop(target_crop),
            transforms.Resize(target_size, interpolation=PIL.Image.BILINEAR),
        ]
    )

    # center crop by min_dim
    offset_x = (W - target_crop) * 0.5
    offset_y = (H - target_crop) * 0.5

    camera_intrinsic[0, 2] -= offset_x
    camera_intrinsic[1, 2] -= offset_y
    # resize from min_dim x min_dim -> to 384 x 384
    resize_factor = target_size / target_crop
    camera_intrinsic[:2, :] *= resize_factor

    K = camera_intrinsic

    frames = []
    out_index = 0
    for idx, (valid, pose, image_path) in enumerate(zip(valid_poses, poses, image_paths)):
        if not valid:
            continue

        target_image = output_path / f"{out_index:06d}_rgb.png"
        img = Image.open(image_path)
        img_tensor = trans_totensor(img)
        img_tensor.save(target_image)

        rgb_path = str(target_image.relative_to(output_path))
        frame = {
            "rgb_path": rgb_path,
            "camtoworld": pose.tolist(),
            "intrinsics": K.tolist(),
            "mono_depth_path": rgb_path.replace("_rgb.png", "_depth.npy"),
            "mono_normal_path": rgb_path.replace("_rgb.png", "_normal.npy"),
        }

        frames.append(frame)
        out_index += 1

    # scene bbox for the scannet scene
    scene_box = {
        "aabb": [[-1, -1, -1], [1, 1, 1]],
        "near": 0.05,
        "far": 2.5,
        "radius": 1.0,
        "collider_type": "box",
    }

    # meta data
    output_data = {
        "camera_model": "OPENCV",
        "height": target_size,
        "width": target_size,
        "has_mono_prior": True,
        "pairs": None,
        "worldtogt": scale_mat.tolist(),
        "scene_box": scene_box,
    }

    output_data["frames"] = frames

    # save as json
    with open(output_path / "meta_data.json", "w", encoding="utf-8") as f:
        json.dump(output_data, f, indent=4)

if __name__ == "__main__":
    main()

pablovela5620 commented 1 year ago

Hmmm, this is what it looks like for me at 5k iterations:

[screenshot]

I'll take a look at the modifications you made to see if it's something there. I'll try running it again.

[screenshot]

niujinshuchong commented 1 year ago

@pablovela5620 This is what I got with larger weights for the monocular priors:

--pipeline.model.mono-depth-loss-mult 0.1 --pipeline.model.mono-normal-loss-mult 0.05
[screenshot]

pablovela5620 commented 1 year ago

I have no idea what changed; I must have set a parameter wrong at some point, but now it seems to be working for me with this custom data!

[screenshot]

pablovela5620 commented 1 year ago

I will add that your results look much cleaner than mine do at iteration 30k, and I'm not fully sure why.

[screenshot]

I'm guessing it's the larger prior loss weights?

Would you mind posting your config.yml file from the output? Here's mine:

!!python/object:nerfstudio.configs.base_config.Config
data: null
experiment_name: polycam-conference-room
logging: !!python/object:nerfstudio.configs.base_config.LoggingConfig
  enable_profiler: true
  local_writer: !!python/object:nerfstudio.configs.base_config.LocalWriterConfig
    _target: !!python/name:nerfstudio.utils.writer.LocalWriter ''
    enable: true
    max_log_size: 10
    stats_to_track: !!python/tuple
    - !!python/object/apply:nerfstudio.utils.writer.EventName
      - Train Iter (time)
    - !!python/object/apply:nerfstudio.utils.writer.EventName
      - Train Rays / Sec
    - !!python/object/apply:nerfstudio.utils.writer.EventName
      - Test PSNR
    - !!python/object/apply:nerfstudio.utils.writer.EventName
      - Vis Rays / Sec
    - !!python/object/apply:nerfstudio.utils.writer.EventName
      - Test Rays / Sec
  max_buffer_size: 20
  relative_log_dir: !!python/object/apply:pathlib.PosixPath []
  steps_per_log: 10
machine: !!python/object:nerfstudio.configs.base_config.MachineConfig
  dist_url: auto
  machine_rank: 0
  num_gpus: 1
  num_machines: 1
  seed: 42
method_name: monosdf
optimizers:
  field_background:
    optimizer: !!python/object:nerfstudio.engine.optimizers.AdamOptimizerConfig
      _target: &id001 !!python/name:torch.optim.adam.Adam ''
      eps: 1.0e-15
      lr: 0.0005
      weight_decay: 0
    scheduler: !!python/object:nerfstudio.engine.schedulers.ExponentialSchedulerConfig
      _target: &id002 !!python/name:torch.optim.lr_scheduler.ExponentialLR ''
      decay_rate: 0.1
      max_steps: 200000
  fields:
    optimizer: !!python/object:nerfstudio.engine.optimizers.AdamOptimizerConfig
      _target: *id001
      eps: 1.0e-15
      lr: 0.0005
      weight_decay: 0
    scheduler: !!python/object:nerfstudio.engine.schedulers.ExponentialSchedulerConfig
      _target: *id002
      decay_rate: 0.1
      max_steps: 200000
output_dir: !!python/object/apply:pathlib.PosixPath
- outputs
pipeline: !!python/object:nerfstudio.pipelines.base_pipeline.VanillaPipelineConfig
  _target: !!python/name:nerfstudio.pipelines.base_pipeline.VanillaPipeline ''
  datamanager: !!python/object:nerfstudio.data.datamanagers.base_datamanager.VanillaDataManagerConfig
    _target: !!python/name:nerfstudio.data.datamanagers.base_datamanager.VanillaDataManager ''
    camera_optimizer: !!python/object:nerfstudio.cameras.camera_optimizers.CameraOptimizerConfig
      _target: !!python/name:nerfstudio.cameras.camera_optimizers.CameraOptimizer ''
      mode: 'off'
      optimizer: !!python/object:nerfstudio.engine.optimizers.AdamOptimizerConfig
        _target: *id001
        eps: 1.0e-08
        lr: 0.0006
        weight_decay: 0.01
      orientation_noise_std: 0.0
      param_group: camera_opt
      position_noise_std: 0.0
      scheduler: !!python/object:nerfstudio.engine.schedulers.SchedulerConfig
        _target: !!python/name:nerfstudio.engine.schedulers.ExponentialDecaySchedule ''
        lr_final: 5.0e-06
        max_steps: 10000
    camera_res_scale_factor: 1.0
    dataparser: !!python/object:nerfstudio.data.dataparsers.sdfstudio_dataparser.SDFStudioDataParserConfig
      _target: !!python/name:nerfstudio.data.dataparsers.sdfstudio_dataparser.SDFStudio ''
      auto_orient: false
      data: !!python/object/apply:pathlib.PosixPath
      - data
      - polycam
      - conference_room3-processed-new
      downscale_factor: 1
      include_mono_prior: true
      load_pairs: false
      neighbors_num: null
      neighbors_shuffle: false
      pairs_sorted_ascending: true
      scene_scale: 2.0
      skip_every_for_val_split: 30
    eval_image_indices: !!python/tuple
    - 0
    eval_num_images_to_sample_from: -1
    eval_num_rays_per_batch: 1024
    eval_num_times_to_repeat_images: -1
    train_num_images_to_sample_from: 1
    train_num_rays_per_batch: 2048
    train_num_times_to_repeat_images: 0
  model: !!python/object:nerfstudio.models.volsdf.VolSDFModelConfig
    _target: !!python/name:nerfstudio.models.volsdf.VolSDFModel ''
    background_color: black
    background_model: none
    collider_params:
      far_plane: 6.0
      near_plane: 2.0
    eikonal_loss_mult: 0.1
    enable_collider: true
    eval_num_rays_per_chunk: 1024
    far_plane: 4.0
    far_plane_bg: 1000.0
    fg_mask_loss_mult: 0.01
    loss_coefficients:
      rgb_loss_coarse: 1.0
      rgb_loss_fine: 1.0
    min_patch_variance: 0.01
    mono_depth_loss_mult: 0.001
    mono_normal_loss_mult: 0.01
    near_plane: 0.05
    num_samples: 64
    num_samples_eval: 128
    num_samples_extra: 32
    num_samples_outside: 32
    overwrite_near_far_plane: false
    patch_size: 11
    patch_warp_angle_thres: 0.3
    patch_warp_loss_mult: 0.0
    periodic_tvl_mult: 0.0
    sdf_field: !!python/object:nerfstudio.fields.sdf_field.SDFFieldConfig
      _target: !!python/name:nerfstudio.fields.sdf_field.SDFField ''
      appearance_embedding_dim: 32
      beta_init: 0.1
      bias: 0.8
      divide_factor: 2.0
      encoding_type: hash
      geo_feat_dim: 256
      geometric_init: true
      hidden_dim: 256
      hidden_dim_color: 256
      inside_outside: true
      num_layers: 2
      num_layers_color: 2
      use_appearance_embedding: true
      use_grid_feature: true
      weight_norm: true
    topk: 4
    use_average_appearance_embedding: false
timestamp: 2022-12-22_130757
trainer: !!python/object:nerfstudio.configs.base_config.TrainerConfig
  load_config: null
  load_dir: null
  load_scheduler: true
  load_step: null
  max_num_iterations: 200000
  mixed_precision: false
  relative_model_dir: !!python/object/apply:pathlib.PosixPath
  - sdfstudio_models
  save_only_latest_checkpoint: true
  steps_per_eval_all_images: 1000000
  steps_per_eval_batch: 5000
  steps_per_eval_image: 5000
  steps_per_save: 20000
viewer: !!python/object:nerfstudio.configs.base_config.ViewerConfig
  ip_address: 127.0.0.1
  launch_bridge_server: true
  max_num_display_images: 512
  num_rays_per_chunk: 32768
  quit_on_train_completion: false
  relative_log_filename: viewer_log_filename.txt
  skip_openrelay: false
  start_train: true
  websocket_port: 7007
  zmq_port: null
vis: wandb

niujinshuchong commented 1 year ago

@pablovela5620 I think it's because of the larger weights on the monocular cues. Here is the config:

!!python/object:nerfstudio.configs.base_config.Config
data: null
experiment_name: polycam-conference-room
logging: !!python/object:nerfstudio.configs.base_config.LoggingConfig
  enable_profiler: true
  local_writer: !!python/object:nerfstudio.configs.base_config.LocalWriterConfig
    _target: !!python/name:nerfstudio.utils.writer.LocalWriter ''
    enable: true
    max_log_size: 10
    stats_to_track: !!python/tuple
    - !!python/object/apply:nerfstudio.utils.writer.EventName
      - Train Iter (time)
    - !!python/object/apply:nerfstudio.utils.writer.EventName
      - Train Rays / Sec
    - !!python/object/apply:nerfstudio.utils.writer.EventName
      - Test PSNR
    - !!python/object/apply:nerfstudio.utils.writer.EventName
      - Vis Rays / Sec
    - !!python/object/apply:nerfstudio.utils.writer.EventName
      - Test Rays / Sec
  max_buffer_size: 20
  relative_log_dir: !!python/object/apply:pathlib.PosixPath []
  steps_per_log: 10
machine: !!python/object:nerfstudio.configs.base_config.MachineConfig
  dist_url: auto
  machine_rank: 0
  num_gpus: 1
  num_machines: 1
  seed: 42
method_name: monosdf
optimizers:
  field_background:
    optimizer: !!python/object:nerfstudio.engine.optimizers.AdamOptimizerConfig
      _target: &id001 !!python/name:torch.optim.adam.Adam ''
      eps: 1.0e-15
      lr: 0.0005
      weight_decay: 0
    scheduler: !!python/object:nerfstudio.engine.schedulers.ExponentialSchedulerConfig
      _target: &id002 !!python/name:torch.optim.lr_scheduler.ExponentialLR ''
      decay_rate: 0.1
      max_steps: 200000
  fields:
    optimizer: !!python/object:nerfstudio.engine.optimizers.AdamOptimizerConfig
      _target: *id001
      eps: 1.0e-15
      lr: 0.0005
      weight_decay: 0
    scheduler: !!python/object:nerfstudio.engine.schedulers.ExponentialSchedulerConfig
      _target: *id002
      decay_rate: 0.1
      max_steps: 200000
output_dir: !!python/object/apply:pathlib.PosixPath
- outputs
pipeline: !!python/object:nerfstudio.pipelines.base_pipeline.VanillaPipelineConfig
  _target: !!python/name:nerfstudio.pipelines.base_pipeline.VanillaPipeline ''
  datamanager: !!python/object:nerfstudio.data.datamanagers.base_datamanager.VanillaDataManagerConfig
    _target: !!python/name:nerfstudio.data.datamanagers.base_datamanager.VanillaDataManager ''
    camera_optimizer: !!python/object:nerfstudio.cameras.camera_optimizers.CameraOptimizerConfig
      _target: !!python/name:nerfstudio.cameras.camera_optimizers.CameraOptimizer ''
      mode: 'off'
      optimizer: !!python/object:nerfstudio.engine.optimizers.AdamOptimizerConfig
        _target: *id001
        eps: 1.0e-08
        lr: 0.0006
        weight_decay: 0.01
      orientation_noise_std: 0.0
      param_group: camera_opt
      position_noise_std: 0.0
      scheduler: !!python/object:nerfstudio.engine.schedulers.SchedulerConfig
        _target: !!python/name:nerfstudio.engine.schedulers.ExponentialDecaySchedule ''
        lr_final: 5.0e-06
        max_steps: 10000
    camera_res_scale_factor: 1.0
    dataparser: !!python/object:nerfstudio.data.dataparsers.sdfstudio_dataparser.SDFStudioDataParserConfig
      _target: !!python/name:nerfstudio.data.dataparsers.sdfstudio_dataparser.SDFStudio ''
      auto_orient: false
      data: !!python/object/apply:pathlib.PosixPath
      - data
      - test-colmap
      downscale_factor: 1
      include_mono_prior: true
      load_pairs: false
      neighbors_num: null
      neighbors_shuffle: false
      pairs_sorted_ascending: true
      scene_scale: 2.0
      skip_every_for_val_split: 30
    eval_image_indices: !!python/tuple
    - 0
    eval_num_images_to_sample_from: -1
    eval_num_rays_per_batch: 1024
    eval_num_times_to_repeat_images: -1
    train_num_images_to_sample_from: 1
    train_num_rays_per_batch: 2048
    train_num_times_to_repeat_images: 0
  model: !!python/object:nerfstudio.models.volsdf.VolSDFModelConfig
    _target: !!python/name:nerfstudio.models.volsdf.VolSDFModel ''
    background_color: black
    background_model: none
    collider_params:
      far_plane: 6.0
      near_plane: 2.0
    eikonal_loss_mult: 0.1
    enable_collider: true
    eval_num_rays_per_chunk: 1024
    far_plane: 4.0
    far_plane_bg: 1000.0
    fg_mask_loss_mult: 0.01
    loss_coefficients:
      rgb_loss_coarse: 1.0
      rgb_loss_fine: 1.0
    min_patch_variance: 0.01
    mono_depth_loss_mult: 0.1
    mono_normal_loss_mult: 0.05
    near_plane: 0.05
    num_samples: 64
    num_samples_eval: 128
    num_samples_extra: 32
    num_samples_outside: 32
    overwrite_near_far_plane: false
    patch_size: 11
    patch_warp_angle_thres: 0.3
    patch_warp_loss_mult: 0.0
    periodic_tvl_mult: 0.0
    sdf_field: !!python/object:nerfstudio.fields.sdf_field.SDFFieldConfig
      _target: !!python/name:nerfstudio.fields.sdf_field.SDFField ''
      appearance_embedding_dim: 32
      beta_init: 0.1
      bias: 0.8
      divide_factor: 2.0
      encoding_type: hash
      geo_feat_dim: 256
      geometric_init: true
      hidden_dim: 256
      hidden_dim_color: 256
      inside_outside: true
      num_layers: 2
      num_layers_color: 2
      use_appearance_embedding: true
      use_grid_feature: true
      weight_norm: true
    topk: 4
    use_average_appearance_embedding: false
timestamp: 2022-12-22_185647
trainer: !!python/object:nerfstudio.configs.base_config.TrainerConfig
  load_config: null
  load_dir: null
  load_scheduler: true
  load_step: null
  max_num_iterations: 200000
  mixed_precision: false
  relative_model_dir: !!python/object/apply:pathlib.PosixPath
  - sdfstudio_models
  save_only_latest_checkpoint: true
  steps_per_eval_all_images: 1000000
  steps_per_eval_batch: 5000
  steps_per_eval_image: 5000
  steps_per_save: 20000
viewer: !!python/object:nerfstudio.configs.base_config.ViewerConfig
  ip_address: 127.0.0.1
  launch_bridge_server: true
  max_num_display_images: 512
  num_rays_per_chunk: 32768
  quit_on_train_completion: false
  relative_log_filename: viewer_log_filename.txt
  skip_openrelay: false
  start_train: true
  websocket_port: 7007
  zmq_port: null
vis: wandb

pyramidpoint commented 1 year ago

@niujinshuchong, hello. I used process_nerfstudio_to_sdfstudio.py to convert nerfstudio data to sdfstudio data and trained it with the mono-neus model, but I have some problems.

  1. Loss exception: [screenshot: 2022-12-30 17:12]

  2. viewer.websocket.port shows a black screen: [screenshot: 2022-12-30 17:14]

niujinshuchong commented 1 year ago

@pyramidpoint Do the camera poses look correct in the viewer? The normalization in process_nerfstudio_to_sdfstudio.py is intended for indoor scenes by default; you may need to adapt it to your dataset.

raw5 commented 1 year ago

Thanks for the great work. I have been trying to get scripts/datasets/process_nerfstudio_to_sdfstudio.py to work with data processed with nerfstudio and colmap. The script keeps trying to find the depth_path, which in this case does not exist. Any idea what I am missing here?

python scripts/datasets/process_nerfstudio_to_sdfstudio.py --data ~/sdfdata/hand/hand-processed/colmap --output-dir ~/sdfdata/hand/hand-processed/colmap-sdf --type colmap --geo-type mono_prior --omnidata_path /home/kasm-user/omnidata/omnidata_tools/torch --pretrained_models /home/kasm-user/omnidata/omnidata_tools/pretrained_models
Traceback (most recent call last):
  File "scripts/datasets/process_nerfstudio_to_sdfstudio.py", line 265, in <module>
    main()
  File "scripts/datasets/process_nerfstudio_to_sdfstudio.py", line 190, in main
    depth_path = depth_paths[idx]
IndexError: list index out of range

niujinshuchong commented 1 year ago

@raw5 Thanks for reporting this. Currently the script assumes depth maps always exist, which is not always the case. @pablovela5620 Is there any chance that you could fix it? Thanks.
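In the meantime, a minimal sketch of the kind of guard that would avoid the IndexError (hypothetical code; the variable names follow the conversion scripts earlier in this thread):

def make_frame(idx, rgb_path, pose, K, depth_paths):
    # build one frame entry; attach mono prior paths only if a depth map exists
    frame = {
        "rgb_path": rgb_path,
        "camtoworld": pose.tolist(),
        "intrinsics": K.tolist(),
    }
    if idx < len(depth_paths):
        frame["mono_depth_path"] = str(depth_paths[idx])
        frame["mono_normal_path"] = rgb_path.replace("_rgb.png", "_normal.npy")
    return frame

The has_mono_prior flag in meta_data.json would then need to reflect whether those paths were actually written.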

pablovela5620 commented 1 year ago

Yep, can do that, sorry about that! I only tested with Polycam and forgot to check colmap.

refiksoyak commented 1 year ago

Hi,

I'm trying to run it with nerfstudio's demo data, poster.

First, I download the data and convert it to the sdfstudio format using scripts/datasets/process_nerfstudio_to_sdfstudio.py --type colmap. Then I train a NeuS-facto model using the ns-train neus-facto --pipeline.model.sdf-field.inside-outside False --experiment-name neus-facto-poster sdfstudio-data --data data/sdf_poster command. After training is done, I try to extract meshes but get the following error:

Traceback (most recent call last):
  File "/home/hpc/iwfa/iwfa008h/miniconda3/envs/sdfstudio/bin/ns-extract-mesh", line 8, in <module>
    sys.exit(entrypoint())
  File "/home/woody/iwfa/iwfa008h/ref-sdf/sdfstudio/scripts/extract_mesh.py", line 79, in entrypoint
    tyro.cli(tyro.conf.FlagConversionOff[ExtractMesh]).main()
  File "/home/woody/iwfa/iwfa008h/ref-sdf/sdfstudio/scripts/extract_mesh.py", line 65, in main
    get_surface_sliding(
  File "/home/hpc/iwfa/iwfa008h/miniconda3/envs/sdfstudio/lib/python3.8/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context
    return func(*args, **kwargs)
  File "/home/woody/iwfa/iwfa008h/ref-sdf/sdfstudio/nerfstudio/utils/marching_cubes.py", line 159, in get_surface_sliding
    combined.export(filename)
AttributeError: 'list' object has no attribute 'export'

Here is the config:

!!python/object:nerfstudio.configs.base_config.Config
data: null
experiment_name: neus-facto-poster
logging: !!python/object:nerfstudio.configs.base_config.LoggingConfig
  enable_profiler: true
  local_writer: !!python/object:nerfstudio.configs.base_config.LocalWriterConfig
    _target: !!python/name:nerfstudio.utils.writer.LocalWriter ''
    enable: true
    max_log_size: 10
    stats_to_track: !!python/tuple
    - !!python/object/apply:nerfstudio.utils.writer.EventName
      - Train Iter (time)
    - !!python/object/apply:nerfstudio.utils.writer.EventName
      - Train Rays / Sec
    - !!python/object/apply:nerfstudio.utils.writer.EventName
      - Test PSNR
    - !!python/object/apply:nerfstudio.utils.writer.EventName
      - Vis Rays / Sec
    - !!python/object/apply:nerfstudio.utils.writer.EventName
      - Test Rays / Sec
  max_buffer_size: 20
  relative_log_dir: !!python/object/apply:pathlib.PosixPath []
  steps_per_log: 10
machine: !!python/object:nerfstudio.configs.base_config.MachineConfig
  dist_url: auto
  machine_rank: 0
  num_gpus: 1
  num_machines: 1
  seed: 42
method_name: neus-facto
optimizers:
  field_background:
    optimizer: !!python/object:nerfstudio.engine.optimizers.AdamOptimizerConfig
      _target: &id001 !!python/name:torch.optim.adam.Adam ''
      eps: 1.0e-15
      lr: 0.0005
      weight_decay: 0
    scheduler: !!python/object:nerfstudio.engine.schedulers.NeuSSchedulerConfig
      _target: &id002 !!python/name:nerfstudio.engine.schedulers.NeuSScheduler ''
      learning_rate_alpha: 0.05
      max_steps: 20000
      warm_up_end: 500
  fields:
    optimizer: !!python/object:nerfstudio.engine.optimizers.AdamOptimizerConfig
      _target: *id001
      eps: 1.0e-15
      lr: 0.0005
      weight_decay: 0
    scheduler: !!python/object:nerfstudio.engine.schedulers.NeuSSchedulerConfig
      _target: *id002
      learning_rate_alpha: 0.05
      max_steps: 20000
      warm_up_end: 500
  proposal_networks:
    optimizer: !!python/object:nerfstudio.engine.optimizers.AdamOptimizerConfig
      _target: *id001
      eps: 1.0e-15
      lr: 0.01
      weight_decay: 0
    scheduler: !!python/object:nerfstudio.engine.schedulers.MultiStepSchedulerConfig
      _target: !!python/name:torch.optim.lr_scheduler.MultiStepLR ''
      max_steps: 20000
output_dir: !!python/object/apply:pathlib.PosixPath
- outputs
pipeline: !!python/object:nerfstudio.pipelines.base_pipeline.VanillaPipelineConfig
  _target: !!python/name:nerfstudio.pipelines.base_pipeline.VanillaPipeline ''
  datamanager: !!python/object:nerfstudio.data.datamanagers.base_datamanager.VanillaDataManagerConfig
    _target: !!python/name:nerfstudio.data.datamanagers.base_datamanager.VanillaDataManager ''
    camera_optimizer: !!python/object:nerfstudio.cameras.camera_optimizers.CameraOptimizerConfig
      _target: !!python/name:nerfstudio.cameras.camera_optimizers.CameraOptimizer ''
      mode: 'off'
      optimizer: !!python/object:nerfstudio.engine.optimizers.AdamOptimizerConfig
        _target: *id001
        eps: 1.0e-08
        lr: 0.0006
        weight_decay: 0.01
      orientation_noise_std: 0.0
      param_group: camera_opt
      position_noise_std: 0.0
      scheduler: !!python/object:nerfstudio.engine.schedulers.SchedulerConfig
        _target: !!python/name:nerfstudio.engine.schedulers.ExponentialDecaySchedule ''
        lr_final: 5.0e-06
        max_steps: 10000
    camera_res_scale_factor: 1.0
    dataparser: !!python/object:nerfstudio.data.dataparsers.sdfstudio_dataparser.SDFStudioDataParserConfig
      _target: !!python/name:nerfstudio.data.dataparsers.sdfstudio_dataparser.SDFStudio ''
      auto_orient: false
      data: !!python/object/apply:pathlib.PosixPath
      - data
      - sdf_poster
      downscale_factor: 1
      include_foreground_mask: false
      include_mono_prior: false
      include_sensor_depth: false
      include_sfm_points: false
      load_pairs: false
      neighbors_num: null
      neighbors_shuffle: false
      pairs_sorted_ascending: true
      scene_scale: 2.0
      skip_every_for_val_split: 1
    eval_image_indices: !!python/tuple
    - 0
    eval_num_images_to_sample_from: -1
    eval_num_rays_per_batch: 1024
    eval_num_times_to_repeat_images: -1
    train_num_images_to_sample_from: -1
    train_num_rays_per_batch: 2048
    train_num_times_to_repeat_images: -1
  model: !!python/object:nerfstudio.models.neus_facto.NeuSFactoModelConfig
    _target: !!python/name:nerfstudio.models.neus_facto.NeuSFactoModel ''
    background_color: black
    background_model: none
    base_variance: 64
    collider_params:
      far_plane: 6.0
      near_plane: 2.0
    eikonal_loss_mult: 0.1
    enable_collider: true
    eval_num_rays_per_chunk: 1024
    far_plane: 4.0
    far_plane_bg: 1000.0
    fg_mask_loss_mult: 0.01
    interlevel_loss_mult: 1.0
    loss_coefficients:
      rgb_loss_coarse: 1.0
      rgb_loss_fine: 1.0
    min_patch_variance: 0.01
    mono_depth_loss_mult: 0.0
    mono_normal_loss_mult: 0.0
    near_plane: 0.05
    num_neus_samples_per_ray: 48
    num_proposal_iterations: 2
    num_proposal_samples_per_ray: !!python/tuple
    - 256
    - 96
    num_samples: 64
    num_samples_importance: 64
    num_samples_outside: 32
    num_up_sample_steps: 4
    overwrite_near_far_plane: false
    patch_size: 11
    patch_warp_angle_thres: 0.3
    patch_warp_loss_mult: 0.0
    periodic_tvl_mult: 0.0
    perturb: true
    proposal_net_args_list:
    - hidden_dim: 16
      log2_hashmap_size: 17
      max_res: 64
      num_levels: 5
    - hidden_dim: 16
      log2_hashmap_size: 17
      max_res: 256
      num_levels: 5
    proposal_update_every: 5
    proposal_warmup: 5000
    proposal_weights_anneal_max_num_iters: 1000
    proposal_weights_anneal_slope: 10.0
    sdf_field: !!python/object:nerfstudio.fields.sdf_field.SDFFieldConfig
      _target: !!python/name:nerfstudio.fields.sdf_field.SDFField ''
      appearance_embedding_dim: 32
      beta_init: 0.3
      bias: 0.5
      divide_factor: 2.0
      encoding_type: hash
      geo_feat_dim: 256
      geometric_init: true
      hidden_dim: 256
      hidden_dim_color: 256
      inside_outside: false
      num_layers: 2
      num_layers_color: 2
      use_appearance_embedding: false
      use_grid_feature: true
      weight_norm: true
    sensor_depth_freespace_loss_mult: 0.0
    sensor_depth_l1_loss_mult: 0.0
    sensor_depth_sdf_loss_mult: 0.0
    sensor_depth_truncation: 0.015
    sparse_points_sdf_loss_mult: 0.0
    topk: 4
    use_average_appearance_embedding: false
    use_proposal_weight_anneal: true
    use_same_proposal_network: false
    use_single_jitter: true
timestamp: 2023-01-14_161555
trainer: !!python/object:nerfstudio.configs.base_config.TrainerConfig
  load_config: null
  load_dir: null
  load_scheduler: true
  load_step: null
  max_num_iterations: 20001
  mixed_precision: false
  relative_model_dir: !!python/object/apply:pathlib.PosixPath
  - sdfstudio_models
  save_only_latest_checkpoint: true
  steps_per_eval_all_images: 1000000
  steps_per_eval_batch: 5000
  steps_per_eval_image: 5000
  steps_per_save: 20000
viewer: !!python/object:nerfstudio.configs.base_config.ViewerConfig
  ip_address: 127.0.0.1
  launch_bridge_server: true
  max_num_display_images: 512
  num_rays_per_chunk: 32768
  quit_on_train_completion: false
  relative_log_filename: viewer_log_filename.txt
  skip_openrelay: false
  start_train: true
  websocket_port: 7007
  zmq_port: null
vis: viewer

I did the same steps for other custom datasets but got the same error. Do you have any idea what's going wrong?

niujinshuchong commented 1 year ago

@refiksoyak Did you visualize the training process in the viewer or wandb to first check whether it produces something reasonable? Or maybe there is some problem with the trimesh version.
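One thing worth checking (a guess based on the error, not on the actual marching_cubes.py code): if training collapses and marching cubes finds no surface, the list of per-block meshes can be empty, and concatenating an empty list in trimesh can hand back a plain list instead of a Trimesh, which would produce exactly this AttributeError. A hedged sketch of a guard:

import trimesh

def export_combined(meshes, filename):
    # meshes: list of trimesh.Trimesh blocks from sliding-window marching cubes
    if len(meshes) == 0:
        raise RuntimeError("no surface extracted; the reconstruction probably failed")
    combined = trimesh.util.concatenate(meshes)
    combined.export(filename)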

refiksoyak commented 1 year ago

@niujinshuchong I use trimesh==3.18.0. Is that correct? I train on a server, so I cannot use the viewer, but I enabled wandb, and I got the following error during evaluation:

Step (% Done)       Train Iter (time)    Train Rays / Sec
--------------------------------------------------------------
4900 (24.50%)       71.126 ms            28.87 K
4910 (24.55%)       71.409 ms            28.73 K
4920 (24.60%)       71.917 ms            28.52 K
4930 (24.65%)       71.231 ms            28.78 K
4940 (24.70%)       70.869 ms            28.95 K
4950 (24.75%)       70.156 ms            29.24 K
4960 (24.80%)       70.723 ms            29.01 K
4970 (24.85%)       71.441 ms            28.73 K
4980 (24.90%)       71.626 ms            28.66 K
4990 (24.95%)       71.599 ms            28.66 K
Printing profiling stats, from longest to shortest duration in seconds
Trainer.train_iteration: 0.0697
VanillaPipeline.get_train_loss_dict: 0.0340
VanillaPipeline.get_eval_loss_dict: 0.0251
Trainer.eval_iteration: 0.0000
Traceback (most recent call last):
  File "/home/hpc/iwfa/iwfa008h/miniconda3/envs/sdfstudio/bin/ns-train", line 8, in <module>
    sys.exit(entrypoint())
  File "/home/woody/iwfa/iwfa008h/ref-sdf/sdfstudio/scripts/train.py", line 248, in entrypoint
    main(
  File "/home/woody/iwfa/iwfa008h/ref-sdf/sdfstudio/scripts/train.py", line 234, in main
    launch(
  File "/home/woody/iwfa/iwfa008h/ref-sdf/sdfstudio/scripts/train.py", line 173, in launch
    main_func(local_rank=0, world_size=world_size, config=config)
  File "/home/woody/iwfa/iwfa008h/ref-sdf/sdfstudio/scripts/train.py", line 88, in train_loop
    trainer.train()
  File "/home/woody/iwfa/iwfa008h/ref-sdf/sdfstudio/nerfstudio/engine/trainer.py", line 174, in train
    self.eval_iteration(step)
  File "/home/woody/iwfa/iwfa008h/ref-sdf/sdfstudio/nerfstudio/utils/decorators.py", line 70, in wrapper
    ret = func(self, *args, **kwargs)
  File "/home/woody/iwfa/iwfa008h/ref-sdf/sdfstudio/nerfstudio/utils/profiler.py", line 43, in wrapper
    ret = func(*args, **kwargs)
  File "/home/woody/iwfa/iwfa008h/ref-sdf/sdfstudio/nerfstudio/engine/trainer.py", line 347, in eval_iteration
    metrics_dict, images_dict = self.pipeline.get_eval_image_metrics_and_images(step=step)
  File "/home/woody/iwfa/iwfa008h/ref-sdf/sdfstudio/nerfstudio/utils/profiler.py", line 43, in wrapper
    ret = func(*args, **kwargs)
  File "/home/woody/iwfa/iwfa008h/ref-sdf/sdfstudio/nerfstudio/pipelines/base_pipeline.py", line 311, in get_eval_image_metrics_and_images
    metrics_dict, images_dict = self.model.get_image_metrics_and_images(outputs, batch)
  File "/home/woody/iwfa/iwfa008h/ref-sdf/sdfstudio/nerfstudio/models/neus_facto.py", line 202, in get_image_metrics_and_images
    prop_depth_i = colormaps.apply_depth_colormap(
  File "/home/woody/iwfa/iwfa008h/ref-sdf/sdfstudio/nerfstudio/utils/colormaps.py", line 74, in apply_depth_colormap
    colored_image = apply_colormap(depth, cmap=cmap)
  File "/home/woody/iwfa/iwfa008h/ref-sdf/sdfstudio/nerfstudio/utils/colormaps.py", line 42, in apply_colormap
    assert image_long_min >= 0, f"the min value is {image_long_min}"
AssertionError: the min value is -9223372036854775808

niujinshuchong commented 1 year ago

@refiksoyak The training failed. I haven't tested this scene. Could you try --pipeline.model.sdf-field.inside-outside True, or try using an MLP architecture?
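For example, reusing flags that already appear in this thread (a sketch, not a verified command):

ns-train neus-facto --pipeline.model.sdf-field.inside-outside True --pipeline.model.sdf-field.use-grid-feature False --experiment-name neus-facto-poster sdfstudio-data --data data/sdf_poster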

niujinshuchong commented 1 year ago

@refiksoyak I quickly tested this and found that using the monocular prior can get reasonable results. However, after pose normalization in the preprocessing script, a large part of the scene is outside the unit bbox. In this case, a background model should be used.

First, make sure you extract the mono prior with the script

python scripts/datasets/process_nerfstudio_to_sdfstudio.py --type colmap --data data/nerfstudio/posters_v3/ --output-dir data/sdfstudio/poster --indoor

And then train NeuS-facto with mono prior and background model

ns-train neus-facto --pipeline.model.sdf-field.inside-outside True --pipeline.model.sdf-field.bias 0.8 --pipeline.model.mono-depth-loss-mult 0.1 --pipeline.model.mono-normal-loss-mult 0.05 --pipeline.model.background-model mlp sdfstudio-data --data data/sdfstudio/poster --auto-orient True --include-mono-prior True

Then you will get something like this in the viewer

image

I think you could adapt the processing script for this scene to get better results. The viewer can also be used remotely; you just need to forward the port with

ssh -L 7007:localhost:7007 <username>@<remote-machine-ip>
Serge3006 commented 1 year ago

Hello, first of all, thank you for the great work. I was trying to get it to work with my own data, but it seems I'm missing something, because the results I get are weird and not as good as with the DTU dataset.

My dataset and the meta json file are located here: https://drive.google.com/drive/folders/11HbD5ZxMkp9MrL-B49FswCUW3TIkTV1_?usp=sharing

I'm able to execute the whole pipeline:

  1. Generate the nerfstudio format using the ns-process-data ... script.
  2. Transform the nerfstudio format to the sdfstudio one using the process_nerfstudio_to_sdfstudio.py script. My data consists only of images; I compute neither normals nor depths: python scripts/datasets/process_nerfstudio_to_sdfstudio.py --data sdf-datasets/coke-processed/ --data-type colmap --scene-type indoor --output-dir datasets/sdfstudio-dataset/coke-sdf-indoor

(I've tried different configurations of this script; the results are more or less the same.)

  3. Train neus-facto with the following config: ns-train neus-facto --pipeline.model.sdf-field.inside-outside False --experiment-name neus-facto-coke-object-inside-outside-false sdfstudio-data --data datasets/sdf-datasets/coke-sdf-object/
  4. Generate the mesh:

image

As you can see, the can is there, but it is surrounded by a lot of background and the quality of the can object itself is not good.

Could it be something related to the processing of the data before running the training process?

Do you use masks for the optimization process?

niujinshuchong commented 1 year ago

@Serge3006 Your images have a complex background. Maybe you could enable the background model and use a foreground mask with --pipeline.model.background-model mlp --pipeline.model.fg-mask-loss-mult 1.0. The other (and maybe better) way is to manually create training images with a clean background:

clean_images = images * mask + (1. - mask) # white background
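
For reference, a minimal sketch of that white-background compositing, assuming per-image binary masks stored next to the RGB images (the file paths and helper name here are illustrative, not part of sdfstudio):

import numpy as np
from PIL import Image

def composite_to_white(image_path, mask_path, out_path):
    # Load RGB and mask as floats in [0, 1]
    rgb = np.asarray(Image.open(image_path).convert("RGB"), dtype=np.float32) / 255.0
    mask = np.asarray(Image.open(mask_path).convert("L"), dtype=np.float32) / 255.0
    mask = mask[..., None]  # (H, W, 1) so it broadcasts over the RGB channels
    clean = rgb * mask + (1.0 - mask)  # white background where the mask is 0
    Image.fromarray((clean * 255.0).astype(np.uint8)).save(out_path)

# composite_to_white("images/0001.png", "masks/0001.png", "clean/0001.png")
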
yaoyuan13 commented 1 year ago

> (quoting @Serge3006's comment above)

Hi, I used your data and directly ran the training code, but the rendered mesh still has no coke. Did you make any other revisions to your uploaded data?

ChicoChen commented 1 year ago

Hello, I trained my bakedSDF model on a custom dataset with the following command: ns-train bakedsdf --vis wandb --experiment-name phoenix-bakedsdf --data data/nerfstudio-data-mipnerf360/phoenix --pipeline.datamanager.camera-res-scale-factor 0.25 --pipeline.datamanager.train-num-rays-per-batch 5000 --pipeline.model.sdf-field.inside-outside True --pipeline.model.scene-contraction-norm l2 mipnerf360-data

and a problem occurred when I tried to extract the mesh: ns-extract-mesh --load-config outputs/phoenix-bakedsdf/bakedsdf/2023-03-15_214055/config.yml --output-path meshes/phoenix_mesh_1024.ply --output-path phoenix_1204.ply --bounding-box-min -2.0 -2.0 -2.0 --bounding-box-max 2.0 2.0 2.0 --resolution 1024 --marching_cube_threshold 0.001 --create_visibility_mask True

1 1 0 tensor(0.0007, device='cuda:0') torch.Size([134217728]) torch.Size([133750, 3]) torch.Size([134217728, 3])
1 1 1 tensor(0.0011, device='cuda:0') torch.Size([134217728]) torch.Size([208082, 3]) torch.Size([134217728, 3])

Traceback (most recent call last):
  File "/home/undergrad/anaconda3/envs/sdfstudio/bin/ns-extract-mesh", line 8, in <module>
    sys.exit(entrypoint())
  File "/home/undergrad/sdfstudio/scripts/extract_mesh.py", line 137, in entrypoint
    tyro.cli(tyro.conf.FlagConversionOff[ExtractMesh]).main()
  File "/home/undergrad/sdfstudio/scripts/extract_mesh.py", line 93, in main
    get_surface_sliding_with_contraction(
  File "/home/undergrad/sdfstudio/nerfstudio/utils/marching_cubes.py", line 324, in get_surface_sliding_with_contraction
    combined.vertices = inv_contraction(torch.from_numpy(combined.vertices)).numpy()
AttributeError: 'list' object has no attribute 'vertices'

I have little idea of what caused the error. Here's the config file:

!!python/object:nerfstudio.configs.base_config.Config
data: &id002 !!python/object/apply:pathlib.PosixPath
- data
- nerfstudio-data-mipnerf360
- phoenix
experiment_name: phoenix-bakedsdf
logging: !!python/object:nerfstudio.configs.base_config.LoggingConfig
  enable_profiler: true
  local_writer: !!python/object:nerfstudio.configs.base_config.LocalWriterConfig
    _target: !!python/name:nerfstudio.utils.writer.LocalWriter ''
    enable: true
    max_log_size: 10
    stats_to_track: !!python/tuple
    - !!python/object/apply:nerfstudio.utils.writer.EventName
      - Train Iter (time)
    - !!python/object/apply:nerfstudio.utils.writer.EventName
      - Train Rays / Sec
    - !!python/object/apply:nerfstudio.utils.writer.EventName
      - Test PSNR
    - !!python/object/apply:nerfstudio.utils.writer.EventName
      - Vis Rays / Sec
    - !!python/object/apply:nerfstudio.utils.writer.EventName
      - Test Rays / Sec
  max_buffer_size: 20
  relative_log_dir: !!python/object/apply:pathlib.PosixPath []
  steps_per_log: 10
machine: !!python/object:nerfstudio.configs.base_config.MachineConfig
  dist_url: auto
  machine_rank: 0
  num_gpus: 1
  num_machines: 1
  seed: 42
method_name: bakedsdf
optimizers:
  fields:
    optimizer: !!python/object:nerfstudio.engine.optimizers.AdamOptimizerConfig
      _target: &id001 !!python/name:torch.optim.adam.Adam ''
      eps: 1.0e-15
      lr: 0.01
      weight_decay: 0
    scheduler: !!python/object:nerfstudio.engine.schedulers.NeuSSchedulerConfig
      _target: !!python/name:nerfstudio.engine.schedulers.NeuSScheduler ''
      learning_rate_alpha: 0.05
      max_steps: 250000
      warm_up_end: 500
  proposal_networks:
    optimizer: !!python/object:nerfstudio.engine.optimizers.AdamOptimizerConfig
      _target: *id001
      eps: 1.0e-15
      lr: 0.01
      weight_decay: 0
    scheduler: !!python/object:nerfstudio.engine.schedulers.MultiStepSchedulerConfig
      _target: !!python/name:torch.optim.lr_scheduler.MultiStepLR ''
      max_steps: 250000
output_dir: !!python/object/apply:pathlib.PosixPath
- outputs
pipeline: !!python/object:nerfstudio.pipelines.base_pipeline.VanillaPipelineConfig
  _target: !!python/name:nerfstudio.pipelines.base_pipeline.VanillaPipeline ''
  datamanager: !!python/object:nerfstudio.data.datamanagers.base_datamanager.VanillaDataManagerConfig
    _target: !!python/name:nerfstudio.data.datamanagers.base_datamanager.VanillaDataManager ''
    camera_optimizer: !!python/object:nerfstudio.cameras.camera_optimizers.CameraOptimizerConfig
      _target: !!python/name:nerfstudio.cameras.camera_optimizers.CameraOptimizer ''
      mode: 'off'
      optimizer: !!python/object:nerfstudio.engine.optimizers.AdamOptimizerConfig
        _target: *id001
        eps: 1.0e-08
        lr: 0.0006
        weight_decay: 0.01
      orientation_noise_std: 0.0
      param_group: camera_opt
      position_noise_std: 0.0
      scheduler: !!python/object:nerfstudio.engine.schedulers.SchedulerConfig
        _target: !!python/name:nerfstudio.engine.schedulers.ExponentialDecaySchedule ''
        lr_final: 5.0e-06
        max_steps: 10000
    camera_res_scale_factor: 0.25
    dataparser: !!python/object:nerfstudio.data.dataparsers.mipnerf360_dataparser.Mipnerf360DataParserConfig
      _target: !!python/name:nerfstudio.data.dataparsers.mipnerf360_dataparser.Mipnerf360 ''
      auto_scale_poses: true
      center_poses: true
      data: *id002
      downscale_factor: null
      eval_interval: 8
      orientation_method: up
      scale_factor: 1.0
      scene_scale: 1.0
    eval_image_indices: !!python/tuple
    - 0
    eval_num_images_to_sample_from: -1
    eval_num_rays_per_batch: 1024
    eval_num_times_to_repeat_images: -1
    train_num_images_to_sample_from: -1
    train_num_rays_per_batch: 5000
    train_num_times_to_repeat_images: -1
  model: !!python/object:nerfstudio.models.bakedsdf.BakedSDFModelConfig
    _target: !!python/name:nerfstudio.models.bakedsdf.BakedSDFFactoModel ''
    background_color: black
    background_model: none
    beta_anneal_max_num_iters: 250000
    collider_params:
      far_plane: 6.0
      near_plane: 2.0
    eikonal_anneal_max_num_iters: 250000
    eikonal_loss_mult: 0.01
    enable_collider: true
    eval_num_rays_per_chunk: 1024
    far_plane: 1000.0
    far_plane_bg: 1000.0
    fg_mask_loss_mult: 0.01
    interlevel_loss_mult: 1.0
    loss_coefficients:
      rgb_loss_coarse: 1.0
      rgb_loss_fine: 1.0
    min_patch_variance: 0.01
    mono_depth_loss_mult: 0.0
    mono_normal_loss_mult: 0.0
    near_plane: 0.2
    num_neus_samples_per_ray: 48
    num_proposal_iterations: 2
    num_proposal_samples_per_ray: !!python/tuple
    - 256
    - 96
    num_samples: 64
    num_samples_eval: 128
    num_samples_extra: 32
    num_samples_outside: 32
    overwrite_near_far_plane: true
    patch_size: 11
    patch_warp_angle_thres: 0.3
    patch_warp_loss_mult: 0.0
    periodic_tvl_mult: 0.0
    proposal_net_args_list:
    - hidden_dim: 16
      log2_hashmap_size: 17
      max_res: 64
      num_levels: 5
    - hidden_dim: 16
      log2_hashmap_size: 17
      max_res: 256
      num_levels: 5
    proposal_update_every: 5
    proposal_warmup: 5000
    proposal_weights_anneal_max_num_iters: 1000
    proposal_weights_anneal_slope: 10.0
    scene_contraction_norm: l2
    sdf_field: !!python/object:nerfstudio.fields.sdf_field.SDFFieldConfig
      _target: !!python/name:nerfstudio.fields.sdf_field.SDFField ''
      appearance_embedding_dim: 32
      beta_init: 0.1
      bias: 0.05
      divide_factor: 2.0
      encoding_type: hash
      geo_feat_dim: 256
      geometric_init: true
      hidden_dim: 256
      hidden_dim_color: 256
      inside_outside: true
      num_layers: 2
      num_layers_color: 2
      off_axis: true
      position_encoding_max_degree: 8
      rgb_padding: 0.001
      use_appearance_embedding: false
      use_diffuse_color: true
      use_grid_feature: true
      use_n_dot_v: true
      use_reflections: true
      use_specular_tint: true
      weight_norm: true
    sensor_depth_freespace_loss_mult: 0.0
    sensor_depth_l1_loss_mult: 0.0
    sensor_depth_sdf_loss_mult: 0.0
    sensor_depth_truncation: 0.015
    sparse_points_sdf_loss_mult: 0.0
    topk: 4
    use_anneal_beta: true
    use_anneal_eikonal_weight: false
    use_average_appearance_embedding: false
    use_proposal_weight_anneal: true
    use_same_proposal_network: false
    use_single_jitter: true
timestamp: 2023-03-15_214055
trainer: !!python/object:nerfstudio.configs.base_config.TrainerConfig
  load_config: null
  load_dir: null
  load_scheduler: true
  load_step: null
  max_num_iterations: 250001
  mixed_precision: false
  relative_model_dir: !!python/object/apply:pathlib.PosixPath
  - sdfstudio_models
  save_only_latest_checkpoint: true
  steps_per_eval_all_images: 1000000
  steps_per_eval_batch: 5000
  steps_per_eval_image: 5000
  steps_per_save: 20000
viewer: !!python/object:nerfstudio.configs.base_config.ViewerConfig
  ip_address: 127.0.0.1
  launch_bridge_server: true
  max_num_display_images: 512
  num_rays_per_chunk: 32768
  quit_on_train_completion: false
  relative_log_filename: viewer_log_filename.txt
  skip_openrelay: false
  start_train: true
  websocket_port: 7007
  zmq_port: null
vis: wandb
niujinshuchong commented 1 year ago

@ChicoChen Not sure if it's related to the trimesh version. I am using 3.15.8. Did you try to visualise the training progress?
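
If the trimesh version is indeed the difference, one quick thing to try is pinning it to the version mentioned above (assuming no other packages in the environment conflict with it):

pip install trimesh==3.15.8
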

ChicoChen commented 1 year ago

@niujinshuchong Thanks for the reply, I didn't notice that I'm using trimesh 3.20.1. Here are my eval images:

image

Well, this is obviously not a good result.

niujinshuchong commented 1 year ago

@ChicoChen Your reconstruction seems to have failed completely. You should check whether your camera poses are correct.
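
A minimal sketch for sanity-checking camera poses in an sdfstudio-format dataset; the file path and the "camtoworld" key are assumptions based on the conversion scripts, so adjust them to your data. Camera centers far outside the unit cube usually point at a normalization or convention problem.

import json
import numpy as np

with open("data/sdfstudio/phoenix/meta_data.json") as f:  # illustrative path
    meta = json.load(f)

# Collect the camera centers (translation part of each camera-to-world matrix)
centers = np.array([np.asarray(frame["camtoworld"])[:3, 3] for frame in meta["frames"]])
print("camera-center min:", centers.min(axis=0))
print("camera-center max:", centers.max(axis=0))
print("mean distance from origin:", np.linalg.norm(centers, axis=1).mean())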

ChicoChen commented 1 year ago

Hello, thanks for the advice last time. After some adjustments, the new bakedSDF model looks better:

image

but I have some questions about rendering a video:

  1. I rendered a video with ns-render-mesh --meshfile MESHFILE.ply --traj spiral --output_path [MyPath] mipnerf360-data --data [MyData], and here's the result. But the camera position was fixed above the phoenix statue. What arguments should I set so that the camera spins around the statue horizontally?

https://user-images.githubusercontent.com/107322822/227220514-1c349dc1-4cc1-4c6b-a56f-39bded2ca332.mp4

  2. Is there a way to render videos that keep the colors of the original scene, like the eval images do?

image

niujinshuchong commented 1 year ago

@ChicoChen ns-render-mesh supports different types of trajectories. For example, interpolate interpolates between the training camera poses, and ellipse creates an ellipse trajectory around the scene origin. You can use ns-render to render RGB images.
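
For example, swapping the trajectory in the earlier command might look like the following (the other flags are carried over from the command above; paths are placeholders):

ns-render-mesh --meshfile MESHFILE.ply --traj ellipse --output_path [MyPath] mipnerf360-data --data [MyData]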

Le0m1 commented 1 year ago

image

I have run into the problem shown above.

rawmarshmellows commented 12 months ago

I have been trying to run neus-facto with my custom data here

I processed the data like so:

IMAGE_DIR=<path_to_the_unzipped_file_from_download_link>
PROCESSED_COLMAP_OUTPUT_DIR=processed-colmap
PROCESSED_SDFSTUDIO_OUTPUT_DIR=processed-sdf
SCENE_TYPE=unbound

# process data
ns-process-data $IMAGE_DIR --data church --output-dir $PROCESSED_COLMAP_OUTPUT_DIR
python3.8 scripts/datasets/process_nerfstudio_to_sdfstudio.py --data $PROCESSED_COLMAP_OUTPUT_DIR --output-dir $PROCESSED_SDFSTUDIO_OUTPUT_DIR --data-type colmap --scene-type $SCENE_TYPE

# training
ns-train neus-facto --pipeline.model.sdf-field.inside-outside False --vis viewer --experiment-name church-building sdfstudio-data --data $PROCESSED_SDFSTUDIO_OUTPUT_DIR

The issue is the training gets stuck after ~100 iterations:

image

I have tried to visualize the training process in the viewer:

image

I was able to open the matched points generated from the images in CloudCompare:

image

It seems like the orientation is a bit different; maybe that's the issue?
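
If the mismatch comes from mixing camera-axis conventions (nerfstudio-style OpenGL/Blender poses versus the OpenCV-style poses COLMAP produces), a hypothetical helper to flip a camera-to-world matrix between the two conventions could look like this; whether that is actually the cause here is an assumption, not a confirmed diagnosis:

import numpy as np

def opengl_to_opencv(c2w: np.ndarray) -> np.ndarray:
    """Negate the y and z camera axes of a 4x4 camera-to-world matrix."""
    out = c2w.copy()
    out[:3, 1:3] *= -1.0  # flip the second and third columns of the rotation
    return out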