autonomousvision / sdfstudio

A Unified Framework for Surface Reconstruction
Apache License 2.0

Converter? #69

Open cdcseacave opened 1 year ago

cdcseacave commented 1 year ago

Multiple issues, but the most problematic is that there seems to be no way to import the COLMAP or nerfstudio data format into the sdfstudio format. I tried:

python scripts/datasets/process_nerfstudio_to_sdfstudio.py --data D:\datasets\TanksAndTemples\data\image_sets\Caterpillar\nerf --output-dir D:\datasets\TanksAndTemples\data\image_sets\Caterpillar\sdf --data-type colmap --scene-type object

Not only did I have to solve a bunch of importing issues (the file names do not correspond to the ones in the nerfstudio output, and the depth maps are not imported for the COLMAP format), but the output is garbage (nothing can be recognized, just some yellow and blue blobs), even though the nerfstudio output looks amazing.

Ideally there should be a script to import directly from COLMAP, as there is no way to get the image-pairs data out of the nerfstudio format.

Some other issues:

Thank you for this work; unfortunately it is very hard to test/use if you do not have a few days to dedicate to overcoming these kinds of issues.

niujinshuchong commented 1 year ago

Hi, I will check the script later and make an update if needed. The conversion script is only needed if you want to use monocular depth/normal priors or depth maps from a range sensor.

Nerfstudio's format can be used directly without conversion. You can specify the data parser, for example: ns-train volsdf --XXX nerfstudio-data --data XXX.
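For reference, a concrete invocation would look something like the following (the path is a placeholder; model flags, if any, go before the data parser name):

ns-train volsdf nerfstudio-data --data /path/to/your/nerfstudio/scene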

For other issues:

  1. You could use sdfstudio-data --data XXX --auto_orient True. We keep the default as False so that the extracted mesh stays aligned with the original camera poses.
  2. NeuS2 was published on arXiv after our first release. I think NeuS2's results are similar to what you get if you use hash encoding in NeuS in the current code base (the coarse-to-fine schedule for the hash encoding is missing, but I do not observe very bad results without it on the DTU dataset; a minimal sketch of such a schedule follows this list).
  3. It is faster and more convenient for me to make an update (and it is also related to my other ongoing projects).
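A minimal sketch of the coarse-to-fine schedule mentioned in point 2, assuming a multi-resolution hash encoding whose output concatenates the features of all levels (the function and its parameters are illustrative, not sdfstudio's API):

import torch

def hash_level_mask(step, num_levels, features_per_level,
                    start_level=4, steps_per_level=500):
    # Progressively enable hash-grid levels: the first `start_level`
    # levels are active from the beginning, and one more level is
    # unmasked every `steps_per_level` training steps.
    active = min(num_levels, start_level + step // steps_per_level)
    mask = torch.zeros(num_levels * features_per_level)
    mask[: active * features_per_level] = 1.0
    return mask

# Usage: multiply the encoding output by the mask before the MLP, e.g.
# feats = hash_encoding(positions)                      # (N, L * F)
# feats = feats * hash_level_mask(step, L, F).to(feats.device)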
cdcseacave commented 1 year ago

AFAIK nerfstudio's format does not contain the image-pairs information needed by sdfstudio; is that not needed? Would it be hard to make an importer in sdfstudio to import directly from COLMAP? NeuS2, in my understanding, uses a totally different approach to integrating the SDF into the pipeline and a different loss function; the difference is not only the hierarchical hashing. Did I get this wrong?

niujinshuchong commented 1 year ago

@cdcseacave The pairs information is used if you want to use multi-view consistency loss proposed in Geo-NeuS. It's not needed in NeuS.
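(For reference, the pairs information in the sdfstudio format is a plain pairs.txt file in the scene folder, where, to my understanding, each line lists a reference image followed by its source views; the file names below are illustrative:

000000.png 000001.png 000002.png 000003.png
000001.png 000000.png 000002.png 000004.png)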

What do you mean by "would it be hard to make an importer in sdfstudio to import directly from COLMAP"?

In my understanding, the differences between NeuS and NeuS2 are the hash encoding and the progressive training of the hash feature grids. NeuS2 proposes a method to compute the second-order gradient of the hash encoding and compares it with a pure PyTorch implementation. tiny-cuda-nn now has official support for second-order gradients of the hash encoding implemented in CUDA, but a comparison with it is not shown in NeuS2's paper. I believe both are faster than PyTorch, but I don't know how large the difference between NeuS2's method and tiny-cuda-nn's implementation is.
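For context, the second-order gradient matters because the eikonal loss penalizes the norm of the SDF's first-order gradient, so backpropagating that loss differentiates through the gradient computation itself. A minimal PyTorch sketch, with a plain MLP standing in for the hash encoding plus SDF network:

import torch

sdf_net = torch.nn.Sequential(
    torch.nn.Linear(3, 64), torch.nn.Softplus(beta=100), torch.nn.Linear(64, 1)
)

points = torch.rand(1024, 3, requires_grad=True)
sdf = sdf_net(points)

# First-order gradient of the SDF w.r.t. the input points (the normals).
# create_graph=True keeps the graph so the eikonal loss below can be
# backpropagated, which requires second-order derivatives.
(grad,) = torch.autograd.grad(
    sdf, points, grad_outputs=torch.ones_like(sdf), create_graph=True
)

eikonal_loss = ((grad.norm(dim=-1) - 1.0) ** 2).mean()
eikonal_loss.backward()  # second-order gradients flow into sdf_net's weights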

shwhjw commented 1 year ago

Not sure if I should make a new issue but this seems related.

I was able to run process_nerfstudio_to_sdfstudio.py to convert my nerfstudio+colmap folder into an sdfstudio-compatible folder of images with depth and normal priors. It seems, though, that the depth is inverted, with bright green being far and dark blue being near (the opposite of the example room dataset, which I have working well). I suspect the normals are also flipped.


This is obviously a known issue because of this section in process_nerfstudio_to_sdfstudio.py:

# load poses
# OpenGL/Blender convention, needs to change to COLMAP/OpenCV convention
# https://docs.nerf.studio/en/latest/quickstart/data_conventions.html
# IGNORED for now
c2w = np.array(frame["transform_matrix"]).reshape(4, 4)
c2w[0:3, 1:3] *= -1
poses.append(c2w)

According to the docs, Y and Z are flipped in the COLMAP/OpenCV convention compared to the OpenGL/Blender convention, which would explain why my depth is inverted; I suspect the normals are flipped in Y and Z too.
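Concretely, the two conventions differ only in the sign of the camera's Y and Z axes, which is exactly what negating columns 1 and 2 of the rotation does. A small numpy check (illustrative):

import numpy as np

# Identity camera-to-world pose in OpenGL/Blender convention:
# the camera looks along -Z and +Y is up.
c2w_gl = np.eye(4)

# The same physical pose in COLMAP/OpenCV convention (camera looks
# along +Z, +Y points down): negate the Y and Z columns of the
# rotation, exactly what the line in the script above does.
c2w_cv = c2w_gl.copy()
c2w_cv[0:3, 1:3] *= -1

print(c2w_gl[0:3, 1:3])  # camera +Y/+Z axes expressed in world space
print(c2w_cv[0:3, 1:3])  # same axes, signs flipped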

Is there a quick fix for this? I'm new to NeRFs and Python in general so would rather not have to work this out for myself!

shwhjw commented 1 year ago

I'm going to try commenting out the "c2w[0:3, 1:3] *= -1" line that flips the Y and Z axes, which I hope will give me correct depth + normals, but I guess it could mess up the rest of the NeRF.

shwhjw commented 1 year ago

With that line commented out, the depth and normal images look exactly the same as before. I guess the colouring is due to the pretrained omnidata model? Not sure how to create the images with correct depth colouring.

The normals are correct though, after checking my generated ones vs the sample ones. So why is just the depth reversed?

shwhjw commented 1 year ago

I've found that the "extract_monocular_cues.py" script (referenced by the "process_nerfstudio_to_sdfstudio.py" script) must have been adapted from the demo.py script included with omnidata.

If I run the demo.py script by itself then I get the expected depth images. I can't tell which difference causes extract_monocular_cues.py to produce inverted depths, though. The command I used:

python demo.py --task depth --img_path $PATH_TO_IMAGE_OR_FOLDER --output_path $PATH_TO_SAVE_OUTPUT

For now at least I should be able to manually copy the depth images straight from omnidata into the sdfstudio folder with the incorrect depths, and train from there.

omnidata doesn't create the .npy files, so I hope those aren't needed for training (are they a kind of checkpoint?)

niujinshuchong commented 1 year ago

Hi, the script extract_monocular_cues.py is adapted from omnidata's demo.py. The depth images look inverted, which may be due to the color mapping used when saving the images. But the PNGs are only used for visualization; we use the npy files during training. You don't need to flip the depth or normal images, because a scale-invariant depth loss is used and we transform the normals to the world coordinate system when loading the data.
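For context, a scale-invariant depth loss of this kind (as in MonoSDF) typically solves for a per-batch scale and shift in closed form before comparing the rendered depth to the monocular prior, so any global affine change of the prior's values, including a sign flip, is absorbed. A minimal sketch, assuming flat depth tensors (the function name and signature are illustrative):

import torch

def scale_invariant_depth_loss(rendered, prior, eps=1e-8):
    # Solve min_{w,q} || w * rendered + q - prior ||^2 in closed form,
    # then penalize the residual. This makes the loss invariant to the
    # unknown scale and shift of the monocular depth prior.
    x = torch.stack([rendered, torch.ones_like(rendered)], dim=-1)  # (N, 2)
    A = x.T @ x + eps * torch.eye(2)  # normal equations: A [w, q]^T = b
    b = x.T @ prior
    w, q = torch.linalg.solve(A, b)
    return ((w * rendered + q - prior) ** 2).mean()

# Usage (illustrative):
# loss = scale_invariant_depth_loss(rendered_depth.flatten(), mono_depth.flatten())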

shwhjw commented 1 year ago

Great, thanks, I will give training with the "inverted" images a go on Monday!