Totoro97 / NeuS

Code release for NeuS
MIT License

Cannot reproduce results using COLMAP on data #8

Closed theFilipko closed 3 years ago

theFilipko commented 3 years ago

Hello, I have been trying to do the following:

  1. take a dataset, say bmvs_stone
  2. feed the images to COLMAP
  3. generate cameras_sphere.npz
  4. train NeuS on this data

To generate cameras_sphere.npz I used this code from the IDR project. It failed many times because it could not find points for normalisation, here. After rerunning COLMAP a few times, I got 2 points and it was able to generate the scale_mat_xx. However, the results after training are blurry.
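For context, a scale_mat is a similarity transform that maps the unit sphere at the origin onto the region of interest, which is why the conversion needs some 3D points to estimate a centre and a radius. A minimal sketch of that idea follows, assuming the sparse points are already available as an N x 3 array. The function name `scale_mat_from_points`, the `margin` parameter and the sample points are hypothetical, and this is not the IDR code.

```python
import numpy as np

def scale_mat_from_points(points, margin=1.1):
    """Return a 4x4 matrix mapping the unit sphere onto a sphere enclosing `points`."""
    points = np.asarray(points, dtype=np.float64)
    center = points.mean(axis=0)
    # Largest distance from the centre, padded by a safety margin
    radius = np.linalg.norm(points - center, axis=1).max() * margin
    scale_mat = np.diag([radius, radius, radius, 1.0])
    scale_mat[:3, 3] = center
    return scale_mat

# Hypothetical sparse points standing in for a COLMAP point cloud
points = np.array([[1.0, 2.0, 3.0],
                   [2.0, 2.5, 3.5],
                   [0.5, 1.5, 4.0],
                   [1.5, 3.0, 2.5]])
scale_mat = scale_mat_from_points(points)

# Applying the inverse brings every point inside the unit sphere
homogeneous = np.hstack([points, np.ones((len(points), 1))])
normalized = (np.linalg.inv(scale_mat) @ homogeneous.T).T[:, :3]
max_norm = np.linalg.norm(normalized, axis=1).max()
```

With real COLMAP output, outliers in the sparse cloud would inflate the radius, so the points usually need filtering or cropping before a step like this.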

I have tried automatic reconstruction in COLMAP and also this script from the LLFF project.

To convert the COLMAP output to cameras_sphere.npz I used this:

import numpy as np
import os
import colmap_read_model as read_model
from scipy.spatial.transform import Rotation as R

DIR = "/home/uriel/Downloads/DTU/dtu_scan24"
DST = os.path.join(DIR, "cameras_sphere.npz")

def load_colmap_data(realdir):
    camerasfile = os.path.join(realdir, 'sparse/0/cameras.bin')
    camdata = read_model.read_cameras_binary(camerasfile)

    # cam = camdata[camdata.keys()[0]]
    list_of_keys = list(camdata.keys())
    cam = camdata[list_of_keys[0]]

    h, w, f, cx, cy = cam.height, cam.width, cam.params[0], cam.params[1], cam.params[2]
    # w, h, f = factor * w, factor * h, factor * f
    hwf = np.array([h, w, f]).reshape([3, 1])
    k = np.array([[f, 0, cx],
                  [0, f, cy],
                  [0, 0, 1]])

    imagesfile = os.path.join(realdir, 'sparse/0/images.bin')
    imdata = read_model.read_images_binary(imagesfile)

    w2c_mats = []
    bottom = np.array([0, 0, 0, 1.]).reshape([1, 4])

    names = [imdata[k].name for k in imdata]
    print('Images #', len(names))
    perm = np.argsort(names)
    for i in imdata:
        im = imdata[i]
        R = im.qvec2rotmat()
        t = im.tvec.reshape([3, 1])
        m = np.concatenate([np.concatenate([R, t], 1), bottom], 0)
        w2c_mats.append(m)

    w2c_mats = np.stack(w2c_mats, 0)
    c2w_mats = np.linalg.inv(w2c_mats)

    return k, c2w_mats, perm

k, rts, perm = load_colmap_data(DIR)
cameras = {}
for i in perm:
    r = rts[i, :3, :3]
    t = rts[i, :3, 3]
    wm = np.eye(4)
    wm[:3,:3] = k @ r
    wm[:3,3] = k @ t
    cameras['world_mat_%d' % i] = wm
    cameras['scale_mat_%d' % i] = np.eye(4)

np.savez(DST, **cameras)

The code is inspired by this script.

I think the problem is that COLMAP uses a different coordinate system than NeuS and IDR, which I suppose use the same one as OpenGL. COLMAP says: The local camera coordinate system of an image is defined in a way that the X axis points to the right, the Y axis to the bottom, and the Z axis to the front as seen from the image.

Do you have any ideas how to solve it? Thanks

theFilipko commented 3 years ago

I have figured this out. Here's the working code:

import numpy as np
import os
import colmap_read_model as read_model

DIR = "/home/uriel/Downloads/stone"
DST = os.path.join(DIR, "cameras_sphere.npz")

camerasfile = os.path.join(DIR, 'sparse/0/cameras.bin')
camdata = read_model.read_cameras_binary(camerasfile)
list_of_keys = list(camdata.keys())
cam = camdata[list_of_keys[0]]

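# Assumes a camera model whose params start with f, cx, cy
# (e.g. SIMPLE_PINHOLE or SIMPLE_RADIAL)
h, w, f, cx, cy = cam.height, cam.width, cam.params[0], cam.params[1], cam.params[2]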
k = np.array([[f, 0, cx],
              [0, f, cy],
              [0, 0, 1]])

imagesfile = os.path.join(DIR, 'sparse/0/images.bin')
imdata = read_model.read_images_binary(imagesfile)
bottom = np.array([0, 0, 0, 1.]).reshape([1, 4])
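# Sort the image IDs by file name so that world_mat_<i> lines up with the i-th image
# in name order; COLMAP image IDs are not guaranteed to be contiguous or name-ordered.
image_ids = sorted(imdata, key=lambda image_id: imdata[image_id].name)
cameras = {}
for i, image_id in enumerate(image_ids):
    im = imdata[image_id]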
    r = im.qvec2rotmat()
    t = im.tvec.reshape([3, 1])
    w2c = np.concatenate([np.concatenate([r, t], 1), bottom], 0)
    c2w = np.linalg.inv(w2c)

    r = c2w[:3, :3]
    # Transpose back to the world-to-camera rotation, because load_K_Rt_from_P()
    # in dataset.py transposes the decomposed rotation matrix.
    r = r.T
    t = c2w[:3, 3]
    # Negate the camera centre, because OpenCV's projection matrix decomposition
    # assumes P = K @ R @ [I | -C]; see
    # https://stackoverflow.com/questions/62686618/opencv-decompose-projection-matrix/69556782#69556782
    t = -t

    wm = np.eye(4)
    wm[:3,:3] = k @ r
    wm[:3,3] = k @ r @ t

    cameras['world_mat_%d' % i] = wm
    cameras['scale_mat_%d' % i] = np.eye(4)

np.savez(DST, **cameras)
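The inverse, transpose and negation in the working script cancel out, so each world_mat ends up as the standard projection matrix K [R | t] built from COLMAP's world-to-camera pose. The first script, by contrast, multiplied K with the camera-to-world blocks. A minimal check of that equivalence is sketched below; the function names and the sample K, R and t are hypothetical values chosen for illustration.

```python
import numpy as np

def world_mat_direct(K, R, t):
    """Build a 4x4 world_mat as P = K [R | t] from a world-to-camera pose."""
    wm = np.eye(4)
    wm[:3, :3] = K @ R
    wm[:3, 3] = K @ t
    return wm

def world_mat_via_c2w(K, R, t):
    """Build the same matrix through the inverse pose, as the script above does."""
    w2c = np.eye(4)
    w2c[:3, :3] = R
    w2c[:3, 3] = t
    c2w = np.linalg.inv(w2c)
    r = c2w[:3, :3].T          # transposing the c2w rotation gives R again
    neg_center = -c2w[:3, 3]   # minus the camera centre, i.e. R^T t
    wm = np.eye(4)
    wm[:3, :3] = K @ r
    wm[:3, 3] = K @ r @ neg_center
    return wm

# Hypothetical intrinsics and pose
K = np.array([[800.0, 0.0, 320.0],
              [0.0, 800.0, 240.0],
              [0.0, 0.0, 1.0]])
angle = 0.3
R = np.array([[np.cos(angle), -np.sin(angle), 0.0],
              [np.sin(angle), np.cos(angle), 0.0],
              [0.0, 0.0, 1.0]])
t = np.array([0.1, -0.2, 2.0])

same = np.allclose(world_mat_direct(K, R, t), world_mat_via_c2w(K, R, t))
```

If the two agree, the conversion loop could write `K @ w2c[:3, :4]` into the top three rows of each world_mat directly.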

ahnaf1393 commented 2 years ago

@theFilipko thank you so much for your script. I have 2 questions:

  1. Will this script work to create the cameras.npz file required by IDR?
  2. Will it work on my own data that I auto-reconstruct using COLMAP, or does it work only on the DTU dataset?

theFilipko commented 2 years ago

@ahnaf1393 I have not tried it with the IDR project. It works on any data reconstructed by COLMAP, but you may need to play with the COLMAP parameters to get it right.

ahnaf1393 commented 2 years ago

@theFilipko I tried this script with images/videos captured with my mobile phone camera and I am not getting good results. Could you suggest which COLMAP parameters I should look into? For starters, which COLMAP camera model would be the best fit for mobile phone images/videos?

theFilipko commented 2 years ago

@ahnaf1393 I recommend having a look at the COLMAP tutorial and using the GUI to play with the settings for feature extraction, feature matching and reconstruction. That way you can get a better understanding of your data.
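One thing to keep in mind when changing the camera model: the script above reads cam.params[0], cam.params[1] and cam.params[2] as f, cx, cy, which only matches models whose parameters start with a single focal length. A minimal sketch of building K for a few common COLMAP models follows. The helper name `intrinsics_from_colmap` is hypothetical, it assumes the reader exposes the model name as `cam.model` (as COLMAP's own read_write_model.py does), and it ignores distortion coefficients.

```python
import numpy as np

def intrinsics_from_colmap(model, params):
    """Build a 3x3 pinhole K from a COLMAP camera model name and its params.

    Distortion coefficients, if present, are ignored.
    """
    single_focal = {"SIMPLE_PINHOLE", "SIMPLE_RADIAL", "RADIAL"}  # f, cx, cy, ...
    dual_focal = {"PINHOLE", "OPENCV"}                            # fx, fy, cx, cy, ...
    if model in single_focal:
        fx = fy = params[0]
        cx, cy = params[1], params[2]
    elif model in dual_focal:
        fx, fy, cx, cy = params[:4]
    else:
        raise ValueError(f"Unhandled COLMAP camera model: {model}")
    return np.array([[fx, 0.0, cx],
                     [0.0, fy, cy],
                     [0.0, 0.0, 1.0]])

# Hypothetical parameter values for two models
K_simple = intrinsics_from_colmap("SIMPLE_RADIAL", [800.0, 320.0, 240.0, 0.01])
K_full = intrinsics_from_colmap("OPENCV", [810.0, 805.0, 322.0, 238.0, 0.01, 0.0, 0.0, 0.0])
```

Because the resulting K is a plain pinhole matrix, images with noticeable lens distortion would likely need to be undistorted first.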

Terry10086 commented 10 months ago

Sorry to bother you, but I am really confused about the coordinate systems. Are you sure the coordinate system of COLMAP is the same as OpenGL's? As far as I know, OpenGL uses the same coordinate system as NeRF, in which the X axis points to the right, the Y axis to the top, and the Z axis backward. NeuS uses the same one as OpenGL (I am not sure about this; I only saw the coordinates in another issue), which differs from OpenCV's. Could you please help me figure it out?
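The COLMAP convention quoted in the first post (X right, Y down, Z forward) is the same one OpenCV uses, and the working script feeds COLMAP's R and t into world_mat without any axis flip, which suggests that no OpenGL conversion is needed for cameras_sphere.npz. For pipelines that do expect OpenGL-style camera-to-world matrices (X right, Y up, Z backward), a minimal sketch of the conversion is given below. The function name and the example pose are hypothetical.

```python
import numpy as np

# Flipping the camera's Y and Z axes converts between
# OpenCV/COLMAP (x right, y down, z forward) and OpenGL (x right, y up, z backward).
FLIP_YZ = np.diag([1.0, -1.0, -1.0, 1.0])

def opencv_to_opengl_c2w(c2w_cv):
    """Convert a 4x4 camera-to-world matrix between the two conventions.

    The flip is its own inverse, so the same call converts back as well.
    """
    return c2w_cv @ FLIP_YZ

# Hypothetical camera at (0, 0, -3) looking toward the origin along world +Z
c2w_cv = np.eye(4)
c2w_cv[:3, 3] = [0.0, 0.0, -3.0]
c2w_gl = opencv_to_opengl_c2w(c2w_cv)

view_dir_cv = c2w_cv[:3, 2]    # OpenCV cameras look along their +Z axis
view_dir_gl = -c2w_gl[:3, 2]   # OpenGL cameras look along their -Z axis
```

Both conventions describe the same physical camera: the position is unchanged and the world-space viewing direction agrees, only the labelling of the camera's own axes differs.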