graphdeco-inria / gaussian-splatting

Original reference implementation of "3D Gaussian Splatting for Real-Time Radiance Field Rendering"
https://repo-sam.inria.fr/fungraph/3d-gaussian-splatting/

How to use render.py to generate a set of rendered images along a camera path I provide myself? #479


ThePassedWind commented 9 months ago

I have found that images.bin and cameras.bin (COLMAP format) hold the camera extrinsics and intrinsics, respectively, and that we can use read_extrinsics_binary and read_intrinsics_binary to read them. By default, cameras.bin (intrinsics) parses to a dict like:

id:  1
model:  PINHOLE
width:  1912
height:  1075
params:  [ 994.91473032 1035.45535543  956.          537.5       ]

By default, images.bin (extrinsics) parses to a collection with one entry per input image, each entry a dict like:

id:  1
qvec:  [ 9.87182098e-01  4.67789801e-04 -1.59166096e-01 -1.17234620e-02]
tvec:  [ 0.71418639 -0.12678332  2.64190678]
camera_id:  1
name:  frame_0.png
xys:  [[1.00696456e+03 4.94545990e-02]
 [1.02521076e+03 3.03924482e-01]
 [1.14496957e+03 6.49794588e-01]
 ...
 [7.69126962e+02 5.17357767e+02]
 [1.19740032e+03 2.12154789e+02]
 [1.36614270e+03 6.43028234e+02]]
# xys.shape: (8807, 2)
point3D_ids:  [-1 -1 -1 ... -1 -1 -1]
# len(point3D_ids): 8807

If I want to provide my own camera path, I apparently have to change images.bin. But I don't know how to set xys and point3D_ids when a viewpoint is not among the viewpoints of my input images.
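
For reference, here is roughly how I read them (a minimal sketch; the sparse/0 path is just my local layout):

# Minimal sketch: read the COLMAP binaries with the repo's helpers
# in scene/colmap_loader.py. "sparse/0" is just an example path.
from scene.colmap_loader import read_extrinsics_binary, read_intrinsics_binary

cameras = read_intrinsics_binary("sparse/0/cameras.bin")  # id -> Camera
images = read_extrinsics_binary("sparse/0/images.bin")    # id -> Image

cam = cameras[1]
print(cam.model, cam.width, cam.height, cam.params)
img = images[1]
print(img.qvec, img.tvec, img.camera_id, img.name, img.xys.shape)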

yuedajiong commented 9 months ago

I did not edit images.bin and cameras.bin, or .txt files.

But I can tell you how to use your own camera path:

1. Define three parameters: azimuth, elevation, distance.
2. Use a function to convert them to R, T: (R, t) = LookAt(azimuth, elevation, distance) (e.g. from pytorch3d, or write your own).
3. CameraRt_to_WorldView(R, t) and FullProject(WorldView, nearZ, farZ, fovX, fovY).
4. Use the output of step 3 as the GS input (see the sketch below).

All the code is in: https://github.com/graphdeco-inria/gaussian-splatting/issues/350
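
A rough Python sketch of steps 1-4 (look_at_RT is my own construction, not code from the repo; getWorld2View2 and getProjectionMatrix are the repo's helpers in utils/graphics_utils.py, and the sign conventions may need flipping for your scene):

import numpy as np
import torch
from utils.graphics_utils import getWorld2View2, getProjectionMatrix

def look_at_RT(azimuth_deg, elevation_deg, distance, target=np.zeros(3)):
    # Spherical coordinates -> COLMAP-style pose (x right, y down, z forward).
    # Degenerate at elevation = +/-90 degrees.
    az, el = np.deg2rad(azimuth_deg), np.deg2rad(elevation_deg)
    cam_pos = target + distance * np.array([np.cos(el) * np.sin(az),
                                            np.sin(el),
                                            np.cos(el) * np.cos(az)])
    forward = target - cam_pos
    forward /= np.linalg.norm(forward)
    right = np.cross(forward, np.array([0.0, 1.0, 0.0]))
    right /= np.linalg.norm(right)
    up = np.cross(right, forward)
    R_w2c = np.stack([right, -up, forward])  # world-to-camera rotation
    t = -R_w2c @ cam_pos                     # world-to-camera translation
    return R_w2c.T, t  # the repo stores R transposed, as loaded from COLMAP

# Steps 2-3: pose -> world-view and full projection, as scene/cameras.py does it.
R, T = look_at_RT(azimuth_deg=30.0, elevation_deg=15.0, distance=4.0)
world_view = torch.tensor(getWorld2View2(R, T)).transpose(0, 1)
projection = getProjectionMatrix(znear=0.01, zfar=100.0, fovX=1.0, fovY=0.8).transpose(0, 1)
full_proj = world_view.unsqueeze(0).bmm(projection.unsqueeze(0)).squeeze(0)
# Step 4: feed world_view / full_proj (and the camera center) to the rasterizer.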

gaetan-landreau commented 9 months ago

> If I want to provide my own camera path, I apparently have to change images.bin. But I don't know how to set xys and point3D_ids when a viewpoint is not among the viewpoints of my input images.

If your model is already trained, you don't need to care about xys and point3D_ids: changing tvec and qvec is enough to render a view from a novel viewpoint :)
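
For instance, something along these lines (a sketch: the repo only ships readers, so writing the model back needs COLMAP's own scripts/python/read_write_model.py, and this repo's loader will also expect a matching image file on disk):

import numpy as np
from scene.colmap_loader import read_extrinsics_binary

images = read_extrinsics_binary("sparse/0/images.bin")  # example path

# Entries are namedtuples: build a novel view by replacing only the pose.
# For a trained model, xys / point3D_ids can stay as-is.
base = images[1]
novel = base._replace(
    id=max(images) + 1,
    qvec=np.array([1.0, 0.0, 0.0, 0.0]),  # new rotation quaternion (w, x, y, z)
    tvec=np.array([0.0, 0.0, 3.0]),       # new translation
    name="novel_view_000.png",
)
images[novel.id] = novel
# Write back with write_images_binary from COLMAP's read_write_model.py,
# then run render.py against the modified model.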

yuchenwengwyc commented 5 months ago

Hi, I have a similar question. I am trying to fuse two COLMAP reconstructions (same scene, but different images), and I am also wondering what the variable point3D_ids is. Is it used in render and SIBR? I fuse only the points3D.bin files from the two COLMAP outputs, just merging the points from one file into the other, roughly as sketched below.
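
Concretely, my fusion is roughly this (a sketch with the repo's read_points3D_binary; I assume both reconstructions share one coordinate frame):

import numpy as np
from scene.colmap_loader import read_points3D_binary

# read_points3D_binary returns plain arrays: positions, colors, errors.
xyz_a, rgb_a, err_a = read_points3D_binary("scene_a/sparse/0/points3D.bin")
xyz_b, rgb_b, err_b = read_points3D_binary("scene_b/sparse/0/points3D.bin")

# Naive fusion: concatenate the clouds. Only meaningful if both COLMAP
# reconstructions are in the same coordinate frame and at the same scale.
xyz = np.concatenate([xyz_a, xyz_b], axis=0)
rgb = np.concatenate([rgb_a, rgb_b], axis=0)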

leblond14u commented 5 months ago

> I did not edit images.bin and cameras.bin, or .txt files. But I can tell you how to use your own camera path: […] All the code is in #350.

Hi @yuedajiong,

I'd be interested if you could give more details on this point.

I'm trying to use a ground-truth trajectory (Augmented ICL-NUIM) to reconstruct a scene with 3DGS. Since the transforms shipped with this dataset are C2W while COLMAP stores W2C, I applied the conversion used in the NeRF data-loading code (your 3rd step), and I also converted from the OpenGL camera convention to the COLMAP one. However, when rendering my 3DGS scene, I can see that my renders are not aligned with my ground-truth images.
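
Concretely, the conversion I apply is roughly the same flip the repo's own NeRF loader uses:

import numpy as np

def nerf_c2w_to_colmap_w2c(c2w):
    # OpenGL/NeRF camera-to-world (x right, y up, z backward) ->
    # COLMAP world-to-camera (x right, y down, z forward).
    c2w = c2w.copy()
    c2w[:3, 1:3] *= -1.0       # flip the y and z camera axes
    return np.linalg.inv(c2w)  # invert C2W to get the W2C extrinsics

# The 3DGS loader then stores R = w2c[:3, :3].T and T = w2c[:3, 3]
# (see readCamerasFromTransforms in scene/dataset_readers.py).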

Is there something I'm missing? Do you know how to successfully use ground-truth data with 3DGS?

Thanks in advance, Best

yuedajiong commented 5 months ago

@leblond14u

https://github.com/WU-CVGL/MVControl-threestudio/blob/main/app_stage1.py, lines 55, 156, 208

You can modify it and extend the 4 views to more.

If the inputs are azimuth, elevation and distance instead of R and t, it is much easier for a human to understand.

Some samples: camera_path.py.txt

Is this enough? If it's helpful, say thanks to me. :-)

leblond14u commented 5 months ago

Thanks @yuedajiong I'll check this now :)

I discovered something weird with the Gaussian renderer in the 3DGS implementation. I was trying to use the above-mentioned dataset and found that even with the right transforms (C2W to W2C, and the OpenGL-to-COLMAP camera change), the render does not match the ground-truth image.

I guess it has something to do with the origin of the coordinate system or the scale of the map. I'm not sure yet what is causing this issue.


yuedajiong commented 5 months ago

Sometimes the official GS produces some distant floating points; you need extra logic to clean these points (e.g. the sketch below).
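
A simple cleanup pass over the saved point cloud could look like this (a sketch: the .ply layout is the one train.py writes, and the 3x-median-distance threshold is arbitrary):

import numpy as np
from plyfile import PlyData, PlyElement

# Drop "floaters": Gaussians far from the bulk of the scene.
ply = PlyData.read("point_cloud/iteration_30000/point_cloud.ply")  # example path
vertex = ply["vertex"]
xyz = np.stack([vertex["x"], vertex["y"], vertex["z"]], axis=1)

dist = np.linalg.norm(xyz - np.median(xyz, axis=0), axis=1)
keep = dist < 3.0 * np.median(dist)  # arbitrary threshold

PlyData([PlyElement.describe(vertex.data[keep], "vertex")]).write("point_cloud_clean.ply")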

sidsunny commented 4 months ago

Hi @yuedajiong, I see that "xys" and "point3D_ids" are not used anywhere in the code. Do you know what they should be used for? I only have camera information for "K" multi-view images and would like to create a GS from these.

yuedajiong commented 4 months ago

Hi @sidsunny: Other implementations have too much garbage. Try my code; it includes:

- clean COLMAP code
- the clearest GS code, including the CUDA code
- even background-removal code
- a simple viewer, very lightweight
- ...

You can print out all the attributes to see them; not all attributes are used by GS (see the sketch below).
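
To see which fields actually matter, this is roughly all the loader consumes per image (paraphrasing readColmapCameras in scene/dataset_readers.py):

import numpy as np
from scene.colmap_loader import read_extrinsics_binary, qvec2rotmat

images = read_extrinsics_binary("sparse/0/images.bin")
for extr in images.values():
    R = np.transpose(qvec2rotmat(extr.qvec))  # rotation: used
    T = np.array(extr.tvec)                   # translation: used
    # extr.camera_id and extr.name locate the intrinsics and image file;
    # extr.xys and extr.point3D_ids are never read for training or rendering.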

If my code is better, say thanks to me: hahahahaha.

0003.ok.clean-bg.zip

sidsunny commented 4 months ago

@yuedajiong Thank you for your code! The problem I am facing is that COLMAP is not able to find good feature matches, so it cannot generate the camera information. I plan to use the default cameras provided with the images, but I am not sure whether those cameras are pinhole. Also, I do not have the point cloud required for the images, so I need to generate that too.

yuedajiong commented 4 months ago

Your question/problem: you cannot compute the camera poses.
My answer/suggestion:

  1. If the images are fixed, you can adjust the COLMAP parameters. First, please make sure you have used COLMAP correctly. (main solution)
  2. If that fails, you can use a learning-based pose estimator, such as DUSt3R or others.
  3. In fact, the ultimately perfect algorithms are camera-free (not just automatic); wait for me, wait for other researchers, and you can try too.

AND:

Your question/problem: you have no point cloud.
My answer/suggestion: the initial point cloud (from COLMAP) is just nice-to-have, not necessary. You can initialize it directly with a uniform random distribution, as in the sketch below.
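
A sketch of that random initialization, mirroring what the repo already does for synthetic NeRF scenes in scene/dataset_readers.py:

import numpy as np
from utils.graphics_utils import BasicPointCloud
from utils.sh_utils import SH2RGB

# Uniform random point cloud instead of a COLMAP one. The box (a cube of
# side 2.6 around the origin here) should roughly cover your scene.
num_pts = 100_000
xyz = np.random.random((num_pts, 3)) * 2.6 - 1.3
shs = np.random.random((num_pts, 3)) / 255.0
pcd = BasicPointCloud(points=xyz, colors=SH2RGB(shs), normals=np.zeros((num_pts, 3)))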