NVlabs / instant-ngp

Instant neural graphics primitives: lightning fast NeRF and more
https://nvlabs.github.io/instant-ngp

depth loss question #939

Open SSground opened 1 year ago

SSground commented 1 year ago

Has anyone tried adding depth constraints to remove artifacts (floaters / suspended matter)? The depth can be obtained from the sparse point cloud produced by COLMAP.

JordanMakesMaps commented 1 year ago

Check out Point-NeRF; it seems like they're doing this (plus some).

SSground commented 1 year ago

I wonder if there are plans to add a depth loss inside ngp. In other words, does adding a depth loss improve ngp?

dozeri83 commented 1 year ago

@SSground I recently added depth maps successfully. It did the job well: it improved PSNR and removed floaters, on both synthetic and real data.

  1. If you have problems, I suggest looking inside the code, specifically nerf_loader.cu; search for "enable_depth_loading".
  2. Your depth maps need to be quantized to UINT16 PNG files, and the quantization scale should be configured with "integer_depth_scale" in the JSON file. If this key is not in the JSON, the depth maps will not be read (currently).
  3. Note that depth_supervision_lambda defaults to 0; you need to set it to a value greater than 0. See the sketch after this list.
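A minimal sketch (an assumed workflow, not an official recipe) of wiring these three pieces together from Python. The per-frame "depth_path" key and the pyngp attribute names are taken from the dataset scripts and Python bindings as I understand them; verify them against nerf_loader.cu and your build.

```python
import json

# 1. Declare the quantization scale: uint16_value * integer_depth_scale
#    must give depth in the same units as the poses in transforms.json.
with open("transforms.json") as f:
    meta = json.load(f)
meta["integer_depth_scale"] = 0.001  # e.g. depth PNGs storing millimeters
for frame in meta["frames"]:
    # hypothetical layout: one depth PNG per RGB image
    frame["depth_path"] = frame["file_path"].replace("images/", "depth/")
with open("transforms.json", "w") as f:
    json.dump(meta, f, indent=2)

# 2. Turn the depth loss on at training time (it defaults to 0).
import pyngp as ngp
testbed = ngp.Testbed(ngp.TestbedMode.Nerf)
testbed.load_training_data(".")
testbed.nerf.training.depth_supervision_lambda = 1.0
```
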
SSground commented 1 year ago

Thanks, sir. Would it be convenient for you to provide a set of data with depth so that I can debug?

dozeri83 commented 1 year ago

Unfortunately, I cannot provide my own example. You can find a general example of depth maps in the nerf_synthetic dataset here: https://drive.google.com/drive/folders/128yBriW1IG_3NJ5Rp7APSTZsJqdJdfc1

Each test data folder contains depth map images. Their format is 4-channel 8-bit images, which contain less information, but the nerf_loader can still read them.

It is very easy to convert float TIF images or EXR files to UINT16 PNG using OpenCV; see the sketch below.
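For example, a minimal conversion along those lines (assuming float depth in scene units and choosing a quantization factor of 1000):

```python
import cv2
import numpy as np

# Note: some OpenCV builds need OPENCV_IO_ENABLE_OPENEXR=1 set to read .exr files.
depth = cv2.imread("depth_0000.exr", cv2.IMREAD_UNCHANGED)  # float32 depth
scale = 1000.0                                              # uint16 steps per unit
q = np.clip(depth * scale + 0.5, 0, 65535).astype(np.uint16)
cv2.imwrite("depth_0000.png", q)
# transforms.json then needs "integer_depth_scale": 1.0 / scale
```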

SSground commented 1 year ago

Thanks, sir.

SSground commented 1 year ago

Hi sir, I have a question: if my depth map is sparse (part of the area has depth information, and the other areas have depth 0), will the zero-depth areas affect the result? I want to use COLMAP's sparse point cloud to get a sparse depth map for supervision.

dozeri83 commented 1 year ago

@SSground I think the loader ignores zero-depth pixels.

Look at https://github.com/NVlabs/instant-ngp/blob/e1d33a42a4de0b24237685f2ebdc07bcef1ecae9/src/nerf_loader.cu#L96

SSground commented 1 year ago

Excuse me, how do I align the poses computed by COLMAP with the depth maps captured by a RealSense camera?

chl2 commented 1 year ago

Does the depth information you are talking about mean that each image must have a depth image? Or can I use COLMAP's 3D points, i.e. the overall depth information?

dozeri83 commented 1 year ago

@chl2 Yes, but note that you can also assign depth images to only a subset of the images.

I used COLMAP depth map images generated from the dense point cloud. I did not check the sparse point cloud, but in theory it may help.

1zgh commented 1 year ago

Hello, I now have depth data; how do I generate the depth map? The depth data is float, ranging from 0 to 10, so converting directly to uint16 would lose precision. I tried multiplying by 65535 to enlarge the values, but the resulting depth map values were much larger than the original, which was inaccurate. Can you help me with this? Thank you.

dozeri83 commented 1 year ago

@1zgh Multiplying by 65535 will cause overflow.

Convert to an integer with integer_map = int(x * 1000 + 0.5);
since the depth is between 0 and 10, you will not overflow.

You will get a precision of 3 decimal places.

In the transforms.json file, add the field { "integer_depth_scale": 0.001 }.
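A quick round-trip check of that recipe (illustrative numbers only):

```python
import numpy as np

depth = np.array([0.0, 1.2345, 9.999], dtype=np.float32)  # depths in [0, 10]
q = (depth * 1000 + 0.5).astype(np.uint16)                # max 10 * 1000 = 10000 < 65535
decoded = q * 0.001                                       # what integer_depth_scale applies
print(q)        # [   0 1235 9999]
print(decoded)  # [0.    1.235 9.999] -- three decimal places preserved
```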

silver-obelisk commented 1 year ago

@dozeri83 Hi, I used an RGB-D camera to capture some depth maps (saved as UINT16 PNG, values 0-65535). Most of the pixels in my depth maps have values of 600-800. My camera's documentation says each unit stands for 1 mm, so this scene is 0.6-0.8 m in the real world. But I set integer_depth_scale = 0.001 and it did not work. Could you tell me what integer_depth_scale I should set?

dozeri83 commented 1 year ago

@silver-obelisk Note that when the data is in the form of transforms.json, the transformation matrices (representing the camera-to-local-world transformation) are in some local coordinate system. What you need is to find the local-world-to-"real"-world transformation. This should be a similarity transformation (scale + rotation + translation).

You need to divide the real depth by the scale of this transformation.

If you are using Metashape, for example, you should find this similarity transformation in the exported XML file; a toy example of extracting its scale follows below.
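As a concrete illustration: in a similarity transform, the upper-left 3x3 block is a scale times a rotation, so its determinant is the scale cubed. A toy sketch (the matrix here is made up; in practice you would parse it from, e.g., the Metashape XML export):

```python
import numpy as np

s_true = 0.02                               # hypothetical local-to-real scale
M = np.eye(4)                               # 4x4 similarity transform
M[:3, :3] = s_true * np.eye(3)              # scale * rotation (identity rotation here)
M[:3, 3] = [1.0, 2.0, 3.0]                  # translation

scale = np.cbrt(np.linalg.det(M[:3, :3]))   # det(s * R) = s^3
print(scale)                                # 0.02
# per the comment above: depth in local units = real (metric) depth / scale
```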

jexiaong commented 1 year ago

@silver-obelisk My depth maps are in the same units (mm); did you figure out the math for calculating the integer depth scale?

@dozeri83 Could you clarify how to determine the scale of this transformation? Also, what should the units of the real depth be, and is the real depth the maximum depth or the average depth of the scene?

dozeri83 commented 1 year ago

@silver-obelisk If you got the depth maps from a different sensor (say LiDAR) and the depth is in natural "world" units (say meters or mm), then you have to retrieve the scale to the NeRF coordinates (the matrices in the JSON file) from the third-party software you are using for bundle adjustment.

For example, Metashape writes this scale factor in the exported XML file. For COLMAP, I think it is more complicated: you need to use the model_aligner method to get this scale (I never used it).

If you got the depth maps from the third-party software itself, then there is a good chance the depths are already scaled to the local coordinates.

One more important thing: I don't know how you created the transforms.json file, but if you scaled the camera positions when creating this file, then you must accumulate this scale into the overall scale.

I would suggest reading more about similarity transformations (scale + rotation + translation).

jexiaong commented 1 year ago

@dozeri83 Is the scaling you're talking about the one below? https://github.com/NVlabs/instant-ngp/blob/99aed93bbe8c8e074a90ec6c56c616e4fe217a42/scripts/colmap2nerf.py#L396-L402 If my natural world units are in mm, do I just compute integer_depth_scale = 4.0 / avg_depth_in_mm, or is there more to it? Where does 65535 come into play? Sorry, I'm relatively new to this >.<

dozeri83 commented 1 year ago

@jexiaong Sort of... it is more complicated than that. You certainly need to take the multiplication by 4 into consideration when you scale, but COLMAP internally also scales the positions of the cameras. The way you calculate the scale is by computing the similarity transform between the original locations of the cameras in Euclidean coordinates (say ENU or ECEF) and the new positions that COLMAP gives you (see the Geo-registration section of https://colmap.github.io/faq.html); a sketch of this alignment follows below.

Regarding the 65535: after you have fixed the scale of your depth images, note that instant-ngp only loads uint16 depth images, so you need to quantize your float depth images to uint16 by multiplying by some factor, and you must give this factor to instant-ngp in the "integer_depth_scale" field.

Overall it is a somewhat complicated flow: you need to know basic scripting and how to extract the geolocation from your images, and furthermore you must convert the geolocation from lat/lon/alt to ENU...
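A minimal sketch of that alignment (assuming a plain least-squares Umeyama fit between corresponding camera centers, with no outlier handling):

```python
import numpy as np

def similarity_scale(src, dst):
    """Scale s of the similarity transform dst ~ s * R @ src + t (Umeyama)."""
    src_c = src - src.mean(axis=0)
    dst_c = dst - dst.mean(axis=0)
    cov = dst_c.T @ src_c / len(src)         # cross-covariance of the two point sets
    U, S, Vt = np.linalg.svd(cov)
    d = np.sign(np.linalg.det(U @ Vt))       # guard against a reflection solution
    var_src = (src_c ** 2).sum(axis=1).mean()
    return (S[0] + S[1] + d * S[2]) / var_src

# Toy check: camera centers in ENU meters vs. a scaled and shifted copy of them.
enu = np.random.default_rng(0).normal(size=(20, 3)) * 10.0
colmap_pos = 0.05 * enu + np.array([1.0, 2.0, 3.0])
print(similarity_scale(enu, colmap_pos))     # ~0.05
```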

kafei123456 commented 10 months ago

@silver-obelisk My depth maps are in the same units (mm); did you figure out the math for calculating the integer depth scale?

Hi, have you solved the problem?

kafei123456 commented 10 months ago

Hello, I'm trying to use COLMAP to generate camera positions and to get the depth maps from the dense point cloud that COLMAP produces. Theoretically, I can set integer_depth_scale = 1.0, but it does not work. Note that the method I use to get the depth maps is COLMAP's official code (see: https://github.com/colmap/colmap/blob/main/scripts/python/read_write_dense.py). The depth map shape is unequal to the RGB image shape (for example, the RGB image is 1440x1080, but the depth map is 1437x1076), so I have resized the depth maps; my flow is sketched below.
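For reference, a minimal sketch of that flow (assuming COLMAP's read_write_dense.py is importable; nearest-neighbor interpolation avoids blending depth values across object edges):

```python
import cv2
import numpy as np
from read_write_dense import read_array  # colmap/scripts/python/read_write_dense.py

depth = read_array("image.jpg.geometric.bin")  # float32, COLMAP model units
depth = cv2.resize(depth, (1440, 1080), interpolation=cv2.INTER_NEAREST)
depth[depth <= 0] = 0                          # invalid pixels stay 0 (ignored by the loader)

scale = 1000.0                                 # uint16 steps per model unit
q = np.clip(depth * scale + 0.5, 0, 65535).astype(np.uint16)
cv2.imwrite("image_depth.png", q)
# integer_depth_scale then needs to be 1.0 / scale, folded together with any
# extra scene scaling applied when transforms.json was generated (see above).
```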