SpectacularAI / sdk-examples

Spectacular AI SDK examples
Apache License 2.0

Add Nerfcapture (app output) as an export format #104

Open oseiskar opened 9 months ago

oseiskar commented 9 months ago

Also quick-fixes an issue with the aligned depth map resolution (oversized images).

This can be run on Spectacular Rec output as:

python replay_to_nerf.py /PATH/TO/INPUT/spectacular-rec_XYZ \
    --format=nerfcapture --fast --image_format=png --device_preset=ios-tof --key_frame_distance=0.0001 \
    /PATH/TO/OUTPUT/FOLDER/nerfcapture-XYZ

to produce output in the same format as the NeRFCapture app (more or less the same as the Instant NGP input format?) as a workaround for https://github.com/jc211/NeRFCapture/issues/10#issuecomment-1701908651. This format also seems to work as an input to SplaTAM (edit: but the depth scale is probably still wrong).
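
For context, this Instant NGP-style layout is a transforms.json with camera intrinsics and per-frame camera-to-world matrices stored next to the image files. A minimal sketch of inspecting such an export (the field names are the usual Instant NGP ones and the path reuses the placeholder above; not verified against this exporter's exact output):

    import json
    import numpy as np

    # NeRFCapture / Instant NGP convention: transforms.json lists camera
    # intrinsics plus per-frame poses and image paths.
    with open("/PATH/TO/OUTPUT/FOLDER/nerfcapture-XYZ/transforms.json") as f:
        transforms = json.load(f)

    # Shared intrinsics, if present at the top level (Instant NGP field names)
    print({k: transforms.get(k) for k in ("fl_x", "fl_y", "cx", "cy", "w", "h")})

    for frame in transforms["frames"][:3]:
        pose = np.array(frame["transform_matrix"])  # 4x4 camera-to-world matrix
        print(frame["file_path"], "camera position:", pose[:3, 3])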

dlazares commented 7 months ago

@oseiskar have you run this on the SplaTAM repo to see whether the depth scale is right or wrong? Any plans to add the depth scale in this PR or in a follow-up?

oseiskar commented 7 months ago

This seemed to work on a certain early version of the SplaTAM code, but the depth scale here likely does not match what SplaTAM assumed from NeRFCapture (note that purely RGB-D based methods like SplaTAM may work nevertheless; the scale of the reconstruction is then just wrong). In that version of the SplaTAM code, it was also possible to set the depth scale in the SplaTAM configuration files. The depth scale produced by the Spectacular Rec app is 0.001 (depth in millimeters), and it should be possible to configure SplaTAM to use that scale.
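
To illustrate what that scale means, a minimal sketch of converting one of the exported 16-bit depth PNGs to meters, assuming the stated 0.001 scale (i.e. the PNG stores millimeters); the file name is hypothetical:

    import cv2
    import numpy as np

    # Spectacular Rec depth scale is 0.001: the 16-bit depth PNGs store millimeters.
    DEPTH_SCALE = 0.001

    # IMREAD_UNCHANGED preserves the raw 16-bit values (the default would convert to 8-bit).
    depth_png = cv2.imread("depth/00000.png", cv2.IMREAD_UNCHANGED)  # hypothetical path
    depth_m = depth_png.astype(np.float32) * DEPTH_SCALE  # depth in meters

    print("max depth: %.3f m" % float(depth_m.max()))

If I read the later comments in this thread correctly, SplaTAM's png_depth_scale divides the raw value rather than multiplying, so millimeter PNGs would correspond to png_depth_scale = 1000 there.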

dlazares commented 7 months ago

@oseiskar I have it mostly working, but yeah, the depth scale seems to be off. I'm confused as hell by the SplaTAM & NeRFCapture setup currently. What would be the right value here?

Linking this for the explanation, even though I didn't quite get it: https://github.com/spla-tam/SplaTAM/issues/7

dlazares commented 7 months ago

I got my best results experimentally with a png_depth_scale of 1000, but that doesn't quite make sense to me, because the LiDAR range is supposed to be 5 m and I'm logging values before preprocess_depth like "MAX DEPTH 16327". So with that depth scale, it would return 16.327 meters?

oseiskar commented 7 months ago

> I got my best results experimentally with a png_depth_scale of 1000, but that doesn't quite make sense to me, because the LiDAR range is supposed to be 5 m and I'm logging values before preprocess_depth like "MAX DEPTH 16327". So with that depth scale, it would return 16.327 meters?

That sounds expected. Note that even though the range of the physical sensor is reportedly 5 m, this does not mean that the depth map cannot contain larger values. Our iPhone tests also regularly show values between 10 and 20 m.
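
A quick check of that arithmetic, under the assumption (from the comments above) that SplaTAM divides the raw 16-bit value by png_depth_scale to get meters:

    # Assumed SplaTAM convention: depth_m = raw_value / png_depth_scale
    png_depth_scale = 1000.0
    raw_max = 16327                    # the logged "MAX DEPTH" value
    print(raw_max / png_depth_scale)   # 16.327 m, well beyond the nominal 5 m range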

The true resolution of the iPhone "LiDAR" sensor is something like 25 x 25 pixels, and most of the depth map is constructed by a real-time image processing algorithm that segments the RGB image, combines it with the depth data and fills in the gaps, with varying levels of success.

In our quick tests, the SplaTAM algorithm worked fine for some recordings, but did not seem particularly robust or accurate in general.