NVlabs / BundleSDF

[CVPR 2023] BundleSDF: Neural 6-DoF Tracking and 3D Reconstruction of Unknown Objects
https://bundlesdf.github.io/

Problems with running custom sequence #4

Closed Ale-Burzio closed 1 year ago

Ale-Burzio commented 1 year ago

Thanks for the great work!

I have tried BundleSDF with the following sequence from the BEHAVE dataset: https://mega.nz/file/sXUg0YBT#OVmRQHPhMvpr6FdNenef2VhrtinDZxUWj0bjzAN9VSQ

I have formatted everything as indicated in the README, rescaled the images to 640x480 resolution (the pipeline would otherwise crash), and corrected the intrinsics matrix accordingly. I tested the sequence on BundleTrack and it runs fine there (although with poor results).

The problem seems to be in the generation of the pointcloud from depth, as the saved pointcloud in the results folder is almost empty (apart from a few sporadic points). The demo milk sequence instead works fine. What am I doing wrong?

(the pipeline was tested on an RTX 2080 Ti GPU, in case that is relevant)

wenbowen123 commented 1 year ago

Can you increase the debug_level to >=3 to enable more verbose logging? https://github.com/NVlabs/BundleSDF/blob/master/run_custom.py#L199

Then you can visualize the .ply files in the frame-ID folders of the output dir to check whether the point cloud is reasonable.
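
For reference, a minimal sketch for inspecting those per-frame .ply files with Open3D (the output layout and glob pattern are assumptions; adjust them to your run):

```python
# Minimal sketch: load and view the per-frame point clouds written at
# debug_level >= 3. The directory layout is an assumption -- adjust the glob.
import glob
import open3d as o3d

for ply_path in sorted(glob.glob("out_folder/**/*.ply", recursive=True))[:5]:
    pcd = o3d.io.read_point_cloud(ply_path)
    print(ply_path, len(pcd.points), "points")
    o3d.visualization.draw_geometries([pcd])
```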

Ale-Burzio commented 1 year ago

Done, and the point cloud is not reasonable (again, only very sparse points).

Here is a visualization of the first few frames: the mask is correct and the RGB image is segmented correctly, but the depth is wrong (the filtered depth is empty). https://mega.nz/file/lWtR0axI#YXC2Cv_6zWdcMlxZ8UVOOD-qUUL9iSFdoCdveKsSbaY

wenbowen123 commented 1 year ago

This does not look right. It seems the depth format is not being read the same way as the milk sequence's. Can you first make sure you can convert the depth to a reasonable point cloud with your intrinsics? This is independent of BundleSDF.
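
A standalone sanity check could look something like this sketch, assuming depth is stored as a 16-bit PNG in millimeters and the intrinsics live in cam_K.txt as in the custom-data format (adjust both if your data differs):

```python
import numpy as np
import cv2
import open3d as o3d

# Depth assumed to be a 16-bit PNG in millimeters; adjust the scale if yours differs.
depth = cv2.imread("depth/000001.png", cv2.IMREAD_UNCHANGED).astype(np.float64) / 1000.0
K = np.loadtxt("cam_K.txt").reshape(3, 3)

# Back-project every pixel through the pinhole model.
h, w = depth.shape
u, v = np.meshgrid(np.arange(w), np.arange(h))
x = (u - K[0, 2]) * depth / K[0, 0]
y = (v - K[1, 2]) * depth / K[1, 1]
pts = np.stack([x, y, depth], axis=-1).reshape(-1, 3)
pts = pts[pts[:, 2] > 0]  # drop invalid (zero-depth) pixels

pcd = o3d.geometry.PointCloud()
pcd.points = o3d.utility.Vector3dVector(pts)
o3d.io.write_point_cloud("depth_check.ply", pcd)
```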

Ale-Burzio commented 1 year ago

They look fine actually, although the one rescaled to 640x480 has a lot of noise (maybe because of interpolation?). But it's still very different from what I get in the pipeline:

https://imgur.com/a/cC6WWxr
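
The interpolation guess is plausible: bilinear resizing blends foreground and background depths at object boundaries, inventing values that exist nowhere in the scene. A sketch of the safer nearest-neighbor resize with OpenCV (file path is illustrative):

```python
import cv2

depth = cv2.imread("depth/000001.png", cv2.IMREAD_UNCHANGED)  # raw 16-bit depth
# cv2.resize takes (width, height); INTER_NEAREST keeps only values that
# actually occur in the input, avoiding blended "flying" depths at edges.
depth_small = cv2.resize(depth, (640, 480), interpolation=cv2.INTER_NEAREST)
```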

wenbowen123 commented 1 year ago

Perhaps you can try sending in the images before resizing. Also make sure your intrinsics are correct (if you resize the images, the intrinsics have to be scaled by the same factors).
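
For reference, a sketch of how the intrinsics change under a resize (the focal length is illustrative; 2048x1536 matches the BEHAVE frame size that comes up later in this thread):

```python
import numpy as np

# Original intrinsics and image size (illustrative values).
K = np.array([[600.0, 0.0, 1024.0],
              [0.0, 600.0, 768.0],
              [0.0, 0.0, 1.0]])
orig_w, orig_h = 2048, 1536
new_w, new_h = 640, 480

sx, sy = new_w / orig_w, new_h / orig_h
K_scaled = K.copy()
K_scaled[0, :] *= sx  # fx and cx scale with width
K_scaled[1, :] *= sy  # fy and cy scale with height
```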

Ale-Burzio commented 1 year ago

The problem with feeding in the images before resizing is that the program crashes when, at some point, it tries to reshape something to 640x480 resolution, but I was not able to track down where that happens.

bkyCadida commented 1 year ago

I assumed the downsizing can be specified in run_custom.py l.40, reader = YcbineoatReader(video_dir=video_dir, shorter_side=480), either by setting shorter_side or by passing the downscale parameter (default=1) to YcbineoatReader. @wenbowen123 Is that sufficient, or must the image size / downscale factor also be set anywhere else?

wenbowen123 commented 1 year ago

@bkyCadida try shorter_side for now. Later we will add that as an option.

redgreenblue3 commented 1 year ago

I think I have come across a related issue: when I scaled a different dataset's depth map to match the range of the milk example, a breaking error was resolved. It seems that the scale of the input depth map matters, and that values above a certain threshold are disregarded during point cloud extraction.

In detail:

I recently got the following error when running BundleSDF on the SM1 scene of the HO3D dataset:

[pcl::PLYWriter::writeASCII] Input point cloud has no data!
[pcl::KdTreeFLANN::setInputCloud] Cannot create a KDTree with an empty input cloud!
[pcl::PLYWriter::writeASCII] Input point cloud has no data!

After some digging, the problem seems to come from https://github.com/NVlabs/BundleSDF/blob/1bb46a6b16f3a190922827cb031a57da14e90d28/BundleTrack/src/Frame.cpp#L162, which is called here: https://github.com/NVlabs/BundleSDF/blob/1bb46a6b16f3a190922827cb031a57da14e90d28/bundlesdf.py#L406. From what I understand, this error arises because the frame object's point cloud was constructed as empty during initialization; I am guessing that point cloud construction takes place in the following function call: https://github.com/NVlabs/BundleSDF/blob/1bb46a6b16f3a190922827cb031a57da14e90d28/BundleTrack/src/Frame.cpp#L125

Oddly, the error does not occur when running BundleSDF with the provided milk dataset. It persisted even after I made sure my dataset (the SM1 scene of the HO3D dataset) matched the milk dataset in every conceivable way (same resolution, same np.array dtype, etc.), so that the frame initialization and hence the point cloud extraction should have been correct. I verified that they were effectively the same by comparing the inputs to this method call: https://github.com/NVlabs/BundleSDF/blob/1bb46a6b16f3a190922827cb031a57da14e90d28/bundlesdf.py#L538

I noticed that the only remaining difference was the value range of the depth maps at this point. For the milk dataset, the nonzero depth values ranged from ~0.4 to ~0.6; for my dataset, they ranged from ~2.5 to ~3.5. After I preprocessed my (HO3D SM1) data so the depth values fell into the same range (by simply dividing by 6), the error was resolved and the output mesh was reasonable, so mission accomplished.

@wenbowen123 Is there some piece of code in the point cloud extraction that filters out / ignores pixels with certain depth values, perhaps in a hardcoded way? If so, this would be an important thing to resolve: it probably breaks the code for a lot of other datasets and users, and will likely be hard to identify and fix, especially given how it is nested in the C++ portion of the code.
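
For anyone hitting the same error, a sketch of the workaround described above (the 16-bit-millimeter encoding and file paths are assumptions; the divisor of 6 is specific to mapping HO3D SM1's ~2.5-3.5 m into the milk sequence's ~0.4-0.6 m, and the recovered poses/mesh will be scaled by the same factor):

```python
import cv2

# Raw depth assumed to be a 16-bit PNG in millimeters (an assumption).
depth_m = cv2.imread("depth/0000.png", cv2.IMREAD_UNCHANGED) / 1000.0
depth_scaled = depth_m / 6.0  # ~2.5-3.5 m -> ~0.4-0.6 m; divisor is dataset-specific
cv2.imwrite("depth_scaled/0000.png", (depth_scaled * 1000.0).astype("uint16"))
```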

Ale-Burzio commented 1 year ago

> @bkyCadida try shorter_side for now. Later we will add that as an option.

Changing this parameter solved the resizing issue, and I can now use the original-size images. However, I now run into the exact same issue as @redgreenblue3.

I am trying to run the program on a sequence from the BEHAVE dataset; did you have to do anything in particular to use that dataset, @wenbowen123?

Here is a short part of the sequence at the original size: https://mega.nz/file/APc0VSLL#SvFpwX6gDQIXHEm0vrZEkTtMwRdxKLU2q_ODa73VNjs

monajalal commented 1 year ago

@Ale-Burzio when I try your folder, I get this error. Have you been able to run it? It seems some pre-processing is needed before using this folder:

(py38) root@bundlesdf:/home/azureuser/BundleSDF# python run_custom.py --mode run_video --video_dir /home/azureuser/BundleSDF/stool_short/ --out_folder /home/azureuser/BundleSDF/stool_short/out --use_segmenter 1 --use_gui 0 --debug_level 2
[2023-07-10 08:13:32.295] [warning] [Bundler.cpp:49] Connected to nerf_port 9999
[2023-07-10 08:13:32.295] [warning] [FeatureManager.cpp:2084] Connected to port 5555
default_cfg {'backbone_type': 'ResNetFPN', 'resolution': (8, 2), 'fine_window_size': 5, 'fine_concat_coarse_feat': True, 'resnetfpn': {'initial_dim': 128, 'block_dims': [128, 196, 256]}, 'coarse': {'d_model': 256, 'd_ffn': 256, 'nhead': 8, 'layer_names': ['self', 'cross', 'self', 'cross', 'self', 'cross', 'self', 'cross'], 'attention': 'linear', 'temp_bug_fix': False}, 'match_coarse': {'thr': 0.2, 'border_rm': 2, 'match_type': 'dual_softmax', 'dsmax_temperature': 0.1, 'skh_iters': 3, 'skh_init_bin_score': 1.0, 'skh_prefilter': True, 'train_coarse_percent': 0.4, 'train_pad_num_gt_min': 200}, 'fine': {'d_model': 128, 'd_ffn': 128, 'nhead': 8, 'layer_names': ['self', 'cross'], 'attention': 'linear'}}
video dir is:  /home/azureuser/BundleSDF/stool_short/
etc etc etc
[bundlesdf.py] percentile denoise start
Traceback (most recent call last):
  File "run_custom.py", line 214, in <module>
    run_one_video(video_dir=args.video_dir, out_folder=args.out_folder, use_segmenter=args.use_segmenter, use_gui=args.use_gui)
  File "run_custom.py", line 114, in run_one_video
    tracker.run(color, depth, K, id_str, mask=mask, occ_mask=None, pose_in_model=pose_in_model)
  File "/home/azureuser/BundleSDF/bundlesdf.py", line 535, in run
    valid = (depth>=0.1) & (mask>0)
ValueError: operands could not be broadcast together with shapes (480,640) (1536,2048) 
monajalal commented 1 year ago

@redgreenblue3 The range of my depth values is 0.001 to 0.255, so simply dividing by one constant to match either the upper or the lower end of the milk range doesn't work. For example, to match the upper end I would need 0.255/x = 0.6, i.e. x = 0.255/0.6 ≈ 0.425.

But dividing the lower end by that gives 0.001/0.425 ≈ 0.002, which doesn't match the milk lower end (~0.4). So what should I do?

wenbowen123 commented 1 year ago

For BEHAVE data, have you tried using a different config file, BundleTrack/config_behave.yml?

There is a depth clipping threshold (z_far) set there: [screenshot of the config file]
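
A toy illustration of what such a clip does (values here are illustrative; the real threshold comes from the YAML config shown in the screenshot):

```python
import numpy as np

# Toy depth map in meters; a real frame would come from the depth PNG.
depth = np.array([[0.05, 0.5, 3.0],
                  [0.45, 2.8, 0.0]])

z_near, z_far = 0.1, 1.0  # illustrative; the real z_far comes from the config
depth[(depth < z_near) | (depth > z_far)] = 0.0
print(depth)  # only the 0.5 and 0.45 entries survive; an object at ~3 m vanishes
```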

monajalal commented 1 year ago

@wenbowen123 for our own custom datasets (not BEHAVE or the ones you provide), how should we create the config file?

Ale-Burzio commented 1 year ago

> @Ale-Burzio when I try your folder, I get this error. Have you been able to run it? It seems you need some pre-processing before using this folder
>
> ValueError: operands could not be broadcast together with shapes (480,640) (1536,2048)

I needed to change the shorter_side value in run_custom.py to match the resolution of my images
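
For the record, a sketch of that change in run_custom.py (1536 matches the shorter side of the 2048x1536 frames in the traceback above):

```python
# Excerpt from run_custom.py: make shorter_side match the shorter side of
# your input images so the depth and mask shapes agree.
reader = YcbineoatReader(video_dir=video_dir, shorter_side=1536)
```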

Ale-Burzio commented 1 year ago

@wenbowen123 @redgreenblue3 the issue was indeed the z_far threshold. It might be worth clarifying this, together with the shorter_side resolution fix, in the README, as it is not obvious; or at least point to where the config files are, or add an argument for specifying a custom parameters file.

Thanks for the help!