Closed Ale-Burzio closed 1 year ago
Can you increase the debug_level to >=3 to enable more verbose logging? https://github.com/NVlabs/BundleSDF/blob/master/run_custom.py#L199
Then you can visualize the .ply files in the output dir (inside the frame-ID folders) and check whether the point cloud is reasonable.
done, and the point cloud is not reasonable (again, very sparse points)
Here is a visualization of the first few frames: the mask is correct and the RGB image is segmented correctly, but the depth is wrong (the filtered depth is empty) https://mega.nz/file/lWtR0axI#YXC2Cv_6zWdcMlxZ8UVOOD-qUUL9iSFdoCdveKsSbaY
This does not look right. It seems the depth format is not read the same way as the milk example's. Can you first make sure you can convert the depth to a reasonable point cloud with your intrinsics? This is independent of BundleSDF.
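For that sanity check, a minimal back-projection with NumPy is enough and does not involve BundleSDF at all. The intrinsics and image size below are placeholders; substitute your own camera parameters, and make sure your depth is converted to meters first (many datasets store it in millimeters as uint16).

```python
# Sanity check: back-project a depth image into a point cloud using only
# the camera intrinsics K. All values below are illustrative placeholders.
import numpy as np

def depth_to_pointcloud(depth, K):
    """depth: (H, W) float array in meters; K: 3x3 intrinsics matrix."""
    H, W = depth.shape
    u, v = np.meshgrid(np.arange(W), np.arange(H))   # pixel coordinates
    z = depth
    x = (u - K[0, 2]) * z / K[0, 0]
    y = (v - K[1, 2]) * z / K[1, 1]
    pts = np.stack([x, y, z], axis=-1).reshape(-1, 3)
    return pts[pts[:, 2] > 0]        # drop invalid (zero-depth) pixels

# Synthetic example: a flat plane 0.5 m in front of the camera.
K = np.array([[600.0, 0.0, 320.0],
              [0.0, 600.0, 240.0],
              [0.0, 0.0, 1.0]])
depth = np.full((480, 640), 0.5)
pts = depth_to_pointcloud(depth, K)
```

If the resulting points (e.g. saved as a .ply and opened in MeshLab) do not form a plausible scene, the depth scale or intrinsics are wrong before BundleSDF ever sees them.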
They look fine, actually, although the one rescaled to 640x480 has a lot of noise (maybe because of interpolation?). But it is still very different from what I get in the pipeline.
Perhaps you can try sending the images before resizing. Also make sure your intrinsics are correct.
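One easy-to-miss detail when resizing: the intrinsics matrix must be scaled by the same factors as the image, otherwise the back-projected point cloud is distorted. A small sketch (the intrinsic values and resolutions here are illustrative, not taken from any dataset):

```python
# When images are resized, fx/cx scale with width and fy/cy with height.
import numpy as np

def scale_intrinsics(K, orig_hw, new_hw):
    """Scale a 3x3 intrinsics matrix from orig (H, W) to new (H, W)."""
    sy = new_hw[0] / orig_hw[0]
    sx = new_hw[1] / orig_hw[1]
    K = K.copy()
    K[0, :] *= sx   # fx, skew, cx scale with width
    K[1, :] *= sy   # fy, cy scale with height
    return K

# Illustrative values: downscaling a 1536x2048 frame to 480x640.
K = np.array([[976.2, 0.0, 1017.9],
              [0.0, 976.2, 787.3],
              [0.0, 0.0, 1.0]])
K_small = scale_intrinsics(K, orig_hw=(1536, 2048), new_hw=(480, 640))
```

Also note that depth maps should be resized with nearest-neighbor interpolation; bilinear interpolation blends foreground and background depths at object borders, which could explain the noise mentioned above.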
The problem with using the images before resizing is that the program crashes when, at some point, it tries to reshape something to 640x480 resolution, but I was not able to track down where that happens.
I assumed the downsizing can be specified in run_custom.py l.40:
reader = YcbineoatReader(video_dir=video_dir, shorter_side=480)
either by setting the shorter_side or by providing the downscale parameter (default=1) to YcbineoatReader.
@wenbowen123 Is that sufficient or must the image size / downscale factor also be set anywhere else?
@bkyCadida try shorter_side
for now. Later we will add that as an option.
I think I have come across a related issue: scaling a different dataset's depth map to match the range of the milk example resolved a breaking error for me. It seems that the scale of the input depth map matters, and that values above a certain threshold get disregarded during point cloud extraction.
In detail:
I recently got the following error when running BundleSDF on the SM1 scene of the HO3D dataset:
[pcl::PLYWriter::writeASCII] Input point cloud has no data!
[pcl::KdTreeFLANN::setInputCloud] Cannot create a KDTree with an empty input cloud!
[pcl::PLYWriter::writeASCII] Input point cloud has no data!
After some digging, the problem seems to come from https://github.com/NVlabs/BundleSDF/blob/1bb46a6b16f3a190922827cb031a57da14e90d28/BundleTrack/src/Frame.cpp#L162 which is called here: https://github.com/NVlabs/BundleSDF/blob/1bb46a6b16f3a190922827cb031a57da14e90d28/bundlesdf.py#L406. From what I understand, the error arises because the frame object's point cloud was constructed empty during initialization. I am guessing this point cloud construction takes place in the following call: https://github.com/NVlabs/BundleSDF/blob/1bb46a6b16f3a190922827cb031a57da14e90d28/BundleTrack/src/Frame.cpp#L125
Oddly, the error does not occur when running BundleSDF with the provided milk dataset. But it persisted even after I made sure my dataset (SM1 scene in the HO3D dataset) matched the milk dataset in every conceivable way to make sure the frame initialization and hence point cloud extraction would be correct (same resolution, same np.array dtype, etc.). I verified that they were effectively the same by comparing the inputs to this method call https://github.com/NVlabs/BundleSDF/blob/1bb46a6b16f3a190922827cb031a57da14e90d28/bundlesdf.py#L538
I noticed that the only remaining difference was the value range of the depth map at this point. For the milk dataset, depth values ranged, apart from zero values, from ~0.4 to ~0.6. For my dataset, they ranged from ~2.5 to ~3.5. After I preprocessed my (HO3D-SM1) data so that the depth values fell into the same range (by simply dividing by 6), the error was resolved and the output mesh was reasonable, so mission accomplished.
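The workaround described above can be sketched in a few lines. Note this is only a dataset-specific patch (the factor 6 maps ~2.5-3.5 m into the milk example's ~0.4-0.6 m range); it rescales the scene's metric size rather than addressing whatever far-depth cutoff is causing the filtering:

```python
# Sketch of the workaround above: divide the depth map by a constant so
# its valid values fall into the same range as the milk example.
# The factor (6 here) is specific to the HO3D-SM1 depth range.
import numpy as np

def rescale_depth(depth, factor=6.0):
    """Divide valid depth values by a constant; keep invalid pixels at 0."""
    out = depth / factor
    out[depth == 0] = 0.0
    return out

depth = np.array([[0.0, 2.5],
                  [3.0, 3.5]])       # meters, 0 = invalid
scaled = rescale_depth(depth)        # valid values now in ~0.42-0.58
```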
@wenbowen123 Is there some piece of code in the point cloud extraction that somehow filters out / ignores pixels with certain depth values, perhaps also in a hardcoded way? If this is the case this would of course be an important thing to resolve. It probably breaks the code for a lot of other datasets and users, and will likely be hard to both identify and resolve for most users, especially due to the way it is nested in the C++ portion of the code.
@bkyCadida try shorter_side for now. Later we will add that as an option.
Changing this parameter solved the resizing issue and I can now use the original-size images; however, I now run into the exact same issue as @redgreenblue3.
I am trying to run the program on a sequence from the BEHAVE dataset. Did you have to do anything in particular to use that dataset, @wenbowen123?
here a short part of the sequence with original size: https://mega.nz/file/APc0VSLL#SvFpwX6gDQIXHEm0vrZEkTtMwRdxKLU2q_ODa73VNjs
@Ale-Burzio when I try your folder, I get this error. Have you been able to run it? It seems you need some pre-processing before using this folder
(py38) root@bundlesdf:/home/azureuser/BundleSDF# python run_custom.py --mode run_video --video_dir /home/azureuser/BundleSDF/stool_short/ --out_folder /home/azureuser/BundleSDF/stool_short/out --use_segmenter 1 --use_gui 0 --debug_level 2
[2023-07-10 08:13:32.295] [warning] [Bundler.cpp:49] Connected to nerf_port 9999
[2023-07-10 08:13:32.295] [warning] [FeatureManager.cpp:2084] Connected to port 5555
default_cfg {'backbone_type': 'ResNetFPN', 'resolution': (8, 2), 'fine_window_size': 5, 'fine_concat_coarse_feat': True, 'resnetfpn': {'initial_dim': 128, 'block_dims': [128, 196, 256]}, 'coarse': {'d_model': 256, 'd_ffn': 256, 'nhead': 8, 'layer_names': ['self', 'cross', 'self', 'cross', 'self', 'cross', 'self', 'cross'], 'attention': 'linear', 'temp_bug_fix': False}, 'match_coarse': {'thr': 0.2, 'border_rm': 2, 'match_type': 'dual_softmax', 'dsmax_temperature': 0.1, 'skh_iters': 3, 'skh_init_bin_score': 1.0, 'skh_prefilter': True, 'train_coarse_percent': 0.4, 'train_pad_num_gt_min': 200}, 'fine': {'d_model': 128, 'd_ffn': 128, 'nhead': 8, 'layer_names': ['self', 'cross'], 'attention': 'linear'}}
video dir is: /home/azureuser/BundleSDF/stool_short/
etc etc etc
[bundlesdf.py] percentile denoise start
Traceback (most recent call last):
File "run_custom.py", line 214, in <module>
run_one_video(video_dir=args.video_dir, out_folder=args.out_folder, use_segmenter=args.use_segmenter, use_gui=args.use_gui)
File "run_custom.py", line 114, in run_one_video
tracker.run(color, depth, K, id_str, mask=mask, occ_mask=None, pose_in_model=pose_in_model)
File "/home/azureuser/BundleSDF/bundlesdf.py", line 535, in run
valid = (depth>=0.1) & (mask>0)
ValueError: operands could not be broadcast together with shapes (480,640) (1536,2048)
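The broadcast error in the traceback above means the depth map and the mask were loaded at different resolutions (480x640 vs 1536x2048), so `(depth>=0.1) & (mask>0)` cannot be evaluated. A purely illustrative pre-flight check (this helper is hypothetical, not part of BundleSDF) would catch the mismatch before the tracker runs:

```python
# Illustrative check: all per-frame inputs must share one resolution
# before tracker.run() is called, or NumPy broadcasting will fail.
import numpy as np

def check_frame_shapes(color, depth, mask):
    """Raise a clear error if per-frame inputs disagree in resolution."""
    shapes = {"color": color.shape[:2], "depth": depth.shape, "mask": mask.shape}
    if len(set(shapes.values())) != 1:
        raise ValueError(f"resolution mismatch: {shapes}")

# Reproduce the mismatch from the traceback above.
color = np.zeros((480, 640, 3), dtype=np.uint8)
depth = np.zeros((480, 640), dtype=np.float32)
mask = np.zeros((1536, 2048), dtype=np.uint8)
try:
    check_frame_shapes(color, depth, mask)
    msg = ""
except ValueError as e:
    msg = str(e)
```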
@redgreenblue3 The range of my depth values is 0.001 to 0.255, so simply dividing to match either the upper or the lower end of the milk range doesn't work. For example, if I try to match the upper end: 0.255/x = 0.6, so x = 0.255/0.6 ≈ 0.425.
But if I then divide 0.001 by that: 0.001/0.425 ≈ 0.0024, which doesn't match the lower end. So what should I do?
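A short arithmetic check shows why no single factor can work here: dividing by a constant preserves the ratio between the largest and smallest values. A range of 0.001-0.255 has a max/min ratio of 255, while the milk example's ~0.4-0.6 has a ratio of only 1.5. The near-zero values are therefore most likely invalid or background pixels that should be masked out; the correct factor should come from the dataset's depth encoding (e.g. a normalized 8-bit or 16-bit depth), not from fitting the range:

```python
# Division by a constant cannot change the max/min ratio of the values.
lo, hi = 0.001, 0.255
scale = hi / 0.6            # factor that maps the upper end onto 0.6 m
mapped_lo = lo / scale      # where the lower end lands: ~0.0024, not ~0.4
ratio = hi / lo             # ~255, versus 0.6 / 0.4 = 1.5 for milk
```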
For the BEHAVE data, have you tried using the dedicated config file BundleTrack/config_behave.yml? There is a depth clipping threshold set in that config.
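For custom datasets, the relevant setting is the far-depth clipping threshold mentioned in this thread (later confirmed as z_far). A hypothetical sketch of what that fragment might look like; the exact key names and structure are assumptions, so compare against the actual BundleTrack/config_behave.yml:

```yml
# Hypothetical config fragment — key names are illustrative, check
# BundleTrack/config_behave.yml for the real ones.
depth_processing:
  z_near: 0.1   # points closer than this are discarded
  z_far: 3.5    # points farther than this are discarded; raise this if
                # your object sits farther from the camera than in the
                # milk example (~0.4-0.6 m)
```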
@wenbowen123 for our own custom datasets (not BEHAVE or the one you have), how should we create the config file?
@Ale-Burzio when I try your folder, I get this error. Have you been able to run it? It seems you need some pre-processing before using this folder
I needed to change the shorter_side value in run_custom.py to match the resolution of my images
@wenbowen123 @redgreenblue3 the issue was indeed the z_far threshold. It would help to clarify this, plus the shorter_side resolution fix, in the README, as it is not so clear right now, or at least to point to where the config files are / add an argument for specifying a custom parameter file.
Thanks for the help!
Thanks for the great work!
I have tried BundleSDF with the following sequence from the BEHAVE dataset: https://mega.nz/file/sXUg0YBT#OVmRQHPhMvpr6FdNenef2VhrtinDZxUWj0bjzAN9VSQ
I have formatted everything as indicated in the readme, as well as rescaled the images to 480x640 resolution (the pipeline would otherwise crash) and corrected the intrinsics matrix. I tested the sequence on BundleTrack and it works fine there (although with poor results).
The problem seems to be in the generation of the point cloud from depth, as the saved point cloud in the results folder is almost empty (apart from a few sporadic points). The demo milk sequence works fine, however. What am I doing wrong?
(The pipeline was tested on an RTX 2080 Ti GPU, if that is relevant.)