GuanxingLu / ManiGaussian

[ECCV 2024] ManiGaussian: Dynamic Gaussian Splatting for Multi-task Robotic Manipulation
MIT License
152 stars 6 forks

Question about rlbench depth data #19

Open kjeiun opened 1 month ago

kjeiun commented 1 month ago
[image]

I found that the depth data generated by gen_demonstration looks quite different from other depth data. Do you think this is an intended result?

cheng052 commented 1 month ago

This seems ok. Since the depth data is saved as a single-channel image, such visualization results seem to be a display issue. The converted point cloud looks good in Meshlab.

[image]
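For reference, the depth-to-point-cloud conversion checked in Meshlab can be sketched as a standard pinhole back-projection. This is an illustrative sketch, not the repo's actual code; the function name and intrinsics are assumed:

```python
import numpy as np

def depth_to_point_cloud(depth, fx, fy, cx, cy):
    """Back-project a metric depth map of shape (H, W) into an (H*W, 3)
    point cloud in the camera frame, using a pinhole camera model.

    fx, fy: focal lengths in pixels; cx, cy: principal point.
    """
    h, w = depth.shape
    # Pixel coordinate grids: u runs along columns, v along rows.
    u, v = np.meshgrid(np.arange(w), np.arange(h))
    z = depth
    x = (u - cx) * z / fx
    y = (v - cy) * z / fy
    return np.stack([x, y, z], axis=-1).reshape(-1, 3)
```

Saving the resulting (N, 3) array as a PLY/XYZ file lets tools like Meshlab sanity-check the depth data visually.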

Btw, have you ever reproduced the results reported in the paper? It seems hard to reproduce the performance shown there.

kjeiun commented 1 month ago

Thanks for the reply !

I haven't tried the evaluation, since there was a problem running the evaluation code inside Docker, but the reconstructed images shown in wandb didn't look good.

Do you mean the success rate?

cheng052 commented 1 month ago
Yes, the success rate. Here are some results I got (the last 4 rows are reproduced on my local machine; all experiments used 2 RTX 3090 GPUs).

| Method | lr_scheduler | geo | dyna | sem | step | comment | AVG | extra |
| --- | --- | --- | --- | --- | --- | --- | --- | --- |
| GNFactor | FALSE | 1 | 0 | 1 | 100000 | paper report | 31.7 | |
| ManiGaussian | TRUE | 1 | 0 | 0 | 100000 | paper report | 39.2 | |
| ManiGaussian | TRUE | 1 | 1 | 0 | 100000 | paper report | 41.6 | |
| ManiGaussian | TRUE | 1 | 1 | 1 | 100000 | paper report | 44.8 | |
| GNFactor | FALSE | 1 | 0 | 1 | 100000 | released ckpt | 38.4 | |
| GNFactor | FALSE | 1 | 0 | 1 | 100000 | released csv | 36.13 | |
| ManiGaussian | Unknown | 1 | 0 | 0 | 100000 | released csv | 41.07 | |
| GNFactor | FALSE | 1 | 0 | 1 | 100000 | Local Reproduction | 38 | |
| ManiGaussian | FALSE | 1 | 0 | 0 | 100000 | Local Reproduction | 32.8 | media_images_eval_recon_img_100000_3633c27ad4ea2549c5c1 |
| ManiGaussian | FALSE | 1 | 1 | 0 | 100000 | Local Reproduction | 34 | mg_geo_dyna |
| ManiGaussian | TRUE | 1 | 1 | 1 | 100000 | Local Reproduction | 29.6 | mg_geo_dyna_sem |

The GNFactor performance matches the paper, but ManiGaussian falls short. Given the table and the reconstructed images, I suspect there are still some hidden bugs in the released code. @GuanxingLu Any suggestions on reproducing the reported performance?

GuanxingLu commented 1 month ago

Sorry for the late reply. The reconstruction results look normal: since the action loss plays the main role in the optimization, the reconstruction is expected to look relatively poor. Reconstruction quality does not affect action prediction, because we decode the robot action from the volumetric representation rather than from the Gaussians (in the test phase, the Gaussian regressor and deformation field are not used).

However, although the training and evaluation processes still fluctuate even with the seed fixed, the provided scripts should reproduce the results without problems. Thanks for your detailed experimental logs. A few things to try:

1. Evaluate the "best" checkpoint rather than the "last" one (maybe 90000 steps); sometimes the performance of the "last" checkpoint drops slightly.
2. Simply run the evaluation on the same checkpoint again.

GuanxingLu commented 1 month ago

> This seems ok. Since the depth data is saved as a single-channel image, such visualization results seem to be a display issue. The converted point cloud looks good in Meshlab.
>
> Btw, have you ever reproduced the results reported in the paper? It seems hard to reproduce the performance shown there.

Thanks for your answer. Yes, the depth image is quantized by RLBench, so the visualization may look strange. See https://github.com/GuanxingLu/ManiGaussian/blob/main/third_party/RLBench/rlbench/backend/utils.py for more details.
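To illustrate the kind of quantization involved, here is a minimal round-trip sketch of packing a normalized float depth map into the channels of an 8-bit image and recovering it on load. This is an assumed scheme for illustration only; the constants and function names are not RLBench's, so see the linked utils.py for the real encoding:

```python
import numpy as np

# Depth in [0, 1] is quantized to 24 bits spread across R, G, B channels.
DEPTH_SCALE = 2 ** 24 - 1

def float_depth_to_rgb(depth):
    """Quantize a float depth map in [0, 1] into a (H, W, 3) uint8 image."""
    packed = np.round(depth * DEPTH_SCALE).astype(np.uint32)
    r = (packed >> 16) & 0xFF
    g = (packed >> 8) & 0xFF
    b = packed & 0xFF
    return np.stack([r, g, b], axis=-1).astype(np.uint8)

def rgb_to_float_depth(rgb):
    """Recover a float depth map in [0, 1] from the packed uint8 image."""
    rgb = rgb.astype(np.uint32)
    packed = (rgb[..., 0] << 16) | (rgb[..., 1] << 8) | rgb[..., 2]
    return packed.astype(np.float64) / DEPTH_SCALE
```

Viewed as an ordinary RGB image, such a packed depth map looks like noise or banding, which is consistent with the "weird" visualization discussed above; the values are only meaningful after decoding.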