Hi, I have had similar issues with the FrankaPickPixels task. I tried 3 seeds (91, 1729, 625): two of them failed, while one reached a 10% success rate (after 2000 epochs). I am also using IsaacGym Preview 2, as suggested in the README. On the other hand, performance on FrankaReachPixels seems good — most of the seeds worked in that setting.
Similarly, FrankaPickObjectPixels did not work in my case either (it is clearly harder than FrankaReachPixels).
Is there some specific hyperparameter that we should be tuning? I tried a few different learning rates, but to no avail.
Hi @Alxead and @mohitsharma0690, thanks for trying out the code and taking the time to share your results!
All of the experiments in the paper were run with Isaac Gym Preview 2 and with seeds 0, 1, 2, 3, 4. A few other groups have told us that they reproduced the results with Isaac Gym Preview 3 as well.
As an additional sanity check, following your post we created a fresh environment with Isaac Gym Preview 3 and re-ran FrankaPickPixels (env here). The resulting curves seem consistent with the ones from the paper. Please see below:
All of these runs use the default config from this repo:
python tools/train.py task=FrankaPickPixels logdir=exps/seed-N train.seed=N
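In case it is useful, here is a minimal sketch for launching all five seeds in sequence. It is not something shipped with the repo; it just wraps the command above and assumes the tools/train.py entry point and the exps/ layout shown there:

```python
# Sketch only: run the default FrankaPickPixels config for seeds 0-4,
# mirroring the command line above (assumes it is run from the repo root).
import subprocess

for seed in range(5):  # seeds 0, 1, 2, 3, 4 as used in the paper
    subprocess.run(
        [
            "python", "tools/train.py",
            "task=FrankaPickPixels",
            f"logdir=exps/seed-{seed}",
            f"train.seed={seed}",
        ],
        check=True,  # stop early if one of the runs crashes
    )
```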
Hope this helps. Please let us know if you have any further questions!
Hi @ir413, I created a new conda environment following your configuration and tried your seeds 0, 1, 2, 3, 4. In one of the five runs, the success rate was zero. If you try more seeds, do you get a 0 success rate? Do you use the same seeds (0, 1, 2, 3, 4) for all the experiments in the paper?
Hi @Alxead, I suspect there might be some nondeterminism due to machine and system setup even when the seed is fixed. Following your post, we ran the remaining seeds through seed=10. The curves seem consistent. Please see below:
We use the same seeds (0, 1, 2, 3, 4) for all of the experiments in the paper. In addition, we sweep the learning rate and perform the same number of runs for each method in the paper (see Section 4 for more details).
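To clarify what "nondeterminism even when the seed is fixed" refers to, here is a rough sketch of the usual seed-fixing steps (not the repo's exact code). Even with all of these set, GPU kernels, cuDNN autotuning, and the physics simulation itself can still behave differently across machines and driver versions, which is where run-to-run differences can come from:

```python
# Sketch only: typical seed fixing in a PyTorch training script.
import random
import numpy as np
import torch

def set_seed(seed: int) -> None:
    random.seed(seed)                  # Python RNG
    np.random.seed(seed)               # NumPy RNG
    torch.manual_seed(seed)            # CPU RNG
    torch.cuda.manual_seed_all(seed)   # GPU RNGs
    # Optional: trades speed for more reproducible cuDNN behavior.
    torch.backends.cudnn.deterministic = True
    torch.backends.cudnn.benchmark = False
    # Note: the simulator (PhysX) and some GPU ops remain sources of
    # nondeterminism that these flags do not control.
```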
Hi, thanks for the great project!
I can reproduce the performance of the oracle state model, which shows good stability. However, when I try to reproduce the MVP results on FrankaPick, I have some difficulties. I use the pre-trained MAE you provided and 5 seeds: 333, 444, 555, 666, 777. It seems that IsaacGym Preview 3 can only be downloaded from the official website now. I use wandb to log the results. The success rates are shown in the figure below; the success rate has been zero in two out of five runs.
The final mean success rate is around 53%, shown below, which is far behind the performance (about 85%) reported in the paper.
How can I obtain the results reported in the paper? Is it normal that the success rate on the task is often 0? The paper suggests MVP is stable across different seeds (low variance, Figure 5); which seeds do you use?
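A quick back-of-the-envelope check with hypothetical per-seed numbers (not my actual logged values) shows how two zero-success runs drag the mean down even if the other three are near the paper's numbers:

```python
# Sketch only: hypothetical final success rates (%) for five seeds,
# three near paper-level performance and two complete failures.
import numpy as np

success = np.array([88.0, 85.0, 92.0, 0.0, 0.0])
print(success.mean())  # ~53, consistent with the average I observe
print(success.std())   # ~43, i.e. very high variance across seeds
```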
Hi, have you modified the environment config file of FrankaPickPixels to adapt it to IsaacGym Preview 3? In my case, creating the FrankaPickPixels environment fails directly, while the FrankaPick environment works. I am new to the robot control field and have not found a tutorial on creating environments with IsaacGym. Any help would be appreciated.
Hi, congratulations on a successful run on the new IsaacGym version. I tried that but failed when basing my setup on the environment in INSTALL.md. Can you share some details about your environment (Ubuntu or Debian, PyTorch, CUDA, cuDNN, NVIDIA driver version)? Thank you very much!
Closing this since the original issue seems resolved. @jasonseu, @hekj: sorry for the late response. Your questions seem different from the original issue; could you please open new issues to keep the discussion focused? I will respond there.