MishaLaskin / rad

RAD: Reinforcement Learning with Augmented Data

Cannot reproduce #17

Open recordmp3 opened 3 years ago

recordmp3 commented 3 years ago

Dear author,

Could you please provide a complete command for running RAD on DMC (for example, for "CartPole-SwingUp")?

I cannot reproduce the CartPole-SwingUp results in the paper by running the command in script/run.sh.

The command in run.sh does not seem to match the hyperparameters listed in the paper (for example, the batch size is 512 in the paper but 128 in run.sh). I changed them, but I still cannot reproduce the paper's results.

Here are the commands I ran for these experiments:

  1. SAC-pixel

    It should attain reward≈200 after 100k environment steps (12.5k policy steps, since action_repeat = 8), but what I got is higher (around 250 or 300).

    CUDA_VISIBLE_DEVICES=0 python train.py \
        --domain_name cartpole \
        --task_name swingup \
        --encoder_type pixel --work_dir ./tmp \
        --action_repeat 8 --num_eval_episodes 10 \
        --pre_transform_image_size 100 --image_size 84 \
        --agent rad_sac --frame_stack 3 --data_augs no_aug \
        --seed 234567 --critic_lr 1e-3 --actor_lr 1e-3 --eval_freq 2500 --batch_size 512 --num_train_steps 12500 --latent_dim 50

  2. RAD(translate)

    It should attain reward≈828 after 100k environment steps (12.5k policy steps), but what I got is much lower (around 50).

    CUDA_VISIBLE_DEVICES=0 python train.py \
        --domain_name cartpole \
        --task_name swingup \
        --encoder_type pixel --work_dir ./tmp \
        --action_repeat 8 --num_eval_episodes 10 \
        --pre_transform_image_size 100 --image_size 84 \
        --agent rad_sac --frame_stack 3 --data_augs translate \
        --seed 234567 --critic_lr 1e-3 --actor_lr 1e-3 --eval_freq 2500 --batch_size 512 --num_train_steps 12500 --latent_dim 50
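For reference, the step arithmetic behind both commands can be written out as a small Python sketch. The helper name `policy_steps` is hypothetical and is not part of the repository; it only restates the conversion used above, where each policy step is repeated for `action_repeat` environment steps.

```python
def policy_steps(env_steps: int, action_repeat: int) -> int:
    """Number of agent decisions needed to consume `env_steps` simulator steps.

    Hypothetical helper for illustration only; not taken from the RAD code base.
    """
    if env_steps % action_repeat != 0:
        raise ValueError("env_steps is not a multiple of action_repeat")
    return env_steps // action_repeat


# 100k environment steps with action_repeat = 8, as in the commands above
print(policy_steps(100_000, 8))  # 12500, the value passed to --num_train_steps
```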

I sincerely look forward to your reply!

TaoHuang13 commented 2 years ago

I think image_size should be 108 when doing translate.
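To illustrate why the output size matters, here is a minimal NumPy sketch of a translate-style augmentation. It assumes that translation places each rendered frame at a random offset inside a larger zero-filled canvas; the function `translate_sketch` is hypothetical and is not the repository's implementation. Under that assumption, a 100x100 input (`--pre_transform_image_size 100`) cannot be translated into an 84x84 output, while 108 leaves 8 pixels of slack in each dimension.

```python
import numpy as np


def translate_sketch(imgs: np.ndarray, out_size: int, rng: np.random.Generator) -> np.ndarray:
    """Place each (C, H, W) image at a random offset inside an out_size x out_size zero canvas.

    Illustrative assumption of how a translate augmentation could work;
    not copied from the RAD code base.
    """
    n, c, h, w = imgs.shape
    if out_size < h or out_size < w:
        raise ValueError(f"out_size={out_size} is smaller than the input ({h}x{w})")
    out = np.zeros((n, c, out_size, out_size), dtype=imgs.dtype)
    # One random top-left corner per image; the upper bound is exclusive.
    tops = rng.integers(0, out_size - h + 1, size=n)
    lefts = rng.integers(0, out_size - w + 1, size=n)
    for i in range(n):
        out[i, :, tops[i]:tops[i] + h, lefts[i]:lefts[i] + w] = imgs[i]
    return out


rng = np.random.default_rng(0)
# frame_stack 3 with (assumed) RGB frames gives 9 channels
batch = np.ones((4, 9, 100, 100), dtype=np.uint8)
print(translate_sketch(batch, 108, rng).shape)  # (4, 9, 108, 108)
```

Calling `translate_sketch(batch, 84, rng)` raises a `ValueError` in this sketch, which mirrors the mismatch between `--pre_transform_image_size 100` and `--image_size 84` in the translate command above.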

longfeizhang617 commented 2 years ago

> I think image_size should be 108 when doing translate.

However, when I change image_size to 108, the result still shows a large gap compared with the result in the paper. Have you reproduced the experiment using translate?