rail-berkeley / rlkit

Collection of reinforcement learning algorithms
MIT License

Visualizing the Training and Running the Trained Policy using RIG #121

Open amir-ramezani-ai opened 4 years ago

amir-ramezani-ai commented 4 years ago

Thanks for your repository.

I am trying to train RIG using the following command:

~/rlkit-0.1.2$ python3.6 examples/rig/pusher/rig.py

The training proceeds, but can I visualize it as well?

I am also trying to run the trained policy using the following command:

~/rlkit-0.1.2$ python3.6 scripts/sim_goal_conditioned_policy.py /home/caias18/rlkit-0.1.2/data/09-08-rlkit-pusher-rig-example/09-08-rlkit-pusher-rig-example_2020_09_08_13_43_34_0000--s-47482/params.pkl

However, at run time the MuJoCo window is black (please see the following image): https://drive.google.com/file/d/1bE7fNQn6xS2xJcbdqdG75EUgSNJ2hWEf/view?usp=sharing

I would appreciate your guidance.

vitchyr commented 4 years ago

Can you try passing the --enable_render flag to the script?

amir-ramezani-ai commented 4 years ago

Thanks for the reply.

It has no effect on either rig.py or sim_goal_conditioned_policy.py.

For rig.py there is a show=True option, which shows the VAE.

I am running the code on Ubuntu 18. Could this be the issue?

amir-ramezani-ai commented 4 years ago

I have a similar issue on Ubuntu 16 as well.

amir-ramezani-ai commented 4 years ago

On Ubuntu 16, when I try to simulate the policy, I get the following error:

~/rlkit-0.1.2$ python3.6 scripts/sim_goal_conditioned_policy.py /home/caias2/rlkit-0.1.2/data/09-10-rlkit-pusher-rig-example/09-10-rlkit-pusher-rig-example_2020_09_10_14_32_15_0000--s-65092/params.pkl 
pygame 1.9.6
Hello from the pygame community. https://www.pygame.org/contribute.html
Traceback (most recent call last):
  File "scripts/sim_goal_conditioned_policy.py", line 62, in <module>
  File "scripts/sim_goal_conditioned_policy.py", line 14, in simulate_policy
    policy = data['policy']
TypeError: 'ConvVAE' object is not subscriptable
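
The TypeError above means that the object loaded from params.pkl is a ConvVAE instance, not a dict with a 'policy' key. One possible explanation (an assumption, not confirmed anywhere in this thread) is that this file was written during VAE training, before any policy snapshot was saved. A minimal sketch for checking what a snapshot actually contains follows. `describe_snapshot` and `load_and_describe` are hypothetical helper names, not part of rlkit, and using pickle as the loader is also an assumption.

```python
import pickle


def describe_snapshot(obj):
    """Summarize a loaded snapshot without assuming it is a dict."""
    if isinstance(obj, dict):
        keys = ", ".join(sorted(str(k) for k in obj))
        return f"dict with keys: {keys}"
    return f"{type(obj).__name__} object (not a dict)"


def load_and_describe(path, loader=pickle.load):
    """Load a snapshot file and describe its top-level object.

    `loader` defaults to pickle.load, which is an assumption: pass whichever
    loader the simulation script itself uses. Unpickling a real rlkit snapshot
    also requires rlkit to be importable. Only load files you trust.
    """
    with open(path, "rb") as f:
        return describe_snapshot(loader(f))
```

Calling `load_and_describe` on the params.pkl path from the command above would report either the dict keys or the type of the non-dict object, which shows whether the file holds a policy at all.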

gym version: '0.10.5'
mujoco version: 1.5
mujoco path: ~/.mujuco/

As for MuJoCo, I can run the simulate program and drag and drop the humanoid (so I guess the license is fine).

Is there any other package I need to check?

The output of running pusher/rig.py looks like this:

bug/MSE improvement over random Min       -0.00426958
2020-09-10 14:42:18.153028 KST | [09-10-rlkit-pusher-rig-example_2020_09_10_14_32_15_0000--s-65092] debug/MSE of random decoding Mean            0.0180728
2020-09-10 14:42:18.153082 KST | [09-10-rlkit-pusher-rig-example_2020_09_10_14_32_15_0000--s-65092] debug/MSE of random decoding Std             0.0715821
2020-09-10 14:42:18.153135 KST | [09-10-rlkit-pusher-rig-example_2020_09_10_14_32_15_0000--s-65092] debug/MSE of random decoding Max             0.999998
2020-09-10 14:42:18.153189 KST | [09-10-rlkit-pusher-rig-example_2020_09_10_14_32_15_0000--s-65092] debug/MSE of random decoding Min             0
2020-09-10 14:42:18.153243 KST | [09-10-rlkit-pusher-rig-example_2020_09_10_14_32_15_0000--s-65092] debug/MSE of reconstruction                  0.0121482
2020-09-10 14:42:18.153297 KST | [09-10-rlkit-pusher-rig-example_2020_09_10_14_32_15_0000--s-65092] test/Log Prob                           -12456.6
2020-09-10 14:42:18.153351 KST | [09-10-rlkit-pusher-rig-example_2020_09_10_14_32_15_0000--s-65092] test/KL                                     27.5673
2020-09-10 14:42:18.153406 KST | [09-10-rlkit-pusher-rig-example_2020_09_10_14_32_15_0000--s-65092] test/loss                                12458.8
2020-09-10 14:42:18.153459 KST | [09-10-rlkit-pusher-rig-example_2020_09_10_14_32_15_0000--s-65092] beta                                         0.078125
2020-09-10 14:42:18.153497 KST | [09-10-rlkit-pusher-rig-example_2020_09_10_14_32_15_0000--s-65092] --------------------------------------  ---------------
2020-09-10 14:42:20.276780 KST | [09-10-rlkit-pusher-rig-example_2020_09_10_14_32_15_0000--s-65092] --------------------------------------  ---------------
2020-09-10 14:42:20.276884 KST | [09-10-rlkit-pusher-rig-example_2020_09_10_14_32_15_0000--s-65092] train/epoch                                296
2020-09-10 14:42:20.276951 KST | [09-10-rlkit-pusher-rig-example_2020_09_10_14_32_15_0000--s-65092] train/Log Prob                          -12354.6
2020-09-10 14:42:20.277004 KST | [09-10-rlkit-pusher-rig-example_2020_09_10_14_32_15_0000--s-65092] train/KL                                    34.2708
2020-09-10 14:42:20.277053 KST | [09-10-rlkit-pusher-rig-example_2020_09_10_14_32_15_0000--s-65092] train/loss                               12357.2
2020-09-10 14:42:20.277090 KST | [09-10-rlkit-pusher-rig-example_2020_09_10_14_32_15_0000--s-65092] debug/MSE improvement over random Mean       0.0136384
2020-09-10 14:42:20.277124 KST | [09-10-rlkit-pusher-rig-example_2020_09_10_14_32_15_0000--s-65092] debug/MSE improvement over random Std        0.00382455
2020-09-10 14:42:20.277157 KST | [09-10-rlkit-pusher-rig-example_2020_09_10_14_32_15_0000--s-65092] debug/MSE improvement over random Max        0.0223958
2020-09-10 14:42:20.277190 KST | [09-10-rlkit-pusher-rig-example_2020_09_10_14_32_15_0000--s-65092] debug/MSE improvement over random Min        0.00449176
2020-09-10 14:42:20.277223 KST | [09-10-rlkit-pusher-rig-example_2020_09_10_14_32_15_0000--s-65092] debug/MSE of random decoding Mean            0.017403
2020-09-10 14:42:20.277255 KST | [09-10-rlkit-pusher-rig-example_2020_09_10_14_32_15_0000--s-65092] debug/MSE of random decoding Std             0.0822392
2020-09-10 14:42:20.277288 KST | [09-10-rlkit-pusher-rig-example_2020_09_10_14_32_15_0000--s-65092] debug/MSE of random decoding Max             1
2020-09-10 14:42:20.277321 KST | [09-10-rlkit-pusher-rig-example_2020_09_10_14_32_15_0000--s-65092] debug/MSE of random decoding Min             0
2020-09-10 14:42:20.277353 KST | [09-10-rlkit-pusher-rig-example_2020_09_10_14_32_15_0000--s-65092] debug/MSE of reconstruction                  0.00376459
2020-09-10 14:42:20.277386 KST | [09-10-rlkit-pusher-rig-example_2020_09_10_14_32_15_0000--s-65092] test/Log Prob                           -12465
2020-09-10 14:42:20.277427 KST | [09-10-rlkit-pusher-rig-example_2020_09_10_14_32_15_0000--s-65092] test/KL                                     27.2501
2020-09-10 14:42:20.277461 KST | [09-10-rlkit-pusher-rig-example_2020_09_10_14_32_15_0000--s-65092] test/loss                                12467.1
2020-09-10 14:42:20.277494 KST | [09-10-rlkit-pusher-rig-example_2020_09_10_14_32_15_0000--s-65092] beta                                         0.078125
2020-09-10 14:42:20.277527 KST | [09-10-rlkit-pusher-rig-example_2020_09_10_14_32_15_0000--s-65092] --------------------------------------  ---------------

amir-ramezani-ai commented 3 years ago

On Ubuntu 16, I get the following error:

RuntimeError: Window rendering not supported

After adding the following line to .bashrc:

export LD_PRELOAD=/usr/lib/x86_64-linux-gnu/libGLEW.so:/usr/lib/nvidia-384/libGL.so

I get this error instead:

RuntimeError: Failed to initialize OpenGL
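
The nvidia-384 directory in the LD_PRELOAD value is specific to one driver version, so one thing worth ruling out is that a preloaded path does not exist on the machine. The sketch below only checks for that. It is not a diagnosis of the OpenGL failure, and `missing_preload_entries` is a hypothetical helper name.

```python
import os
import re


def missing_preload_entries(ld_preload):
    """Return the LD_PRELOAD entries that do not exist on disk.

    Entries may be separated by colons or whitespace.
    """
    entries = [e for e in re.split(r"[:\s]+", ld_preload) if e]
    return [e for e in entries if not os.path.exists(e)]


if __name__ == "__main__":
    # Report any preloaded library path that is missing in this environment.
    for path in missing_preload_entries(os.environ.get("LD_PRELOAD", "")):
        print("missing:", path)
```

Running this in the same shell that launches the script lists any preloaded library that cannot be found. An empty result means the paths exist and the cause lies elsewhere.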

On Ubuntu 18, I get the following error:

14 2.718439817428589
15 2.730947256088257
16 2.675100564956665
17 2.719597101211548
Traceback (most recent call last):
  File "examples/rig/pusher/rig.py", line 92, in <module>
    use_gpu=True,  # Turn on if you have a GPU
  File "/home/caias18/rlkit-0.1.2/rlkit/launchers/launcher_util.py", line 585, in run_experiment
  File "/home/caias18/rlkit-0.1.2/rlkit/launchers/launcher_util.py", line 166, in run_experiment_here
    return experiment_function(variant)
  File "/home/caias18/rlkit-0.1.2/rlkit/launchers/rig_experiments.py", line 41, in grill_her_td3_full_experiment
  File "/home/caias18/rlkit-0.1.2/rlkit/launchers/rig_experiments.py", line 363, in grill_her_td3_experiment
  File "/home/caias18/rlkit-0.1.2/rlkit/core/rl_algorithm.py", line 143, in train
  File "/home/caias18/rlkit-0.1.2/rlkit/core/rl_algorithm.py", line 173, in train_online
  File "/home/caias18/rlkit-0.1.2/rlkit/core/rl_algorithm.py", line 325, in _end_epoch
    post_epoch_func(self, epoch)
  File "/home/caias18/rlkit-0.1.2/rlkit/launchers/rig_experiments.py", line 537, in save_video
  File "/home/caias18/rlkit-0.1.2/rlkit/launchers/rig_experiments.py", line 558, in temporary_mode
    return_val = func(*args, **kwargs)
  File "/home/caias18/rlkit-0.1.2/rlkit/util/video.py", line 76, in dump_video
    (N, horizon + 1, H + 2 * pad_length, W + 2 * pad_length, num_channels))
ValueError: cannot reshape array of size 114307200 into shape (18,101,252,84,3)

Any idea how to fix these?

abcdsaltfish commented 1 year ago

There are two occurrences of "horizon + 1". You can simply change both to "horizon" to make rig.py run. I don't know why it works, but it does.
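
The numbers in the ValueError above are consistent with this fix. The check below uses only the values printed in the traceback (the variable names are illustrative). It shows that the array holds exactly 18 x 100 frames of 252 x 84 x 3 values, one frame per rollout fewer than the 101 the target shape expects.

```python
from math import prod

# Values copied from the ValueError in the traceback.
array_size = 114307200
# Shape tuple is (N, horizon + 1, H + 2 * pad_length, W + 2 * pad_length, num_channels).
target_shape = (18, 101, 252, 84, 3)

values_per_frame = prod(target_shape[2:])  # 252 * 84 * 3
n_rollouts = target_shape[0]

# How many frames per rollout does the array actually contain?
frames_per_rollout, remainder = divmod(array_size, n_rollouts * values_per_frame)

print("values the target shape needs:", prod(target_shape))
print("frames per rollout in the array:", frames_per_rollout)  # 100
print("leftover values:", remainder)                           # 0
```

So each rollout contains 100 frames, not 101, which is presumably why replacing "horizon + 1" with "horizon" lets the reshape succeed. Why the rollouts are one frame short is not established in this thread.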