microsoft / IBAC-SNI

Code to reproduce the NeurIPS 2019 paper "Generalization in Reinforcement Learning with Selective Noise Injection and Information Bottleneck" by Maximilian Igl, Kamil Ciosek, Yingzhen Li, Sebastian Tschiatschek, Cheng Zhang, Sam Devlin and Katja Hofmann.
https://arxiv.org/abs/1910.12911

The agent learns nothing in the Grid-world environment #9

Closed: liuqi8827 closed this 3 years ago

liuqi8827 commented 3 years ago

Hi, thanks for your great work. I'm facing a problem with the visualized performance of the agent in the Grid-world environment. Training completes successfully. However, when I run visualize.py, the agent appears to have learned nothing: it chooses its actions randomly and never succeeds.

  1. I ran python -m scripts.train --frames 100000000 --algo ppo --env MiniGrid-MultiRoom-N3r-v0 --model N3r-vib1e6 --save-interval 100 --tb --fullObs --model_type default2 --use_bottleneck --beta 0.000001 and got the model parameters in the folder /storage/N3r-vib1e6
  2. I ran python -m scripts.train --frames 100000000 --algo ppo --env MiniGrid-MultiRoom-N3r-v0 --model N3r-vibS1e6 --save-interval 100 --tb --fullObs --model_type default2 --use_bottleneck --beta 0.000001 --sni_type vib and got the model parameters in the folder /storage/N3r-vibS1e6
  3. I ran plots.py and got the picture below: Figure_1
  4. The plot from step 3 shows the return increasing monotonically. However, when I run visualize.py for N3r-vib1e6 and N3r-vibS1e6 respectively, the agent appears to have learned nothing in the Grid-world environment: it chooses its actions randomly and never succeeds.

Can you give some suggestions?

Thanks a lot!

maximilianigl commented 3 years ago

Hey,

thank you for the detailed description! Just to check first, are you passing --fullObs to visualize.py as well?

liuqi8827 commented 3 years ago

Thanks for your quick reply.

You are right. I did not pass --fullObs to visualize.py. When I run python3 -m scripts.visualize --env MiniGrid-MultiRoom-N3r-v0 --model N3r-vib1e6 --fullObs, it works well.

Thanks a lot!

maximilianigl commented 3 years ago

Great! I'm closing this, but let me know if you have any other questions!
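
The mismatch resolved in this thread (training with --fullObs, then visualizing without it) is the kind of error that could be caught automatically. The following is only a minimal sketch of one way to do so, not code from this repository: the file name train_flags.json and the helpers save_train_flags and check_eval_flags are hypothetical, and it assumes the training and visualization scripts can each expose their parsed flags as a plain dictionary.

```python
import json
from pathlib import Path

# Hypothetical file name; the repository may store its settings differently.
CONFIG_NAME = "train_flags.json"


def save_train_flags(model_dir, flags):
    """Write the flags used for training next to the saved model."""
    model_dir = Path(model_dir)
    model_dir.mkdir(parents=True, exist_ok=True)
    (model_dir / CONFIG_NAME).write_text(json.dumps(flags, sort_keys=True))


def check_eval_flags(model_dir, flags):
    """Return (name, trained_value, eval_value) for every recorded flag that differs."""
    path = Path(model_dir) / CONFIG_NAME
    if not path.exists():
        return []  # nothing was recorded, so there is nothing to compare
    trained = json.loads(path.read_text())
    return [
        (name, value, flags.get(name))
        for name, value in sorted(trained.items())
        if flags.get(name) != value
    ]
```

Under these assumptions, a training script would call save_train_flags once with something like {"fullObs": True, "env": "MiniGrid-MultiRoom-N3r-v0"}, and a visualization script would call check_eval_flags with its own flags and print a warning for each returned entry before loading the model.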