The agent learns nothing in the Grid-world environment

liuqi8827 commented 4 years ago

Hi, thanks for your great work. I'm having a problem with the agent's performance when I visualize it in the Grid-world environment. Training completes successfully. However, when I run visualize.py, the agent appears to have learned nothing: it chooses its actions randomly and never succeeds.

1. I ran python -m scripts.train --frames 100000000 --algo ppo --env MiniGrid-MultiRoom-N3r-v0 --model N3r-vib1e6 --save-interval 100 --tb --fullObs --model_type default2 --use_bottleneck --beta 0.000001 and got the model parameters in the folder /storage/N3r-vib1e6.
2. I ran python -m scripts.train --frames 100000000 --algo ppo --env MiniGrid-MultiRoom-N3r-v0 --model N3r-vibS1e6 --save-interval 100 --tb --fullObs --model_type default2 --use_bottleneck --beta 0.000001 --sni_type vib and got the model parameters in the folder /storage/N3r-vibS1e6.
3. I ran plots.py and got a plot of the returns.

The plot from step 3 shows the return increasing monotonically. However, when I run visualize.py for N3r-vib1e6 and N3r-vibS1e6 respectively, the agent appears to have learned nothing in the Grid-world environment. It chooses its actions randomly and never succeeds.

Can you give some suggestions?

Thanks a lot!

maximilianigl commented 4 years ago

Hey,

thank you for the detailed description! Just to check first, are you passing --fullObs to visualize.py as well?

liuqi8827 commented 3 years ago

Thanks for your quick reply.

You are right. I did not pass --fullObs to visualize.py. When I run python3 -m scripts.visualize --env MiniGrid-MultiRoom-N3r-v0 --model N3r-vib1e6 --fullObs, it works well.
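For anyone hitting the same mismatch, here is a minimal sketch of a check that compares the flags of the training and visualization commands. The helper names flags_in and missing_eval_flags are hypothetical and not part of this repository. The set of flags that must match is also an assumption: only --fullObs is confirmed by this thread. The bad_eval command is the working command above with --fullObs removed, which is my reconstruction of the failing call.

```python
import shlex


def flags_in(command):
    """Return the set of `--flag` tokens found in a shell command string."""
    return {tok for tok in shlex.split(command) if tok.startswith("--")}


def missing_eval_flags(train_cmd, eval_cmd, must_match=("--fullObs",)):
    """List flags from `must_match` that were used in training but not in evaluation.

    `must_match` is an assumption: only --fullObs is known from this thread
    to be required in both commands.
    """
    train_flags = flags_in(train_cmd)
    eval_flags = flags_in(eval_cmd)
    return sorted(f for f in must_match if f in train_flags and f not in eval_flags)


# Training command from the first post.
train_cmd = (
    "python -m scripts.train --frames 100000000 --algo ppo "
    "--env MiniGrid-MultiRoom-N3r-v0 --model N3r-vib1e6 --save-interval 100 "
    "--tb --fullObs --model_type default2 --use_bottleneck --beta 0.000001"
)

# Reconstructed failing call (assumed): the working command without --fullObs.
bad_eval = "python3 -m scripts.visualize --env MiniGrid-MultiRoom-N3r-v0 --model N3r-vib1e6"
good_eval = bad_eval + " --fullObs"

print(missing_eval_flags(train_cmd, bad_eval))   # ['--fullObs']
print(missing_eval_flags(train_cmd, good_eval))  # []
```

The check only looks at flag names, not their values, so it would not catch a different --env or --model between the two commands.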

Thanks a lot!

maximilianigl commented 3 years ago

Great! I'm closing this, but let me know if you have any other questions!

microsoft / IBAC-SNI

The agent learns nothing in the Grid-world environment #9