Updates: Support the latest Atari environment and state entropy maximization-based exploration.

ikostrikov / pytorch-a2c-ppo-acktr-gail

PyTorch implementation of Advantage Actor Critic (A2C), Proximal Policy Optimization (PPO), Scalable trust-region method for deep reinforcement learning using Kronecker-factored approximation (ACKTR) and Generative Adversarial Imitation Learning (GAIL).

MIT License

3.53k stars, 832 forks

Update for supporting the latest Atari environment
Tested using the following dependencies:
stable-baselines3==1.5.0
gym==0.21.0
ale-py==0.7.4
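
To compare a local environment against the tested versions listed above, a small check like the following may help. This is only a sketch: the helper name `find_mismatches` and the `TESTED_PINS` dictionary are hypothetical and not part of this repository; only the three version pins come from the note above.

```python
from importlib import metadata

# Version pins copied from the "Tested using" note above.
TESTED_PINS = {
    "stable-baselines3": "1.5.0",
    "gym": "0.21.0",
    "ale-py": "0.7.4",
}


def find_mismatches(pins, get_version=metadata.version):
    """Return {package: (expected, found)} for every pin that is not met.

    `found` is None when the package is not installed. `get_version` is
    injectable so the check can be exercised without touching the real
    environment (hypothetical helper, not part of the repository).
    """
    mismatches = {}
    for name, expected in pins.items():
        try:
            found = get_version(name)
        except metadata.PackageNotFoundError:
            found = None
        if found != expected:
            mismatches[name] = (expected, found)
    return mismatches


if __name__ == "__main__":
    for name, (expected, found) in find_mismatches(TESTED_PINS).items():
        print(f"{name}: expected {expected}, found {found or 'not installed'}")
```

Running the script prints one line per package whose installed version differs from the tested one, and prints nothing when all three match.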
Update for supporting state entropy maximization-based exploration
Intrinsic rewards can improve exploration in complex environments with high-dimensional observations, so I added a module implementing "State entropy maximization with random encoders for efficient exploration" (RE3). RE3 computes its intrinsic reward from a fixed, randomly initialized encoder and requires no auxiliary models to be trained, so it adds little computational overhead. Use --use--sem to invoke it!
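
The idea behind an RE3-style intrinsic reward can be sketched in a few lines. This is an illustrative NumPy version under simplifying assumptions, not the module added in this pull request: it uses a fixed random linear projection in place of a random convolutional encoder, and the names `RandomEncoder` and `knn_intrinsic_reward` are hypothetical. Each state is embedded by the frozen encoder, and its reward is `log(d_k + 1)`, where `d_k` is the distance to its k-th nearest neighbor in the batch of embeddings.

```python
import numpy as np


class RandomEncoder:
    """Fixed, randomly initialized linear encoder (never trained).

    Hypothetical stand-in for the random convolutional encoder used by RE3.
    """

    def __init__(self, obs_dim, latent_dim=32, seed=0):
        rng = np.random.default_rng(seed)
        # Weights are drawn once and then kept frozen.
        self.weight = rng.standard_normal((obs_dim, latent_dim)) / np.sqrt(obs_dim)

    def __call__(self, obs):
        # Flatten each observation, then project it into the latent space.
        obs = np.asarray(obs, dtype=np.float64)
        flat = obs.reshape(obs.shape[0], -1)
        return flat @ self.weight


def knn_intrinsic_reward(embeddings, k=3):
    """Return log(d_k + 1) per row, with d_k the k-th nearest-neighbor distance."""
    embeddings = np.asarray(embeddings, dtype=np.float64)
    n = embeddings.shape[0]
    if not 1 <= k < n:
        raise ValueError("k must satisfy 1 <= k < number of embeddings")
    # Pairwise Euclidean distances, shape (n, n).
    diffs = embeddings[:, None, :] - embeddings[None, :, :]
    dists = np.linalg.norm(diffs, axis=-1)
    # After sorting each row, column 0 is the zero self-distance,
    # so column k holds the distance to the k-th nearest neighbor.
    kth = np.sort(dists, axis=1)[:, k]
    return np.log(kth + 1.0)


if __name__ == "__main__":
    rng = np.random.default_rng(1)
    observations = rng.standard_normal((16, 4, 4))  # 16 toy "frames"
    encoder = RandomEncoder(obs_dim=16)
    rewards = knn_intrinsic_reward(encoder(observations), k=3)
    print(rewards.shape)
```

States that sit far from their neighbors in the embedding space receive larger rewards, which pushes the agent toward rarely visited states. The pairwise distance matrix is quadratic in the batch size, so a practical implementation would likely compute it on the GPU or in chunks.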


Updates: Support the latest Atari environment and state entropy maximization-based exploration. #296