vietnh1009 / Super-mario-bros-PPO-pytorch

Proximal Policy Optimization (PPO) algorithm for Super Mario Bros
MIT License
1.09k stars 206 forks

training from scratch? #11

Closed: samiulextreem closed this issue 4 years ago

samiulextreem commented 4 years ago

I want to train the model for world 1-1 from scratch. How many network updates are needed to get the result shown here?

samiulextreem commented 4 years ago

Well, it seems the model completes the stage after around 900 episode updates.