uvipen / Super-mario-bros-PPO-pytorch

Proximal Policy Optimization (PPO) algorithm for Super Mario Bros
MIT License

training from scratch? #11

Closed: samiulextreem closed this issue 3 years ago

samiulextreem commented 3 years ago

I want to train the model for world 1-1 from scratch. How many network updates are needed to get the result shown here?

samiulextreem commented 3 years ago

Well, it seems the model completes the stage at around 900 episode updates.
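
For anyone who wants to measure this on their own run, here is a minimal sketch of one way to find the first episode in which the stage was cleared. It assumes you collect the final `info` dict of each episode and that it carries a boolean `flag_get` entry, as the gym-super-mario-bros environment reports; the helper name `first_completed_episode` and the `history` list are hypothetical and not part of this repository.

```python
from typing import Iterable, Mapping, Optional


def first_completed_episode(
    episode_infos: Iterable[Mapping[str, object]],
) -> Optional[int]:
    """Return the 1-based index of the first episode that reached the flag.

    Each item is assumed to be the last `info` dict of one episode, with a
    boolean `flag_get` key (an assumption about the environment's output).
    Returns None if no episode completed the stage.
    """
    for episode, info in enumerate(episode_infos, start=1):
        if info.get("flag_get", False):
            return episode
    return None


# Hypothetical log of per-episode final info dicts, for illustration only.
history = [{"flag_get": False}, {"flag_get": False}, {"flag_get": True}]
print(first_completed_episode(history))
```

The number you get will vary with the random seed and hyperparameters, so treat any single run as a rough indication only.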