ikostrikov / pytorch-trpo

PyTorch implementation of Trust Region Policy Optimization
MIT License
433 stars 91 forks

The step of t is not necessary in main.py #14

Open LeonardPatrick opened 6 years ago

LeonardPatrick commented 6 years ago

In your main.py, line 147:

for t in range(10000): # Don't infinite loop while learning

In practice, t never gets past 50, because the env is done after 50 steps. So range(10000) is far larger than it needs to be.
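To illustrate the point, here is a minimal sketch of a capped episode loop. It assumes the loop in main.py breaks as soon as the environment reports done, which the observation above implies but which is not shown here. `ToyEnv` and `run_episode` are hypothetical names made up for this example, not code from the repository.

```python
class ToyEnv:
    """Hypothetical stand-in environment that reports done after `horizon` steps."""

    def __init__(self, horizon=50):
        self.horizon = horizon
        self.steps = 0

    def reset(self):
        self.steps = 0
        return 0.0  # dummy observation

    def step(self, action):
        self.steps += 1
        done = self.steps >= self.horizon
        return 0.0, 1.0, done, {}  # observation, reward, done, info


def run_episode(env, max_steps=10000):
    """Run one episode and return how many steps were actually taken."""
    env.reset()
    steps_taken = 0
    for t in range(max_steps):  # upper bound only; the break below usually ends the loop
        _, _, done, _ = env.step(0)
        steps_taken = t + 1
        if done:
            break
    return steps_taken


# With a 50-step environment, the size of the cap does not change the result.
steps_with_large_cap = run_episode(ToyEnv(horizon=50), max_steps=10000)
steps_with_tight_cap = run_episode(ToyEnv(horizon=50), max_steps=50)
print(steps_with_large_cap, steps_with_tight_cap)
```

Under these assumptions, both calls run the same number of steps. The large bound costs nothing, because `range` is lazy in Python 3, and it only matters for an environment that never reports done.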