reinforcement-learning-kr / pg_travel

Policy Gradient algorithms (REINFORCE, NPG, TRPO, PPO)
MIT License

action = get_action(mu, std)[0]? #19

Open xiaoyuanzh opened 2 years ago

xiaoyuanzh commented 2 years ago

In main.py, line 93 reads action = get_action(mu, std)[0], which appears to make action just a scalar. Is that a problem?
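To make the question concrete, here is a minimal sketch of what the [0] index would do under one assumption: that get_action returns a batch of sampled actions with shape (1, action_dim), because a single state was wrapped in a batch of size 1 before being passed to the actor. The get_action below is a hypothetical stand-in written with plain Python lists, not the repository's implementation, and the values of mu and std are made up for illustration. If that assumption holds, [0] only removes the batch axis, so action is a vector of length action_dim rather than a scalar.

```python
import random


def get_action(mu, std):
    """Hypothetical stand-in for the repository's get_action.

    Assumes mu and std are nested lists of shape (batch, action_dim)
    and samples one Gaussian value per action dimension.
    """
    return [
        [random.gauss(m, s) for m, s in zip(mu_row, std_row)]
        for mu_row, std_row in zip(mu, std)
    ]


# Made-up values: one state in a batch of size 1, with a 3-dimensional action.
mu = [[0.0, 0.5, -0.5]]
std = [[1.0, 1.0, 1.0]]

batched = get_action(mu, std)  # shape (1, 3): batch axis first
action = batched[0]            # shape (3,): batch axis removed

print(len(batched), len(batched[0]))
print(len(action))
```

Under the same assumption, an environment with a one-dimensional action space would give a one-element vector here, which is still not a bare scalar. Whether this matches main.py depends on the shapes the actor actually returns, which the question does not show.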