action = get_action(mu, std)[0]?

reinforcement-learning-kr / pg_travel

Policy Gradient algorithms (REINFORCE, NPG, TRPO, PPO)



Open xiaoyuanzh opened 2 years ago

xiaoyuanzh commented 2 years ago

In main.py, line 93 reads action = get_action(mu, std)[0], so action appears to be just a scalar. Is that a problem?
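
Whether the [0] yields a scalar depends on the shape of mu and std. The sketch below is not the repository's code: get_action here is a hypothetical NumPy stand-in that samples from a normal distribution, and the (1, action_dim) batch shape and action_dim = 3 are assumptions made for illustration. Under those assumptions, [0] only strips the batch dimension and leaves a 1-D action vector; a scalar results only when mu has no batch dimension.

```python
import numpy as np


def get_action(mu, std, rng):
    # Hypothetical stand-in for the repository's get_action:
    # draws one sample per element of mu from Normal(mu, std).
    return rng.normal(mu, std)


rng = np.random.default_rng(0)
action_dim = 3  # assumed action dimension, for illustration only

# Assumed case 1: mu and std carry a leading batch dimension of size 1.
mu = np.zeros((1, action_dim))
std = np.ones((1, action_dim))
batched = get_action(mu, std, rng)
action = batched[0]  # drops the batch dimension only

print(batched.shape)  # (1, 3)
print(action.shape)   # (3,)

# Assumed case 2: mu and std have no batch dimension.
flat = get_action(np.zeros(action_dim), np.ones(action_dim), rng)
first = flat[0]  # now a single number, the first action component

print(np.ndim(first))  # 0
```

Printing mu.shape and action.shape just before and after line 93 would show which of the two cases applies in main.py.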