NaN values using PPO training methods

ScheiklP / sofa_zoo

Reinforcement learning scripts for sofa_env environments.

MIT License

NaN values using PPO training methods #13

Closed. TiagomcMoreira closed this issue 1 month ago

TiagomcMoreira commented 1 month ago

Hey, I'm quite new to coding and SB3, so this might be something simple. I'm using PPO to train my custom environment with a learning rate that depends on the training progress passed in by the PPO model ( def lr(progress):(...) and then model = PPO(learning_rate = lr) ), and some NaN values occurred. After that, I added some prints inside the function to check the progress variable and noticed that it went negative, whereas it should stay between 0 and 1. Does anyone know anything about this?
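For illustration, here is a minimal sketch of a schedule that guards against the behaviour described above. It assumes, as the Stable-Baselines3 documentation describes, that the callable passed as learning_rate receives a single float for the remaining training progress (starting at 1 and decreasing toward 0). Why the value drops below 0 is not established in this thread; one possible explanation is that the step counter overshoots the requested total during the last rollout. The helper name clamped_linear_schedule and the values 3e-4 and min_lr are hypothetical choices for this example, not taken from the original post.

```python
from typing import Callable


def clamped_linear_schedule(initial_lr: float, min_lr: float = 0.0) -> Callable[[float], float]:
    """Build a linear learning-rate schedule that tolerates out-of-range progress.

    Hypothetical helper for illustration: the returned function maps the
    remaining progress (expected in [0, 1]) to a learning rate between
    min_lr and initial_lr.
    """

    def schedule(progress_remaining: float) -> float:
        # Clamp first, so a slightly negative (or > 1) value can never
        # produce a negative learning rate or break later math such as
        # sqrt/log inside a custom schedule.
        p = min(max(progress_remaining, 0.0), 1.0)
        return min_lr + p * (initial_lr - min_lr)

    return schedule


lr = clamped_linear_schedule(3e-4)

# Assumed usage with Stable-Baselines3 (not executed here):
# model = PPO("MlpPolicy", env, learning_rate=lr)
```

With this guard the learning rate stays non-negative even if the progress value dips below 0. If the NaN values persist, the schedule is probably not the only cause, and the observations, rewards, and actions of the custom environment would be worth checking as well.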

ScheiklP commented 1 month ago

Hi @TiagomcMoreira, since this question is about a more general topic and does not relate to sofa_zoo, could you post it over at https://github.com/DLR-RM/stable-baselines3?

Cheers, Paul