Update for tacsl release: CNN tower processing, critic weights loading and freezing.

Denys88 / rl_games

RL implementations

MIT License


Closed by iakinola23 1 month ago

iakinola23 commented 2 months ago

Enable loading the critic network's weights from a PPO checkpoint without loading the actor weights.
Add a flag to freeze the critic while training the actor.
Add the ability to post-process the output of a conv tower with either a spatial soft-argmax layer or a flatten layer.
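
For readers unfamiliar with these three pieces, here is a minimal PyTorch sketch of the general ideas. It is not the code from this PR. The toy `ActorCritic` model, the `critic.` key prefix, and the helper names `load_critic_only` and `freeze` are all hypothetical, and the parameter names inside a real rl_games checkpoint may differ. The spatial soft-argmax shown is the common formulation (softmax over pixel locations, then the expected x and y coordinate per channel), assumed here rather than taken from the PR.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class SpatialSoftArgmax(nn.Module):
    """Maps (B, C, H, W) feature maps to (B, 2*C) expected (x, y) keypoints."""

    def forward(self, features: torch.Tensor) -> torch.Tensor:
        b, c, h, w = features.shape
        # Softmax over all pixel locations of each channel.
        probs = F.softmax(features.reshape(b, c, h * w), dim=-1).reshape(b, c, h, w)
        # Pixel coordinates normalized to [-1, 1].
        xs = torch.linspace(-1.0, 1.0, w, device=features.device, dtype=features.dtype)
        ys = torch.linspace(-1.0, 1.0, h, device=features.device, dtype=features.dtype)
        # Expected coordinate under each channel's distribution.
        exp_x = (probs.sum(dim=2) * xs).sum(dim=-1)  # marginal over rows -> (B, C)
        exp_y = (probs.sum(dim=3) * ys).sum(dim=-1)  # marginal over cols -> (B, C)
        return torch.cat([exp_x, exp_y], dim=-1)


class ActorCritic(nn.Module):
    """Hypothetical toy model with separate actor and critic heads."""

    def __init__(self, obs_dim: int = 4, act_dim: int = 2):
        super().__init__()
        self.actor = nn.Linear(obs_dim, act_dim)
        self.critic = nn.Linear(obs_dim, 1)


def load_critic_only(model: nn.Module, state_dict: dict, prefix: str = "critic.") -> list:
    """Copy only entries whose key starts with `prefix` (an assumed naming scheme)."""
    critic_state = {k: v for k, v in state_dict.items() if k.startswith(prefix)}
    # strict=False tolerates the actor keys that are deliberately left out.
    model.load_state_dict(critic_state, strict=False)
    return sorted(critic_state)


def freeze(module: nn.Module) -> None:
    """Stop gradients from updating this module's parameters."""
    for p in module.parameters():
        p.requires_grad_(False)


torch.manual_seed(0)
pretrained, model = ActorCritic(), ActorCritic()
actor_before = model.actor.weight.clone()

loaded = load_critic_only(model, pretrained.state_dict())
freeze(model.critic)

# A flat feature map has no preferred location, so keypoints sit at the center.
keypoints = SpatialSoftArgmax()(torch.zeros(1, 3, 5, 7))
```

With a frozen critic, it is usually sensible to build the optimizer only from parameters that still have `requires_grad=True`, so that optimizer state is not allocated for weights that never change.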