Dragon-Zhuang / BPPO

Author's PyTorch implementation of the ICLR 2023 paper Behavior Proximal Policy Optimization (BPPO).
MIT License