salesforce / warp-drive

Extremely Fast End-to-End Deep Multi-Agent Reinforcement Learning Framework on a GPU (JMLR 2022)
BSD 3-Clause "New" or "Revised" License

Qlearner #73

Closed. Emerald01 closed this 1 year ago

Emerald01 commented 1 year ago
  1. Restructure the algorithm folder so that the current algorithms are placed in the policy gradient directory.
  2. Add a use_argmax flag for inference replay.
  3. Add Xavier initialization.
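
For readers unfamiliar with items 2 and 3, here is a minimal, framework-free sketch of what they usually mean. It is not the code from this PR: the function names `xavier_uniform` and `select_action`, their signatures, and the `rng` parameter are hypothetical, and the assumption is that use_argmax switches action selection from sampling to picking the most probable action. In PyTorch, the initialization would normally be done with `torch.nn.init.xavier_uniform_` instead.

```python
import math
import random


def xavier_uniform(fan_in, fan_out, gain=1.0, rng=None):
    """Return a fan_out x fan_in weight matrix with entries drawn from U(-a, a),
    where a = gain * sqrt(6 / (fan_in + fan_out)) (Xavier/Glorot uniform)."""
    rng = rng or random.Random()
    bound = gain * math.sqrt(6.0 / (fan_in + fan_out))
    return [
        [rng.uniform(-bound, bound) for _ in range(fan_in)]
        for _ in range(fan_out)
    ]


def select_action(probs, use_argmax=False, rng=None):
    """Pick an action index from a vector of action probabilities.

    use_argmax=True  -> deterministic: index of the largest probability
                        (assumed behavior of the flag during inference replay).
    use_argmax=False -> stochastic: sample an index in proportion to probs.
    """
    if use_argmax:
        return max(range(len(probs)), key=probs.__getitem__)
    rng = rng or random.Random()
    return rng.choices(range(len(probs)), weights=probs, k=1)[0]


if __name__ == "__main__":
    weights = xavier_uniform(fan_in=4, fan_out=3, rng=random.Random(0))
    print(len(weights), len(weights[0]))  # matrix shape: rows, columns
    print(select_action([0.1, 0.7, 0.2], use_argmax=True))
```

With use_argmax enabled, replaying a trained policy gives the same action for the same probabilities every time, which makes rollouts reproducible; sampling keeps the exploration behavior used during training.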