waffoo / accel

accelerate reinforcement learning
MIT License

Features/sac #3

Closed waffoo closed 4 years ago

waffoo commented 4 years ago

SAC (Soft Actor-Critic) is implemented. The replay buffer is also modified. The done flag is replaced with a next_valid flag.
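For readers unfamiliar with the idea, here is a minimal sketch of what storing a next_valid flag instead of a done flag could look like. It is not the code from this PR: the `Transition`, `ReplayBuffer`, and `td_target` names are hypothetical, and it assumes that next_valid is 1.0 when the next state may be bootstrapped from and 0.0 otherwise. In SAC the `next_value` term would typically be the entropy-regularized target Q-value; here it is just a number passed in.

```python
import random
from collections import deque
from typing import List, NamedTuple


class Transition(NamedTuple):
    # Hypothetical record layout; the real buffer in this repo may differ.
    state: float
    action: int
    reward: float
    next_state: float
    next_valid: float  # assumption: 1.0 if next_state is usable for bootstrapping, else 0.0


class ReplayBuffer:
    """Fixed-capacity FIFO buffer that stores next_valid rather than done."""

    def __init__(self, capacity: int) -> None:
        # deque with maxlen drops the oldest transition once capacity is reached
        self.storage = deque(maxlen=capacity)

    def push(self, state, action, reward, next_state, next_valid) -> None:
        self.storage.append(
            Transition(state, action, reward, next_state, float(next_valid))
        )

    def sample(self, batch_size: int) -> List[Transition]:
        # Uniform sampling without replacement
        return random.sample(list(self.storage), batch_size)

    def __len__(self) -> int:
        return len(self.storage)


def td_target(reward: float, next_value: float, next_valid: float,
              gamma: float = 0.99) -> float:
    # The flag multiplies the bootstrap term directly, so no (1 - done) is needed.
    return reward + gamma * next_valid * next_value
```

One possible motivation for such a flag, stated here as an assumption rather than as the author's reasoning, is that it lets the environment loop decide whether an episode end is a true termination (next_valid = 0) or only a time-limit cutoff (next_valid = 1), while the update rule stays unchanged.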