keiohta / tf2rl

TensorFlow2 Reinforcement Learning
MIT License
465 stars 103 forks

Wrapping For Absorbing State in case of off-policy GAIL/GAIfO/VAIL (DAC) #141

Closed HesNobi closed 3 years ago

HesNobi commented 3 years ago

Hi, according to the Discriminator-Actor-Critic (DAC) paper, making use of off-policy RL (SAC) requires that absorbing states be processed and rewarded appropriately.
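For concreteness, here is a minimal sketch of the kind of wrapping meant here. It is not tf2rl code: the helper names `augment` and `wrap_transition` are hypothetical, and plain Python lists are used only to keep the example dependency-free. It assumes the DAC convention of appending an indicator dimension to each observation (0 for regular states, 1 for the absorbing state), and assumes `terminal` means a true environment termination rather than a time-limit cutoff.

```python
def augment(obs, absorbing=False):
    """Append an indicator: 0.0 for a regular observation, 1.0 for the absorbing state.

    The absorbing state itself is all zeros with the indicator set (assumed convention).
    """
    if absorbing:
        return [0.0] * len(obs) + [1.0]
    return list(obs) + [0.0]


def wrap_transition(obs, act, next_obs, terminal):
    """Return the (obs, act, next_obs, done) tuples to store in the replay buffer.

    On a true termination, the final step is redirected to the absorbing state
    and an absorbing-to-absorbing self-loop is added. `done` stays 0.0 so the
    critic keeps bootstrapping through the absorbing state.
    """
    if not terminal:
        return [(augment(obs), list(act), augment(next_obs), 0.0)]

    absorbing = augment(obs, absorbing=True)
    zero_act = [0.0] * len(act)  # placeholder action taken in the absorbing state
    return [
        (augment(obs), list(act), absorbing, 0.0),
        (absorbing, zero_act, absorbing, 0.0),
    ]


# Example: a terminal step yields two transitions instead of one.
transitions = wrap_transition([1.0, 2.0], [0.5], [3.0, 4.0], terminal=True)
```

With this layout the discriminator sees the absorbing state like any other input, so its reward can be learned instead of being implicitly fixed at zero.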

I am wondering if you would address this issue.

Thanks for the open research and the code.

Related to: #127
