SlateQ agent implementation

facebookresearch / ReAgent

A platform for Reasoning systems (Reinforcement Learning, Contextual Bandits, etc.)

https://reagent.ai

BSD 3-Clause "New" or "Revised" License


Open rahul-zomato opened 2 years ago

rahul-zomato commented 2 years ago

The SlateQ agent implemented by the SlateQ paper authors in RecSim uses the state instead of the next state from the replay buffer to compute next_q_values: https://github.com/google-research/recsim/issues/26
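
To illustrate the kind of discrepancy described above, here is a minimal sketch of a generic one-step Q-learning target. It is not taken from RecSim or ReAgent, and it leaves out SlateQ's slate decomposition entirely. The toy Q-table `Q`, the helper `td_target`, the `transition` dictionary, and the discount `gamma=0.5` are all hypothetical names and values chosen for the example.

```python
from typing import Callable, Sequence


def td_target(
    q_fn: Callable[[str], Sequence[float]],
    reward: float,
    bootstrap_state: str,
    gamma: float = 0.5,
    terminal: bool = False,
) -> float:
    """One-step Q-learning target: r + gamma * max_a Q(bootstrap_state, a)."""
    if terminal:
        # No bootstrapping past the end of an episode.
        return reward
    return reward + gamma * max(q_fn(bootstrap_state))


# Hypothetical Q-table: one list of action values per state.
Q = {"s0": [1.0, 2.0], "s1": [5.0, 3.0]}
q_fn = Q.__getitem__

# A single transition as it might be sampled from a replay buffer.
transition = {"state": "s0", "action": 1, "reward": 1.0, "next_state": "s1"}

# Expected usage: bootstrap from the NEXT state.
target_next_state = td_target(q_fn, transition["reward"], transition["next_state"])

# The pattern reported in the issue: bootstrapping from the CURRENT state.
target_same_state = td_target(q_fn, transition["reward"], transition["state"])

print(target_next_state, target_same_state)
```

In this toy setup the two targets differ (3.5 versus 2.0), so a mix-up between `state` and `next_state` changes what the Q-network is trained towards.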