tensorflow / agents

TF-Agents: A reliable, scalable and easy to use TensorFlow library for Contextual Bandits and Reinforcement Learning.
Apache License 2.0

Using `tf-agents` for Bandits with sparse data #778

Open ujjwal95 opened 1 year ago

ujjwal95 commented 1 year ago

Hi, I am looking to use tf-agents to develop a multi-armed bandit for advertising.

For each observation, I only have the reward for the arm that was actually shown. I don't have rewards for the other arms, because only a single arm is shown per observation.

Is tf-agents able to handle such situations? I went through all the Environments, and all of them seem to assume that rewards are available for every observation-arm combination. The MovieLens example handles sparsity using SVD.

Will I need to use similar methods to estimate the reward for the other arms? Or is there something in tf-agents that I am missing?

ujjwal95 commented 1 year ago

Is tf-agents able to train a bandit where we just provide each observation's features, the arm picked, and the reward?
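
To make the setup concrete, here is a minimal plain-Python sketch of learning from logged (features, arm, reward) triples, where only the reward of the arm that was shown is known. It is not tf-agents code and makes no claim about how tf-agents handles this; the class `LoggedLinearBandit`, its methods, and the toy `logged` data are all hypothetical names invented for illustration. The model is an assumed per-arm linear reward estimate trained with a simple gradient step, with epsilon-greedy arm selection.

```python
import random


class LoggedLinearBandit:
    """Hypothetical per-arm linear reward model (illustration only, not tf-agents).

    Each arm keeps its own weight vector. An update touches only the arm
    that was actually shown, so no reward is needed for the other arms.
    """

    def __init__(self, num_arms, num_features, learning_rate=0.1):
        self.weights = [[0.0] * num_features for _ in range(num_arms)]
        self.learning_rate = learning_rate

    def predict(self, features, arm):
        # Estimated reward of `arm` for this observation.
        return sum(w * x for w, x in zip(self.weights[arm], features))

    def update(self, features, arm, reward):
        # One gradient step on squared error, for the shown arm only.
        error = reward - self.predict(features, arm)
        self.weights[arm] = [
            w + self.learning_rate * error * x
            for w, x in zip(self.weights[arm], features)
        ]

    def choose(self, features, epsilon=0.1, rng=random):
        # Epsilon-greedy: explore with probability epsilon, else exploit.
        if rng.random() < epsilon:
            return rng.randrange(len(self.weights))
        scores = [self.predict(features, a) for a in range(len(self.weights))]
        return scores.index(max(scores))


# Toy logged data (made up): (observation features, arm shown, observed reward).
logged = [
    ([1.0, 0.0], 0, 1.0),
    ([0.0, 1.0], 1, 1.0),
    ([1.0, 0.0], 1, 0.0),
    ([0.0, 1.0], 0, 0.0),
]

bandit = LoggedLinearBandit(num_arms=2, num_features=2)
for _ in range(200):
    for features, arm, reward in logged:
        bandit.update(features, arm, reward)

# Greedy choice (epsilon=0.0) for a new observation.
best_arm = bandit.choose([1.0, 0.0], epsilon=0.0)
```

The point of the sketch is that each training record carries exactly one arm and one reward, and nothing in the update requires rewards for the arms that were not shown.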