tensorflow / agents

TF-Agents: A reliable, scalable and easy to use TensorFlow library for Contextual Bandits and Reinforcement Learning.
Apache License 2.0
2.76k stars, 719 forks

Contextual Bandit Off-Policy Evaluation #791

Open vitorkrasniqi opened 1 year ago

vitorkrasniqi commented 1 year ago

Hi,

I am currently working with "agents/tf_agents/bandits/". I am wondering whether, and if so where, the classic contextual bandit off-policy evaluation procedures are available in TF-Agents.

Specifically, I mean the evaluation procedures that Vowpal Wabbit already uses, which can be found here: https://vowpalwabbit.org/docs/vowpal_wabbit/python/latest/tutorials/python_Contextual_bandits_and_Vowpal_Wabbit.html

Or, even better, the methods found in the Open Bandit Pipeline package: https://github.com/st-tech/zr-obp

Before I start thinking about how to integrate the methods from obp into the TensorFlow environment, I would like to know whether, and where, these methods can be found in TF-Agents.

vitorkrasniqi commented 1 year ago

Off-policy evaluation for contextual bandits is currently not available in TF-Agents.
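For anyone landing here, the classic estimators in question (inverse propensity scoring, its self-normalized variant, and doubly robust) are short enough to write by hand on logged data. The following is only a minimal NumPy sketch, not TF-Agents or obp code: the function names, argument names, and array layouts are all hypothetical, and it assumes you have logged the probability with which the logging policy chose each action.

```python
import numpy as np


def ips_estimate(rewards, logging_probs, target_probs):
    """Inverse propensity scoring (IPS) estimate of the target policy's value.

    rewards:       (n,) observed rewards for the logged actions
    logging_probs: (n,) probability the logging policy gave the logged action
    target_probs:  (n,) probability the target policy gives the logged action
    """
    weights = np.asarray(target_probs, dtype=float) / np.asarray(logging_probs, dtype=float)
    return float(np.mean(weights * np.asarray(rewards, dtype=float)))


def snips_estimate(rewards, logging_probs, target_probs):
    """Self-normalized IPS: divides by the sum of weights instead of n."""
    weights = np.asarray(target_probs, dtype=float) / np.asarray(logging_probs, dtype=float)
    return float(np.sum(weights * np.asarray(rewards, dtype=float)) / np.sum(weights))


def doubly_robust_estimate(rewards, actions, logging_probs,
                           target_action_probs, predicted_rewards):
    """Doubly robust (DR) estimate: direct-method baseline plus an IPS correction.

    actions:             (n,) integer index of the logged action
    target_action_probs: (n, k) target policy probabilities over all k actions
    predicted_rewards:   (n, k) reward model predictions for all k actions
                         (the reward model itself is assumed to be fit elsewhere)
    """
    rewards = np.asarray(rewards, dtype=float)
    actions = np.asarray(actions, dtype=int)
    target_action_probs = np.asarray(target_action_probs, dtype=float)
    predicted_rewards = np.asarray(predicted_rewards, dtype=float)
    rows = np.arange(len(rewards))

    # Direct-method term: expected predicted reward under the target policy.
    direct = np.sum(target_action_probs * predicted_rewards, axis=1)

    # Importance-weighted residual on the action that was actually logged.
    weights = target_action_probs[rows, actions] / np.asarray(logging_probs, dtype=float)
    correction = weights * (rewards - predicted_rewards[rows, actions])

    return float(np.mean(direct + correction))
```

To use this with a TF-Agents bandit policy, you would still need to extract the target policy's action probabilities for each logged context yourself; that step is not shown here. Plain IPS can have high variance when the logging probabilities are small, which is the usual reason to prefer the self-normalized or doubly robust variants.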

SamanthaSHan commented 1 year ago

Did you end up implementing it yourself? I'm curious whether you found any solutions to this.