facebookresearch / ReAgent

A platform for Reasoning systems (Reinforcement Learning, Contextual Bandits, etc.)
https://reagent.ai
BSD 3-Clause "New" or "Revised" License
3.58k stars, 521 forks

add support for variable number of arms to FbContBanditBatchPreprocessor #701

Closed (alexnikulkov closed this 1 year ago)

alexnikulkov commented 1 year ago

Summary: Update the ReAgent transforms and the CB preprocessor to support a variable number of arms.

  1. The main API change: when num_arms=None in FbContBanditBatchPreprocessor, the variable-length version of the code is used. A presence tensor is generated to indicate which arms are present and which are zero-padded.
  2. Add an arm_presence field to CBInput to indicate which arms are present.
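
For illustration only, the zero-padding and presence-mask idea described above might look roughly like the sketch below. It is not the code from this PR: the helper name `pad_arms_with_presence`, its signature, and the use of plain Python lists in place of tensors are all assumptions made to keep the example self-contained.

```python
from typing import List, Tuple

Arm = List[float]


def pad_arms_with_presence(
    batch: List[List[Arm]], feature_dim: int
) -> Tuple[List[List[Arm]], List[List[bool]]]:
    """Hypothetical helper: pad every example to the largest arm count in the batch.

    Returns the zero-padded features (batch x max_arms x feature_dim) and a
    presence mask (batch x max_arms) that is True for real arms and False for padding.
    """
    max_arms = max((len(arms) for arms in batch), default=0)
    padded: List[List[Arm]] = []
    presence: List[List[bool]] = []
    for arms in batch:
        n_missing = max_arms - len(arms)
        # Copy the real arms, then append all-zero feature vectors as padding.
        padded.append(
            [list(arm) for arm in arms]
            + [[0.0] * feature_dim for _ in range(n_missing)]
        )
        # Mark which slots hold real arms and which hold padding.
        presence.append([True] * len(arms) + [False] * n_missing)
    return padded, presence


# Two examples: the first has 3 arms, the second only 1.
batch = [
    [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]],
    [[7.0, 8.0]],
]
padded, presence = pad_arms_with_presence(batch, feature_dim=2)
```

A downstream consumer could then use the presence mask to ignore padded slots, for example by excluding them when picking the best-scoring arm.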

Differential Revision: D41989361

facebook-github-bot commented 1 year ago

This pull request was exported from Phabricator. Differential Revision: D41989361

facebook-github-bot commented 1 year ago

This pull request has been merged in facebookresearch/ReAgent@5d95e0d4e4cc24cb0378d5c5cd415f8e7a97acd5.