VowpalWabbit / coba

Contextual bandit benchmarking
https://coba-docs.readthedocs.io/
BSD 3-Clause "New" or "Revised" License

New Synthetic Simulations, Environment Filters and Learner Improvements #15

Closed mrucker closed 2 years ago

mrucker commented 2 years ago

This pull request contains two new Synthetic Environments:

  1. KernelSyntheticEnvironment -- Creates a simulated environment whose reward functions are built from a kernel basis.
  2. MLPSyntheticEnvironment -- Creates a simulated environment whose reward function is a three-layer sigmoid MLP.
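
To make the MLP idea concrete, here is a minimal sketch of a reward function computed by a small sigmoid network with random weights. It is not coba's implementation: `make_mlp_reward`, its parameters, and the reading of "three layer" as input, one hidden layer, and output are all assumptions made for illustration.

```python
import math
import random

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def make_mlp_reward(n_features, n_hidden=8, seed=1):
    """Return a reward function backed by a random sigmoid MLP.

    Hypothetical helper for illustration only; not part of coba's API.
    """
    rng = random.Random(seed)
    # Hidden layer: one weight vector per hidden unit.
    w_hidden = [[rng.gauss(0, 1) for _ in range(n_features)] for _ in range(n_hidden)]
    # Output layer: one weight per hidden unit.
    w_out = [rng.gauss(0, 1) for _ in range(n_hidden)]

    def reward(context, action):
        # The network input is the context features followed by the action features.
        x = list(context) + list(action)
        if len(x) != n_features:
            raise ValueError("expected %d features, got %d" % (n_features, len(x)))
        hidden = [sigmoid(sum(w * xi for w, xi in zip(ws, x))) for ws in w_hidden]
        # The final sigmoid keeps every reward strictly between 0 and 1.
        return sigmoid(sum(w * h for w, h in zip(w_out, hidden)))

    return reward

reward = make_mlp_reward(n_features=4)
r = reward([0.2, 0.7], [1.0, 0.0])
```

Fixing the seed makes the simulated reward function reproducible, which matters when the same environment is replayed against several learners.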

This pull request also updates an existing Synthetic Environment:

  1. LinearSyntheticEnvironment -- The distribution of context and action features has been simplified. Previously, these distributions were carefully constructed to guarantee the desired distribution of reward values. Now, reward values are shifted and scaled using Monte Carlo estimates to achieve the desired distribution.
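
The shift-and-scale step can be sketched as follows. This is an illustration under stated assumptions, not the code in this pull request: the target range of [0, 1], the use of a sampled minimum and maximum as the Monte Carlo estimates, and the names `mc_normalize`, `raw_reward`, and `sample_input` are all hypothetical.

```python
import random

def mc_normalize(raw_reward, sample_input, n_samples=1000, seed=1):
    """Wrap raw_reward so that its outputs fall roughly in [0, 1].

    The minimum and maximum are Monte Carlo estimates taken from random
    inputs, so an unseen input can still land slightly outside [0, 1].
    """
    rng = random.Random(seed)
    samples = [raw_reward(*sample_input(rng)) for _ in range(n_samples)]
    lo, hi = min(samples), max(samples)
    scale = (hi - lo) or 1.0  # avoid dividing by zero for a constant reward

    def reward(context, action):
        return (raw_reward(context, action) - lo) / scale

    return reward

def raw_reward(context, action):
    # A simple linear reward over context-action feature products (assumed form).
    weights = [3.0, -2.0, 5.0]
    return sum(w * c * a for w, c, a in zip(weights, context, action))

def sample_input(rng):
    # Features drawn from a plain uniform distribution, with no careful construction.
    context = [rng.random() for _ in range(3)]
    action = [rng.random() for _ in range(3)]
    return context, action

reward = mc_normalize(raw_reward, sample_input)
```

The trade-off is that feature distributions no longer need to be engineered by hand, at the cost of bounds that are only approximate because they come from a finite sample.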

In addition to the Environment changes, two new EnvironmentFilters were added: Noise and Riffle. These filters support more sophisticated experiments aimed at understanding Learner performance and robustness in adverse scenarios.

Finally, moving away from Environments, this pull request also contains several improvements to Learners. A number of bugs in LinUCBLearner were fixed, and its default parameterization was improved. A default parameterization was added to EpsilonBanditLearner. The parameter options for the VW learners were simplified in the hope of making VW more accessible to new users.

codecov[bot] commented 2 years ago

Codecov Report

Merging #15 (4f679fb) into master (207d6d4) will increase coverage by 0.00%. The diff coverage is 100.00%.

@@           Coverage Diff            @@
##           master      #15    +/-   ##
========================================
  Coverage   99.82%   99.83%            
========================================
  Files          50       49     -1     
  Lines        4092     4239   +147     
========================================
+ Hits         4085     4232   +147     
  Misses          7        7            
Flag       Coverage Δ
unittest   99.83% <100.00%> (+<0.01%) ↑

Flags with carried forward coverage won't be shown.

Impacted Files                              Coverage Δ
coba/encodings.py                           100.00% <100.00%> (ø)
coba/environments/__init__.py               100.00% <100.00%> (ø)
coba/environments/core.py                   100.00% <100.00%> (ø)
coba/environments/filters.py                100.00% <100.00%> (ø)
coba/environments/simulated/synthetics.py   100.00% <100.00%> (ø)
coba/learners/bandit.py                     100.00% <100.00%> (ø)
coba/learners/linucb.py                     100.00% <100.00%> (ø)
coba/learners/vowpal.py                     100.00% <100.00%> (ø)

Legend: Δ = absolute <relative> (impact), ø = not affected, ? = missing data. Last update 207d6d4...4f679fb.