facebookresearch / ReAgent

A platform for Reasoning systems (Reinforcement Learning, Contextual Bandits, etc.)
https://reagent.ai
BSD 3-Clause "New" or "Revised" License

Add CB Offline Evaluation to ReAgent #695

Status: Closed (alexnikulkov closed this 1 year ago)

alexnikulkov commented 2 years ago

Summary: Add Offline Evaluation for non-stationary Contextual Bandit policies. This diff includes only the Policy Evaluator algorithms from the LinUCB paper: https://arxiv.org/pdf/1003.0146.pdf (Algorithm 3)

Differential Revision: D41226450
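
For readers unfamiliar with the evaluator referenced in the summary, the replay idea behind Algorithm 3 of the LinUCB paper can be sketched as follows. This is a minimal illustration, not the code added in this PR: `replay_evaluate`, the `Event` tuple layout, and the `Policy` signature are hypothetical names chosen for the example, and the sketch assumes the logged actions were chosen uniformly at random. It walks the logged stream, keeps only the events where the evaluated policy picks the same action as the log, and averages the rewards of the kept events. Unlike the paper's version, it stops gracefully if the stream runs out before the requested number of events is accepted.

```python
import random
from typing import Callable, Iterable, List, Tuple

# One logged event: (context, logged_action, observed_reward).
Event = Tuple[List[float], int, float]
History = List[Event]
# Hypothetical policy signature: (accepted history so far, context) -> action.
Policy = Callable[[History, List[float]], int]


def replay_evaluate(
    policy: Policy, events: Iterable[Event], max_accepted: int
) -> Tuple[float, int]:
    """Replay-style offline evaluation via rejection sampling on logged events.

    Returns (average reward over accepted events, number of accepted events).
    Assumes the logging policy chose actions uniformly at random.
    """
    history: History = []
    total_reward = 0.0
    for context, logged_action, reward in events:
        if len(history) >= max_accepted:
            break
        # Keep the event only if the evaluated policy agrees with the log.
        # The policy sees only previously accepted events, so a learning
        # (non-stationary) policy can update itself from `history`.
        if policy(history, context) == logged_action:
            history.append((context, logged_action, reward))
            total_reward += reward
    accepted = len(history)
    average = total_reward / accepted if accepted else 0.0
    return average, accepted


def always_zero(history: History, context: List[float]) -> int:
    return 0


def always_one(history: History, context: List[float]) -> int:
    return 1


# Synthetic log: two arms chosen uniformly at random; arm 0 always pays 1.0.
rng = random.Random(0)
logged_actions = [rng.randrange(2) for _ in range(1000)]
logged = [([1.0], a, 1.0 if a == 0 else 0.0) for a in logged_actions]

avg_zero, n_zero = replay_evaluate(always_zero, logged, max_accepted=100)
avg_one, n_one = replay_evaluate(always_one, logged, max_accepted=100)
```

In this toy log the policy that always picks arm 0 is scored higher than the one that always picks arm 1, because only events matching each policy's choice contribute to its estimate.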

facebook-github-bot commented 2 years ago

This pull request was exported from Phabricator. Differential Revision: D41226450

codecov-commenter commented 2 years ago

Codecov Report

Base: 87.63% // Head: 87.69% // Increases project coverage by +0.05% :tada:

Coverage data is based on head (c5942b7) compared to base (ff1ff09). Patch coverage: 96.62% of modified lines in pull request are covered.

Additional details and impacted files

```diff
@@            Coverage Diff             @@
##             main     #695      +/-   ##
==========================================
+ Coverage   87.63%   87.69%   +0.05%
==========================================
  Files         365      370       +5
  Lines       23678    23825     +147
  Branches       44       44
==========================================
+ Hits        20751    20894     +143
- Misses       2901     2905       +4
  Partials       26       26
```

| [Impacted Files](https://codecov.io/gh/facebookresearch/ReAgent/pull/695?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=facebookresearch) | Coverage Δ | |
|---|---|---|
| [reagent/evaluation/cb/base\_evaluator.py](https://codecov.io/gh/facebookresearch/ReAgent/pull/695/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=facebookresearch#diff-cmVhZ2VudC9ldmFsdWF0aW9uL2NiL2Jhc2VfZXZhbHVhdG9yLnB5) | `88.00% <88.00%> (ø)` | |
| [...eagent/test/evaluation/cb/test\_policy\_evaluator.py](https://codecov.io/gh/facebookresearch/ReAgent/pull/695/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=facebookresearch#diff-cmVhZ2VudC90ZXN0L2V2YWx1YXRpb24vY2IvdGVzdF9wb2xpY3lfZXZhbHVhdG9yLnB5) | `96.55% <96.55%> (ø)` | |
| [reagent/core/types.py](https://codecov.io/gh/facebookresearch/ReAgent/pull/695/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=facebookresearch#diff-cmVhZ2VudC9jb3JlL3R5cGVzLnB5) | `87.31% <100.00%> (+0.26%)` | :arrow_up: |
| [reagent/evaluation/cb/policy\_evaluator.py](https://codecov.io/gh/facebookresearch/ReAgent/pull/695/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=facebookresearch#diff-cmVhZ2VudC9ldmFsdWF0aW9uL2NiL3BvbGljeV9ldmFsdWF0b3IucHk=) | `100.00% <100.00%> (ø)` | |
| [reagent/evaluation/cb/utils.py](https://codecov.io/gh/facebookresearch/ReAgent/pull/695/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=facebookresearch#diff-cmVhZ2VudC9ldmFsdWF0aW9uL2NiL3V0aWxzLnB5) | `100.00% <100.00%> (ø)` | |
| [reagent/models/linear\_regression.py](https://codecov.io/gh/facebookresearch/ReAgent/pull/695/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=facebookresearch#diff-cmVhZ2VudC9tb2RlbHMvbGluZWFyX3JlZ3Jlc3Npb24ucHk=) | `98.18% <100.00%> (ø)` | |
| [reagent/test/evaluation/cb/test\_utils.py](https://codecov.io/gh/facebookresearch/ReAgent/pull/695/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=facebookresearch#diff-cmVhZ2VudC90ZXN0L2V2YWx1YXRpb24vY2IvdGVzdF91dGlscy5weQ==) | `100.00% <100.00%> (ø)` | |
| [reagent/test/models/test\_linear\_regression\_ucb.py](https://codecov.io/gh/facebookresearch/ReAgent/pull/695/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=facebookresearch#diff-cmVhZ2VudC90ZXN0L21vZGVscy90ZXN0X2xpbmVhcl9yZWdyZXNzaW9uX3VjYi5weQ==) | `100.00% <100.00%> (ø)` | |

facebook-github-bot commented 1 year ago

This pull request has been merged in facebookresearch/ReAgent@25bafe6e3ad4ecf12bc6ab128d31ab140aa8febc.