hakuhodo-technologies/scope-rl
SCOPE-RL: A python library for offline reinforcement learning, off-policy evaluation, and selection
https://scope-rl.readthedocs.io/en/latest/
Apache License 2.0
[WIP] Simulation, Env and SyntheticDataset
#2
Closed
aiueola closed 2 years ago
aiueola commented 3 years ago
Type of change
[x] New feature
[ ] Bugfix
[ ] hotfix
[ ] Other
Description
What has been changed?
Implemented the Env and SyntheticDataset modules with synthetic simulation
Tests for the above modules (work in progress)
Quickstart for the data synthesis
How does the logic work?
The constrained Markov decision process (CMDP) is defined as follows:
https://github.com/negocia-inc/rtb_reinforcement_learing/blob/test/_gym/env/rtb.py#L26-L57
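For reference, a constrained MDP is usually written in the following textbook form. This is a generic sketch, not a transcription of the linked code: the symbols ($r_t$ for the reward, $c_t$ for the cost, $C$ for the budget, $\gamma$ for the discount factor, $T$ for the episode length) are assumptions chosen for illustration, and the linked definition may differ in its details.

```latex
\max_{\pi} \; \mathbb{E}_{\pi}\!\left[ \sum_{t=1}^{T} \gamma^{t-1} r_t \right]
\quad \text{subject to} \quad
\mathbb{E}_{\pi}\!\left[ \sum_{t=1}^{T} c_t \right] \le C
```

Given the file name `rtb.py`, one natural reading is that $r_t$ counts outcomes such as clicks or conversions and $c_t$ is the spend at timestep $t$, but that is an interpretation rather than something stated in this description.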
Each timestep is simulated as follows:
https://github.com/negocia-inc/rtb_reinforcement_learing/blob/test/_gym/env/rtb.py#L285-L302
We have the following interaction between environment and agent:
https://github.com/negocia-inc/rtb_reinforcement_learing/blob/test/_gym/env/rtb.py#L126-L143
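Since the linked lines are not reproduced here, the following is a minimal, self-contained sketch of what a gym-style interaction loop with a budget-constrained environment can look like. `ToyBiddingEnv`, `run_episode`, the market-price range, and the 0.3 reward probability are all hypothetical and chosen only for illustration; they are not the environment implemented in this PR.

```python
import random


class ToyBiddingEnv:
    """Hypothetical stand-in for a budget-constrained bidding environment."""

    def __init__(self, budget=100.0, horizon=7, seed=0):
        self.initial_budget = budget
        self.horizon = horizon
        self.rng = random.Random(seed)

    def reset(self):
        # Start a new episode with a full budget.
        self.t = 0
        self.remaining_budget = self.initial_budget
        return (self.t, self.remaining_budget)

    def step(self, action):
        # Assumption: the action is a non-negative bid price for this timestep.
        bid = max(0.0, float(action))
        market_price = self.rng.uniform(5.0, 25.0)
        # The bid wins only if it beats the market price and is affordable.
        won = bid >= market_price and market_price <= self.remaining_budget
        cost = market_price if won else 0.0
        # A won auction yields a reward with (made-up) probability 0.3.
        reward = 1.0 if won and self.rng.random() < 0.3 else 0.0
        self.remaining_budget -= cost
        self.t += 1
        done = self.t >= self.horizon or self.remaining_budget <= 0.0
        return (self.t, self.remaining_budget), reward, done, {"cost": cost}


def run_episode(env, policy):
    """Roll out one episode and return (total reward, total cost)."""
    obs = env.reset()
    done = False
    total_reward, total_cost = 0.0, 0.0
    while not done:
        action = policy(obs)
        obs, reward, done, info = env.step(action)
        total_reward += reward
        total_cost += info["cost"]
    return total_reward, total_cost


env = ToyBiddingEnv(budget=100.0, horizon=7, seed=0)
# A constant-bid policy is enough to exercise the loop.
total_reward, total_cost = run_episode(env, policy=lambda obs: 15.0)
```

The loop follows the classic `reset` / `step` pattern returning `(obs, reward, done, info)`; the actual environment may expose a different observation or action format.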
Decision makers can also customize their own environment as follows:
https://github.com/negocia-inc/rtb_reinforcement_learing/blob/test/_gym/env/wrapper_rtb.py#L112-L137
(arguments: https://github.com/negocia-inc/rtb_reinforcement_learing/blob/test/_gym/env/wrapper_rtb.py#L19-L37)
Finally, we can obtain a dataset as follows:
https://github.com/negocia-inc/rtb_reinforcement_learing/blob/test/_gym/dataset/synthetic.py#L46-L111
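To make the idea of obtaining a logged dataset concrete, here is a rough sketch that rolls out an epsilon-greedy behavior policy in a toy simulator and records each transition together with the probability of the chosen action. Everything in it is an assumption for illustration: `toy_step`, `collect_logged_data`, the discrete bid set, and the dictionary keys (`state`, `action`, `reward`, `done`, `pscore`) are hypothetical and do not reflect the schema of the SyntheticDataset module in this PR.

```python
import random


def toy_step(remaining_budget, bid, rng):
    """Hypothetical one-step simulator returning (reward, cost)."""
    market_price = rng.uniform(5.0, 25.0)
    won = bid >= market_price and market_price <= remaining_budget
    cost = market_price if won else 0.0
    reward = 1.0 if won and rng.random() < 0.3 else 0.0
    return reward, cost


def collect_logged_data(n_episodes=10, horizon=7, budget=100.0,
                        epsilon=0.2, seed=0):
    """Roll out an epsilon-greedy behavior policy and log every transition."""
    rng = random.Random(seed)
    bids = [5.0, 10.0, 15.0, 20.0]  # discrete action set (assumption)
    default_action = 2              # action the behavior policy prefers
    data = {key: [] for key in ("state", "action", "reward", "done", "pscore")}
    for _ in range(n_episodes):
        remaining = budget
        for t in range(horizon):
            # With probability epsilon pick uniformly, else the default action.
            if rng.random() < epsilon:
                action = rng.randrange(len(bids))
            else:
                action = default_action
            # Probability that the behavior policy chose this action.
            p_uniform = epsilon / len(bids)
            if action == default_action:
                pscore = (1.0 - epsilon) + p_uniform
            else:
                pscore = p_uniform
            reward, cost = toy_step(remaining, bids[action], rng)
            data["state"].append((t, remaining))
            data["action"].append(action)
            data["reward"].append(reward)
            data["done"].append(t == horizon - 1)  # fixed-length episodes
            data["pscore"].append(pscore)
            remaining -= cost
    return data


logged = collect_logged_data()
```

Logging the action probability alongside each transition is what later allows importance-weighted off-policy estimators to be computed from the data, which is why the sketch uses a stochastic behavior policy.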
Checklist
[ ] pass unit test (or unnecessary)
[ ] no errors on newly made test cases
[ ] no errors on existing test cases
[x] applied black formatter
[x] no errors on flake8
[x] no warnings
[x] work in progress
Comments
Any other comments.
Tests are a work in progress.
aiueola commented 3 years ago
@k-kawakami213 Thank you! I will check and fix it!