hakuhodo-technologies / scope-rl

SCOPE-RL: A Python library for offline reinforcement learning, off-policy evaluation, and selection

https://scope-rl.readthedocs.io/en/latest/

Apache License 2.0

[WIP] Simulation, Env and SyntheticDataset #2

Closed: aiueola closed this 2 years ago

aiueola commented 3 years ago

Type of change

[x] New feature
[ ] Bugfix
[ ] Hotfix
[ ] Other

Description

What has been changed?
- Implemented the Env and SyntheticDataset modules with synthetic simulation
- Tests for the above modules (work in progress)
- Quickstart for data synthesis
How does the logic work?
- The constrained Markov decision process is defined as follows: https://github.com/negocia-inc/rtb_reinforcement_learing/blob/test/_gym/env/rtb.py#L26-L57
- Each timestep is simulated as follows: https://github.com/negocia-inc/rtb_reinforcement_learing/blob/test/_gym/env/rtb.py#L285-L302
- The environment and the agent interact as follows: https://github.com/negocia-inc/rtb_reinforcement_learing/blob/test/_gym/env/rtb.py#L126-L143
- Decision makers can also customize their own environment as follows: https://github.com/negocia-inc/rtb_reinforcement_learing/blob/test/_gym/env/wrapper_rtb.py#L112-L137
- (arguments: https://github.com/negocia-inc/rtb_reinforcement_learing/blob/test/_gym/env/wrapper_rtb.py#L19-L37)
- Finally, we can obtain the dataset as follows: https://github.com/negocia-inc/rtb_reinforcement_learing/blob/test/_gym/dataset/synthetic.py#L46-L111
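
For readers who cannot open the linked files, here is a rough, self-contained sketch of the general pattern described above: a budget-constrained environment, a per-timestep simulation, the environment/agent interaction loop, and the collection of a logged dataset. It is not the code from this PR. `ToyRTBEnv`, `make_random_policy`, and `collect_dataset` are hypothetical names, and the auction model (uniform market prices, a fixed 10% conversion rate, a fixed number of auctions per timestep) is an assumption made purely for illustration.

```python
import random
from dataclasses import dataclass, field


@dataclass
class ToyRTBEnv:
    """Hypothetical toy bidding environment with a budget constraint."""

    initial_budget: float = 100.0
    horizon: int = 7              # timesteps per episode (assumed)
    auctions_per_step: int = 20   # auctions simulated in one timestep (assumed)
    seed: int = 0
    rng: random.Random = field(init=False, repr=False)

    def __post_init__(self):
        self.rng = random.Random(self.seed)
        self.t = 0
        self.budget = self.initial_budget

    def reset(self):
        """Start a new episode and return the initial state."""
        self.t = 0
        self.budget = self.initial_budget
        return (self.t, self.budget)

    def step(self, bid):
        """Simulate one timestep in which every auction receives the same bid."""
        reward = 0
        for _ in range(self.auctions_per_step):
            market_price = self.rng.uniform(0.0, 10.0)
            # Constraint: a win is only possible if the remaining budget covers the cost.
            if bid > market_price and market_price <= self.budget:
                self.budget -= market_price
                reward += int(self.rng.random() < 0.1)  # 1 if the impression converts
        self.t += 1
        done = self.t >= self.horizon
        return (self.t, self.budget), reward, done


def make_random_policy(seed=0, max_bid=10.0):
    """Return a behavior policy that ignores the state and bids uniformly at random."""
    rng = random.Random(seed)
    return lambda state: rng.uniform(0.0, max_bid)


def collect_dataset(env, policy, n_episodes):
    """Roll out the policy and log one transition record per timestep."""
    records = []
    for episode in range(n_episodes):
        state, done = env.reset(), False
        while not done:
            action = policy(state)
            next_state, reward, done = env.step(action)
            records.append(
                {
                    "episode": episode,
                    "state": state,
                    "action": action,
                    "reward": reward,
                    "next_state": next_state,
                    "done": done,
                }
            )
            state = next_state
    return records


if __name__ == "__main__":
    env = ToyRTBEnv(seed=1)
    dataset = collect_dataset(env, make_random_policy(seed=2), n_episodes=3)
    print(len(dataset), "logged transitions")
```

Keeping the behavior policy separate from the environment, as in this sketch, makes it easy to swap in a different data-collection policy without touching the simulation itself.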

Checklist

[ ] pass unit test (or unnecessary)
[ ] no errors on newly made test cases
[ ] no errors on existing test cases
[x] applied black formatter
[x] no errors on flake8
[x] no warnings
[x] work in progress

Comments

Any other comments.
- Tests are work in progress.

aiueola commented 3 years ago

@k-kawakami213 Thank you! I will check and fix it!