Closed nick-harder closed 3 months ago
Attention: Patch coverage is 92.52336% with 8 lines in your changes missing coverage. Please review.
Project coverage is 78.31%. Comparing base (1b430ff) to head (1cb7765).
@maurerle @kim-mskw it would be great if you could please review this PR today so I can continue working on the sequential market testing and implementation. Thanks in advance!
I looked over the code; some things are still unclear to me. I think it would be best if Kim tests this branch as well to confirm that everything still works and gives reasonable results.
Thanks! I have tested learning for the single-agent and multi-agent cases (2a and 2b) and everything works for me. But yes, it would be great to get a second look, since this PR brings many changes.
To simulate several markets with one single RL agent, we need to save the rewards only after all the modeled markets have been cleared and the final reward has been calculated. This PR brings the required changes to the framework by scheduling the buffer writes and policy updates every `train_freq` steps. If `train_freq` is configured to match the closure of all modeled markets, the correct reward is saved, which makes it possible to model several markets with one RL agent.
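The idea above can be sketched in a few lines of Python. This is a hypothetical illustration, not the actual framework code: the `RewardScheduler` class, its method names, and the market IDs are all assumptions made for the example. It shows why `train_freq` must line up with the closure of all modeled markets before a reward is committed to the buffer.

```python
# Hypothetical sketch (not the framework's real API): collect per-market
# rewards and commit them to the buffer only once per train_freq steps,
# i.e. only after all modeled markets have cleared.

class RewardScheduler:
    def __init__(self, market_ids, train_freq):
        self.market_ids = set(market_ids)  # markets the RL agent participates in
        self.train_freq = train_freq       # steps between buffer writes / policy updates
        self.pending = {}                  # market_id -> reward collected this cycle
        self.buffer = []                   # stand-in for the replay buffer
        self.step = 0

    def on_market_cleared(self, market_id, reward):
        """Record the reward from one market clearing."""
        self.pending[market_id] = reward

    def on_step_end(self):
        """Called once per simulation step; commits only when the cycle completes."""
        self.step += 1
        if self.step % self.train_freq != 0:
            return False
        # At the configured frequency, all modeled markets should have cleared,
        # so the summed reward is the agent's final reward for this cycle.
        assert set(self.pending) == self.market_ids, \
            "train_freq is misaligned with the market closures"
        self.buffer.append(sum(self.pending.values()))
        self.pending.clear()
        return True  # the caller would trigger a policy update here


# Two markets, so train_freq=2 matches the closure of both.
sched = RewardScheduler(market_ids=["EOM", "CRM"], train_freq=2)
sched.on_market_cleared("EOM", 1.0)
assert sched.on_step_end() is False  # only one market cleared; nothing saved yet
sched.on_market_cleared("CRM", 0.5)
assert sched.on_step_end() is True   # cycle complete: reward 1.5 goes to the buffer
print(sched.buffer)                  # -> [1.5]
```

If `train_freq` were set to 1 in this sketch, the assertion would fire on the first step, which mirrors the misconfiguration the PR description warns against: saving before all markets have cleared would record an incomplete reward.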