Closed steveyuwono closed 3 weeks ago
discussion on 06.09.2024:
Hi @steveyuwono @laxmikantbaheti, regarding your discussion about transferring additional data from the env to the agent: the class bf.systems.State already accepts kwargs. This means an env or system can simply attach additional data to a state. That data can then be consumed inside the custom method _compute_action() without any further parameters.
Steve's scenario (agent manages an internal action consumption mechanism) -> purely internal detail of a policy
Laxmikant's scenario (env provides masks for the agent) -> just add as kwarg within the env/system
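To illustrate Laxmikant's scenario, here is a minimal sketch of an env attaching a mask to a state as a kwarg and a policy reading it back in _compute_action(). The classes State, ToyEnv and ToyPolicy are hypothetical stand-ins written for this example and are not the MLPro classes. The kwarg name p_action_mask and the fixed scores are assumptions as well. Only the idea of carrying kwargs on the state and reading them via get_kwargs() comes from this thread.

```python
from typing import Any, Dict, List


class State:
    """Hypothetical stand-in for bf.systems.State: it only stores extra kwargs."""

    def __init__(self, values: List[float], **kwargs: Any) -> None:
        self.values = values
        self._kwargs: Dict[str, Any] = dict(kwargs)

    def get_kwargs(self) -> Dict[str, Any]:
        return self._kwargs


class ToyEnv:
    """Hypothetical env that attaches an action mask to every state."""

    def get_state(self) -> State:
        # The kwarg name 'p_action_mask' is an assumption made for this sketch.
        return State([0.2, 0.7], p_action_mask=[True, False, True])


class ToyPolicy:
    """Hypothetical policy that picks the best-scored action the mask allows."""

    def _compute_action(self, p_state: State) -> int:
        scores = [0.1, 0.9, 0.5]  # fixed scores, for illustration only
        mask = p_state.get_kwargs().get("p_action_mask")
        if mask is None:
            allowed = list(range(len(scores)))  # no mask: every action is valid
        else:
            allowed = [i for i, ok in enumerate(mask) if ok]
        return max(allowed, key=lambda i: scores[i])


action = ToyPolicy()._compute_action(ToyEnv().get_state())
unmasked_action = ToyPolicy()._compute_action(State([0.2, 0.7]))
```

With the mask, action 1 is excluded although it has the highest score. Without any kwarg, the policy falls back to treating all actions as valid, so the signature of _compute_action() stays unchanged in both cases.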
I think we can extend the existing wrapper by two new custom methods, _get_mask() and _add_to_mask(p_action), plus some additional code in _compute_action().
What do you think?
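A rough sketch of how the two proposed methods could interact, assuming the mask tracks actions the agent has already consumed (Steve's scenario). The class MaskedPolicySketch, its constructor, and the parameters p_scores and p_env_mask are hypothetical and only serve the example. The real wrapper and its _compute_action() signature will look different.

```python
from typing import List, Optional, Sequence


class MaskedPolicySketch:
    """Hypothetical policy with an internal mask of consumed actions."""

    def __init__(self, p_num_actions: int) -> None:
        self._consumed: List[bool] = [False] * p_num_actions

    def _get_mask(self) -> List[bool]:
        # True means the action is still available.
        return [not consumed for consumed in self._consumed]

    def _add_to_mask(self, p_action: int) -> None:
        # Mark an action as consumed so it is masked out from now on.
        self._consumed[p_action] = True

    def _compute_action(
        self,
        p_scores: Sequence[float],
        p_env_mask: Optional[Sequence[bool]] = None,
    ) -> int:
        mask = self._get_mask()
        if p_env_mask is not None:
            # Combine the internal mask with a mask provided by the env.
            mask = [own and env for own, env in zip(mask, p_env_mask)]
        allowed = [i for i, ok in enumerate(mask) if ok]
        if not allowed:
            raise RuntimeError("No valid action left")
        action = max(allowed, key=lambda i: p_scores[i])
        self._add_to_mask(action)
        return action


policy = MaskedPolicySketch(3)
scores = [0.3, 0.8, 0.5]
first = policy._compute_action(scores)
second = policy._compute_action(scores)
third = policy._compute_action(scores, p_env_mask=[True, True, True])
```

Each call consumes the chosen action, so repeated calls with the same scores walk through the actions in descending score order. In the real wrapper the scores would come from the wrapped SB3 policy and the env mask from the state kwargs.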
@steveyuwono, you can use the new method State.get_kwargs(), but this raises the minimum required version of MLPro to > 1.9.0.
Description/Motivation
Add the Maskable PPO algorithm provided by SB3-Contrib, which is an extension of SB3, to the pool of objects.
Some methods from the basic wrapper need to be readjusted.
On a personal note @steveyuwono, please refer to the implementation of SSD4OR-RL projects!
Task list
Related issues
...
Cross references
Documentation of Maskable PPO