Open ttumiel opened 1 year ago
Yes that would be great. I suggest implementing it based on #338. #338 uses EnvPool's async API, which is equivalent to the regular vec env when async_batch_size = num_envs.
I was thinking about this issue more and think that you should have two types of observations:
Each of these two obs types then needs to be paired with a corresponding network.
Feel free to make a PR :) Thanks.
Cc @edbeeching, this PR could help deal with Godot rl environments.
Hi @ttumiel just following up with this. Are you still interested in the issue?
Yes! Sorry about the delay, I'll post a PR soon :)
Problem Description
Would it be useful to add a complex (nested/dictionary) action and obs space variant of the PPO algo? I did this for minerl and wondered if it would be useful to contribute into the main library? I'd happily make a PR.
Checklist
Current Behavior
Currently PPO only supports continuous or discrete actions separately and a single array observation.
Expected Behavior
PPO can support arbitrary complex action and observation spaces.
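To make the expected behavior concrete, here is a minimal sketch of how a PPO log-probability could be computed for a dictionary action. It assumes the action components are independent given the observation, so the joint log-probability is the sum of the per-component log-probabilities. The keys ("attack", "camera"), the `policy_out` layout and all function names are hypothetical and only for illustration; this is not how CleanRL or any PR implements it.

```python
import math


def categorical_log_prob(logits, action):
    """Log-probability of a discrete action under softmax(logits)."""
    m = max(logits)  # subtract the max for numerical stability
    log_z = m + math.log(sum(math.exp(l - m) for l in logits))
    return logits[action] - log_z


def gaussian_log_prob(mean, std, action):
    """Log-density of a scalar continuous action under Normal(mean, std)."""
    z = (action - mean) / std
    return -0.5 * z * z - math.log(std) - 0.5 * math.log(2.0 * math.pi)


def joint_log_prob(policy_out, action):
    """Sum per-component log-probs (assumes independent components)."""
    total = 0.0
    for key, act in action.items():
        kind, params = policy_out[key]
        if kind == "discrete":
            total += categorical_log_prob(params, act)
        elif kind == "continuous":
            mean, std = params
            total += gaussian_log_prob(mean, std, act)
        else:
            raise ValueError(f"unknown action kind: {kind}")
    return total


# Hypothetical policy output: one discrete and one continuous component.
policy_out = {
    "attack": ("discrete", [0.0, 0.0]),    # two equally likely choices
    "camera": ("continuous", (0.0, 1.0)),  # (mean, std)
}
action = {"attack": 1, "camera": 0.0}
log_prob = joint_log_prob(policy_out, action)
```

The summed log-probability can then be used in the PPO ratio the same way a single-space log-probability would be.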
Possible Solution
Use tree to map over actions and observations.
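As a rough illustration of the idea, mapping a function over every leaf of a nested observation could look like the sketch below. It uses a small pure-Python helper so it runs without extra dependencies. The helper and the example observation are hypothetical; the `tree` library (dm-tree) offers `tree.map_structure` for the same purpose and would likely be the better choice in practice.

```python
from typing import Any, Callable


def map_structure(fn: Callable[[Any], Any], structure: Any) -> Any:
    """Apply fn to every leaf of a nested dict/list/tuple, keeping the layout.

    Anything that is not a dict, list or tuple is treated as a leaf.
    Namedtuples are not handled in this sketch.
    """
    if isinstance(structure, dict):
        return {k: map_structure(fn, v) for k, v in structure.items()}
    if isinstance(structure, (list, tuple)):
        return type(structure)(map_structure(fn, v) for v in structure)
    return fn(structure)


# Hypothetical nested observation with scalar leaves standing in for arrays.
obs = {
    "inventory": {"wood": 3, "stone": 1},
    "compass": (90.0, 45.0),
}

# Example: apply the same preprocessing (here, doubling) to every leaf.
scaled = map_structure(lambda x: x * 2, obs)
```

The same pattern applies to actions, for example when converting every sampled action component to the type the environment expects.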