Closed garymm closed 8 months ago
Seems __all__ may be added here in Learner.compile_results:
https://github.com/ray-project/ray/blob/40223ff75a31c4c3fc490923f9578964102cbc70/rllib/core/learner/learner.py#L752
I'm not really sure what __all__ is supposed to be for, so I'm not sure where the right place to filter it out is. CC @sven1977
Hey @garymm , thanks for raising this issue!
You are absolutely right, this is causing a problem and needs a fix. We usually don't run anything with the Algorithm's default implementation of training_step (let alone multi-agent stuff), so this slipped through.
In PPO's training_step method, we do something like:
policies_to_update = set(train_results.keys()) - {ALL_MODULES} # <- ALL_MODULES == "__all__"
and then pass that as policies into the sync_weights call. This is similar to your suggestion.
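To illustrate the filtering idea described above, here is a minimal sketch in plain Python. It does not import Ray; the helper name policies_to_sync and the contents of the results dict are made up for illustration, and only the "__all__" key and the set-difference step come from this thread.

```python
# Sketch only: plain Python, no Ray import. ALL_MODULES mirrors the "__all__"
# key mentioned above; the helper name and the dict contents are hypothetical.
ALL_MODULES = "__all__"


def policies_to_sync(train_results: dict) -> set:
    """Return the per-module keys, dropping the aggregate "__all__" entry."""
    return set(train_results.keys()) - {ALL_MODULES}


# Made-up results: two modules plus the aggregate entry.
train_results = {
    "policy_0": {"total_loss": 0.12},
    "policy_1": {"total_loss": 0.34},
    ALL_MODULES: {"total_loss": 0.46},
}

policies_to_update = policies_to_sync(train_results)
print(sorted(policies_to_update))
```

The resulting set is what would be handed to the weight-sync step, so the aggregate key is never treated as a policy ID.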
We'll provide a fix-PR ...
In the meantime, you can also take a look at this currently-in-review PR, which brings self-play and league-based self-play into the new API stack, including example scripts (for PPO): https://github.com/ray-project/ray/pull/43276
But this PR will not fix your problem. I'll create a new one.
PR in review: https://github.com/ray-project/ray/pull/43316
What happened + What you expected to happen
I'm trying to get a very basic RL Module working based on the examples and tests in the repo and I hit this issue.
When you run the attached reproduction script, it fails with:
I'm not sure if this is the right fix, but if I change this in Algorithm.training_step:
to:
It seems to fix the problem. I'm not at all confident that's right, though, as I'm very new to this code base.
I'm not sure how to work around this without modifying Ray. I'll post if I figure out a work-around.
Versions / Dependencies
Ray 2.9.0
Reproduction script
Issue Severity
High: It blocks me from completing my task.