Open kronion opened 1 year ago
Deferring to eng to determine final priority. It seems like a P1 to me.
This would be really useful.
Useful when I'm trying to restore a checkpoint from a run that has already completed its previously specified number of iterations, so I can raise the iteration limit and keep training.
Description
I'm trying to restore an RLLib algorithm from a checkpoint and change the configuration before resuming training. My main objective is to change the number of rollout workers between runs, but I may need to adjust other configuration details as well, e.g. env config. I assume this is possible, but I can't find any specific documentation, and the obvious approaches don't seem to work.
For example, this doesn't work:
If I restore a checkpoint from a training session with 5 rollout workers, the new session will also have 5 rollout workers, regardless of what I pass in as `param_space`.
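For concreteness, here is the kind of restore I'm describing, using the plain Algorithm API (just a sketch with placeholder paths and values; as far as I can tell, `Algorithm.from_checkpoint` doesn't take a config argument at all):

```python
from ray.rllib.algorithms.algorithm import Algorithm

# Placeholder path; this checkpoint came from a run with num_rollout_workers=5.
checkpoint_path = "checkpoints/PPO_my_experiment/checkpoint_000100"

# Restores the algorithm, but always with the config that was saved into the
# checkpoint -- there is no obvious way to hand in a modified config here.
algo = Algorithm.from_checkpoint(checkpoint_path)

print(algo.config.num_rollout_workers)  # -> 5, not the new value I want
algo.train()
```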
I also considered the `Tuner.restore()` API, like this:
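Roughly (again just a sketch with placeholder paths, using the `param_space` argument that newer versions of `Tuner.restore()` accept for re-specifying the search space):

```python
from ray import tune
from ray.rllib.algorithms.ppo import PPOConfig

# Placeholder config -- the point is that num_rollout_workers differs from
# the value the original experiment was started with.
new_config = (
    PPOConfig()
    .environment("CartPole-v1")
    .rollouts(num_rollout_workers=10)
)

# Placeholder experiment directory from the earlier run.
tuner = tune.Tuner.restore(
    "~/ray_results/my_experiment",
    trainable="PPO",
    param_space=new_config.to_dict(),
)
tuner.fit()
```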
But the docs specifically say that changing the `param_space` is unsupported: https://docs.ray.io/en/master/tune/api/doc/ray.tune.Tuner.restore.html#ray-tune-tuner-restore

The closest thing I could find was here in the Tune FAQ: https://docs.ray.io/en/latest/tune/faq.html#how-can-i-continue-training-a-completed-tune-experiment-for-longer-and-with-new-configurations-iterative-experimentation
But it's not clear how to apply this to an RLLib `Algorithm`. It isn't obvious how to extract an `AlgorithmConfig` from a checkpoint, modify it, and then build a new `Algorithm` instance.

Assuming there's a pattern for how to modify the config, it would be great to add it to the documentation. If this isn't actually possible, I think it would be an important feature to add.
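For concreteness, the kind of pattern I'm imagining is something like the following (untested sketch; these are my guesses at the right calls, and I assume it would drop optimizer state, timesteps, and other counters):

```python
from ray.rllib.algorithms.algorithm import Algorithm

# Placeholder checkpoint from the earlier 5-worker run.
checkpoint_path = "checkpoints/PPO_my_experiment/checkpoint_000100"

# 1. Restore the old algorithm only to get at its config and policy weights.
old_algo = Algorithm.from_checkpoint(checkpoint_path)
new_config = old_algo.config.copy(copy_frozen=False)
weights = old_algo.get_weights()
old_algo.stop()

# 2. Modify whatever needs to change, e.g. the number of rollout workers.
new_config.rollouts(num_rollout_workers=10)

# 3. Build a fresh Algorithm from the modified config and load the old weights.
new_algo = new_config.build()
new_algo.set_weights(weights)
new_algo.train()
```

If something like this is the intended way to do it, documenting it (or wrapping it in a helper) would already go a long way.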
Link
No response