Open mzat-msft opened 1 year ago
Something you can do here is directly restore the RLModule that is inside of the policy instead, either for training or for inference.
Here are some tests that act as pretty good documentation on the new way that we recommend restoring trained policies/RLModules:
Let me know if something like this works for you. Thanks :)
Hi, thanks for your suggestion.
Are you suggesting that I rebuild the algorithm config, override the number of workers, and then use Algorithm.restore() to load the weights?
IIUC this is equivalent to what I implemented here? https://github.com/Azure/plato/commit/cfba87d06ed0a0d883a39f70e0d393cd0c812391
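For readers following along, the approach described in this comment (rebuild the config, lower the worker count, then restore) could be sketched roughly as below. The helper name `rebuild_and_restore` is hypothetical, and the sketch assumes the config you pass in otherwise matches the one used for training (same environment, model, and framework). It relies only on `AlgorithmConfig.rollouts()`, `AlgorithmConfig.build()`, and `Algorithm.restore()` as they exist in Ray 2.3; it is not the linked commit's implementation.

```python
def rebuild_and_restore(config, checkpoint_path, num_rollout_workers=0):
    """Build an algorithm from `config` with fewer workers, then load a checkpoint.

    `config` is expected to behave like an RLlib AlgorithmConfig, i.e. to offer
    `.rollouts(num_rollout_workers=...)` and `.build()`.
    """
    # Override the worker count so the restore does not request the
    # resources that were used during training.
    config = config.rollouts(num_rollout_workers=num_rollout_workers)
    # Build a fresh algorithm, then load weights and state from the checkpoint.
    algo = config.build()
    algo.restore(checkpoint_path)
    return algo


# Hypothetical usage (PPO and "CartPole-v1" are placeholders, since the
# thread does not name the algorithm or environment):
#
#   from ray.rllib.algorithms.ppo import PPOConfig
#   config = PPOConfig().environment("CartPole-v1").framework("tf2")
#   algo = rebuild_and_restore(config, "/path/to/checkpoint")
```

The trade-off is that the caller has to carry the original config along with the checkpoint, which is also the direction the maintainers say they plan to return to.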
Yes, we need to fix this. :) We will (in the near future) go back to requiring the user to always bring along their (original or changed) configs when restoring.
For now as a workaround, the following hack should work:
from ray.rllib.algorithms.algorithm import Algorithm
from ray.rllib.utils.checkpoints import get_checkpoint_info
# Instead of calling .from_checkpoint directly, do this procedure
# (`checkpoint` is the path to your checkpoint directory):
checkpoint_info = get_checkpoint_info(checkpoint)
state = Algorithm._checkpoint_info_to_algorithm_state(
    checkpoint_info=checkpoint_info,
    policy_ids=None,
    policy_mapping_fn=None,
    policies_to_train=None,
)
# Drop in your own, altered (num_rollout_workers?) AlgorithmConfig object here
# (not the old config dict!).
state["config"] = ...
algo = Algorithm.from_state(state)
# This `algo` should now have/require fewer rollout workers.
What happened + What you expected to happen
When I train an algorithm with tune specifying, for example, num_tune_samples=10 and try to restore the best algorithm using Algorithm.from_checkpoint(), Ray tries to get 10 CPUs from the machine. If the machine does not have enough CPUs available, it starts throwing a warning and never restores the algorithm.
I would expect this to be portable and to work on any machine I bring the checkpoints to.
Versions / Dependencies
Observed with ray==2.3.0 and tensorflow==2.11.1 on Linux, but I believe it is a common issue.
Reproduction script
Issue Severity
High: It blocks me from completing my task.