ray-project / ray

Ray is a unified framework for scaling AI and Python applications. Ray consists of a core distributed runtime and a set of AI Libraries for accelerating ML workloads.
https://ray.io
Apache License 2.0

[rllib]: a bug in model sampling from MBMPO's model ensemble with multiple workers #24517

Open anonymous-pusher opened 2 years ago

anonymous-pusher commented 2 years ago

What happened + What you expected to happen

Training with MBMPO using multiple workers gives the following error:

(MBMPOTrainer pid=98221)     next_obs_batch = self.model.predict_model_batches(
(MBMPOTrainer pid=98221)   File "/home/jones/anaconda3/envs/ray_latest/lib/python3.8/site-packages/ray/rllib/agents/mbmpo/model_ensemble.py", line 350, in predict_model_batches
(MBMPOTrainer pid=98221)     delta = self.forward(x).detach().cpu().numpy()
(MBMPOTrainer pid=98221)   File "/home/jones/anaconda3/envs/ray_latest/lib/python3.8/site-packages/ray/rllib/agents/mbmpo/model_ensemble.py", line 187, in forward
(MBMPOTrainer pid=98221)     return self.dynamics_ensemble[self.sample_index](x)
(MBMPOTrainer pid=98221) IndexError: list index out of range

From debugging, it seems that the model to use from the ensemble is chosen with: self.sample_index = int((worker_index - 1) / self.num_models)

That index is then used in forward with: self.dynamics_ensemble[self.sample_index](x)

So if there is only one model (self.num_models = 1) but multiple workers (worker_index > 1), say worker_index = 10, the sample index would be int((10 - 1) / 1) = 9. Since the ensemble list has only one entry, self.dynamics_ensemble[9] raises the IndexError.

A quick fix would be to change it to: self.sample_index = (worker_index - 1) % self.num_models
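To make the difference between the two formulas concrete, here is a minimal standalone sketch. It does not import RLlib; the helper names (division_index, modulo_index, out_of_range_workers) are hypothetical and only mirror the two expressions quoted above, and it assumes remote workers are numbered 1..num_workers.

```python
from typing import Callable, List


def division_index(worker_index: int, num_models: int) -> int:
    # Mirrors the expression quoted from model_ensemble.py in this issue.
    return int((worker_index - 1) / num_models)


def modulo_index(worker_index: int, num_models: int) -> int:
    # Mirrors the proposed quick fix: wrap around the ensemble size.
    return (worker_index - 1) % num_models


def out_of_range_workers(
    index_fn: Callable[[int, int], int], num_workers: int, num_models: int
) -> List[int]:
    """Return the worker indices whose computed sample index is invalid."""
    ensemble = list(range(num_models))  # stand-in for dynamics_ensemble
    bad = []
    # Assumption: remote workers are numbered 1..num_workers.
    for worker_index in range(1, num_workers + 1):
        idx = index_fn(worker_index, num_models)
        if not 0 <= idx < len(ensemble):
            bad.append(worker_index)
    return bad


if __name__ == "__main__":
    # One model, ten workers: every worker except the first is out of range.
    print(out_of_range_workers(division_index, num_workers=10, num_models=1))
    # The modulo version always stays inside the ensemble.
    print(out_of_range_workers(modulo_index, num_workers=10, num_models=1))
```

Note that with the division formula the index only goes out of range once worker_index - 1 reaches num_models squared, which may be why the problem does not show up with few workers and several models.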

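However, there is a comment in the code saying # For each worker, choose a random model to choose trajectories from, so the selection is presumably meant to be random rather than a fixed mapping, and the modulo version may not match the intended behavior.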
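If a random per-worker choice is what the comment intends, one possible shape is sketched below. This is not RLlib's code: random_model_index and its seed parameter are hypothetical, and seeding from the worker index is just one assumed way to keep the choice reproducible.

```python
import random


def random_model_index(worker_index: int, num_models: int, seed: int = 0) -> int:
    """Pick a model index in [0, num_models) for the given worker.

    Hypothetical helper: the RNG is seeded from (seed, worker_index) so each
    worker gets a reproducible choice that is always a valid list index.
    """
    if num_models < 1:
        raise ValueError("num_models must be at least 1")
    rng = random.Random(seed * 1_000_003 + worker_index)
    return rng.randrange(num_models)


if __name__ == "__main__":
    # Every worker gets an index that is valid for a 5-model ensemble.
    for worker_index in range(1, 11):
        print(worker_index, random_model_index(worker_index, num_models=5))
```

Unlike the modulo mapping, this does not guarantee that every model in the ensemble is used by some worker.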

Versions / Dependencies

Version 1.12, but the bug is still present in the current version as well.

Reproduction script

Try with an example but change the spec to have multiple workers and multiple models

Issue Severity

High: It blocks me from completing my task.

kouroshHakha commented 2 years ago

Reproduction script

Try with an example but change the spec to have multiple workers and multiple models

Is it possible for you to provide a minimal script for reproducing this error?