PathmindAI / nativerl

Train reinforcement learning agents using AnyLogic or Python-based simulations
Apache License 2.0

Observation shape error during training pathmind simulation #491

Closed slinlee closed 2 years ago

slinlee commented 2 years ago

On dev.devpathmind.com, I was trying to train the multi mouse and cheese example.

The key error seems to be:

 File "/app/conda/lib/python3.7/site-packages/ray/rllib/models/preprocessors.py", line 68, in check_shape
    observation, self._obs_space)
ValueError: ('Observation ({}) outside given space ({})!', array([0. , 0.8, 0. , 0.8]), Box([-inf -inf -inf -inf], [inf inf inf inf], (4,), float32))

More context

Traceback (most recent call last):
  File "/app/conda/lib/python3.7/site-packages/ray/tune/trial_runner.py", line 702, in _process_trial
    results = self.trial_executor.fetch_result(trial)
  File "/app/conda/lib/python3.7/site-packages/ray/tune/ray_trial_executor.py", line 686, in fetch_result
    result = ray.get(trial_future[0], timeout=DEFAULT_GET_TIMEOUT)
  File "/app/conda/lib/python3.7/site-packages/ray/_private/client_mode_hook.py", line 47, in wrapper
    return func(*args, **kwargs)
  File "/app/conda/lib/python3.7/site-packages/ray/worker.py", line 1481, in get
    raise value.as_instanceof_cause()
ray.exceptions.RayTaskError(ValueError): ray::PPO.train_buffered() (pid=424, ip=10.20.12.117)
  File "python/ray/_raylet.pyx", line 505, in ray._raylet.execute_task
  File "python/ray/_raylet.pyx", line 449, in ray._raylet.execute_task.function_executor
  File "/app/conda/lib/python3.7/site-packages/ray/_private/function_manager.py", line 556, in actor_method_executor
    return method(__ray_actor, *args, **kwargs)
  File "/app/conda/lib/python3.7/site-packages/ray/tune/trainable.py", line 173, in train_buffered
    result = self.train()
  File "/app/conda/lib/python3.7/site-packages/ray/rllib/agents/trainer.py", line 573, in train
    raise e
  File "/app/conda/lib/python3.7/site-packages/ray/rllib/agents/trainer.py", line 562, in train
    result = Trainable.train(self)
  File "/app/conda/lib/python3.7/site-packages/ray/tune/trainable.py", line 232, in train
    result = self.step()
  File "/app/conda/lib/python3.7/site-packages/ray/rllib/agents/trainer_template.py", line 162, in step
    res = next(self.train_exec_impl)
  File "/app/conda/lib/python3.7/site-packages/ray/util/iter.py", line 756, in __next__
    return next(self.built_iterator)
  File "/app/conda/lib/python3.7/site-packages/ray/util/iter.py", line 783, in apply_foreach
    for item in it:
  File "/app/conda/lib/python3.7/site-packages/ray/util/iter.py", line 783, in apply_foreach
    for item in it:
  File "/app/conda/lib/python3.7/site-packages/ray/util/iter.py", line 843, in apply_filter
    for item in it:
  File "/app/conda/lib/python3.7/site-packages/ray/util/iter.py", line 843, in apply_filter
    for item in it:
  File "/app/conda/lib/python3.7/site-packages/ray/util/iter.py", line 783, in apply_foreach
    for item in it:
  File "/app/conda/lib/python3.7/site-packages/ray/util/iter.py", line 783, in apply_foreach
    for item in it:
  File "/app/conda/lib/python3.7/site-packages/ray/util/iter.py", line 783, in apply_foreach
    for item in it:
  [Previous line repeated 1 more time]
  File "/app/conda/lib/python3.7/site-packages/ray/util/iter.py", line 876, in apply_flatten
    for item in it:
  File "/app/conda/lib/python3.7/site-packages/ray/util/iter.py", line 828, in add_wait_hooks
    item = next(it)
  File "/app/conda/lib/python3.7/site-packages/ray/util/iter.py", line 783, in apply_foreach
    for item in it:
  File "/app/conda/lib/python3.7/site-packages/ray/util/iter.py", line 783, in apply_foreach
    for item in it:
  File "/app/conda/lib/python3.7/site-packages/ray/util/iter.py", line 783, in apply_foreach
    for item in it:
  [Previous line repeated 1 more time]
  File "/app/conda/lib/python3.7/site-packages/ray/util/iter.py", line 471, in base_iterator
    yield ray.get(futures, timeout=timeout)
  File "/app/conda/lib/python3.7/site-packages/ray/_private/client_mode_hook.py", line 47, in wrapper
    return func(*args, **kwargs)
ray.exceptions.RayTaskError(ValueError): ray::RolloutWorker.par_iter_next() (pid=458, ip=10.20.12.117)
  File "python/ray/_raylet.pyx", line 505, in ray._raylet.execute_task
  File "python/ray/_raylet.pyx", line 449, in ray._raylet.execute_task.function_executor
  File "/app/conda/lib/python3.7/site-packages/ray/_private/function_manager.py", line 556, in actor_method_executor
    return method(__ray_actor, *args, **kwargs)
  File "/app/conda/lib/python3.7/site-packages/ray/util/iter.py", line 1152, in par_iter_next
    return next(self.local_it)
  File "/app/conda/lib/python3.7/site-packages/ray/rllib/evaluation/rollout_worker.py", line 332, in gen_rollouts
    yield self.sample()
  File "/app/conda/lib/python3.7/site-packages/ray/rllib/evaluation/rollout_worker.py", line 706, in sample
    batches = [self.input_reader.next()]
  File "/app/conda/lib/python3.7/site-packages/ray/rllib/evaluation/sampler.py", line 96, in next
    batches = [self.get_data()]
  File "/app/conda/lib/python3.7/site-packages/ray/rllib/evaluation/sampler.py", line 223, in get_data
    item = next(self.rollout_provider)
  File "/app/conda/lib/python3.7/site-packages/ray/rllib/evaluation/sampler.py", line 613, in _env_runner
    sample_collector=sample_collector,
  File "/app/conda/lib/python3.7/site-packages/ray/rllib/evaluation/sampler.py", line 808, in _process_observations
    policy_id).transform(raw_obs)
  File "/app/conda/lib/python3.7/site-packages/ray/rllib/models/preprocessors.py", line 187, in transform
    self.check_shape(observation)
  File "/app/conda/lib/python3.7/site-packages/ray/rllib/models/preprocessors.py", line 68, in check_shape
    observation, self._obs_space)
ValueError: ('Observation ({}) outside given space ({})!', array([0. , 0.8, 0. , 0.8]), Box([-inf -inf -inf -inf], [inf inf inf inf], (4,), float32))
2021-11-02 03:55:30,733 INFO trial_runner.py:1009 -- Trial PPO_MultiMouseAndCheese_b50a9_00000: Attempting to restore trial state from last checkpoint.
2021-11-02 03:55:31,000 ERROR trial_runner.py:732 -- Trial PPO_MultiMouseAndCheese_b50a9_00003: Error processing event.

slinlee commented 2 years ago

Reverting to the previous conda environment fixed training for the mouse and cheese examples (including the multi variant).

I'll try to see what was different, because we'll need to build a new conda environment to support the model analyzer functionality.
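
One thing worth checking when comparing the environments: the observation in the error prints as `array([0. , 0.8, 0. , 0.8])` with no dtype suffix, which means it is float64, while the space is a float32 `Box`. The shape and values are all within bounds, so the dtype is the only visible mismatch. The sketch below is a numpy-only diagnostic, not RLlib or gym code. It assumes the space's containment check requires a safe dtype cast, which the thread does not confirm. `explain_mismatch` is a hypothetical helper name.

```python
import numpy as np


def explain_mismatch(obs, low, high, dtype):
    """List the reasons obs would fail a strict Box-style containment check.

    Assumption: the check requires shape equality, values within bounds,
    and a safe cast from the observation dtype to the space dtype.
    """
    obs = np.asarray(obs)
    low = np.asarray(low)
    high = np.asarray(high)
    reasons = []
    if obs.shape != low.shape:
        reasons.append(f"shape {obs.shape} != space shape {low.shape}")
    elif not (np.all(obs >= low) and np.all(obs <= high)):
        reasons.append("value outside [low, high]")
    # float64 -> float32 is not a "safe" cast, so this check fails for float64 input
    if not np.can_cast(obs.dtype, dtype):
        reasons.append(f"dtype {obs.dtype} cannot be safely cast to {np.dtype(dtype)}")
    return reasons


# Values taken from the error message; numpy defaults to float64 here.
obs = np.array([0.0, 0.8, 0.0, 0.8])
low = np.full(4, -np.inf, dtype=np.float32)
high = np.full(4, np.inf, dtype=np.float32)

print(explain_mismatch(obs, low, high, np.float32))
print(explain_mismatch(obs.astype(np.float32), low, high, np.float32))
```

If this is the cause, casting observations to float32 before returning them from the environment would be an alternative to pinning older package versions.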

kepricon commented 2 years ago

I built a new conda environment with gym downgraded (0.21.0 -> 0.18.0) and numpy downgraded (1.18.5 -> 1.16.2), matching the versions the previous conda environment had.

Please test with https://s3.console.aws.amazon.com/s3/buckets/dh-model-analyzer-static-files.pathmind.com?region=us-east-1&prefix=test/conda/1_3_0/
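
To confirm that a rebuilt environment actually carries the pinned versions, a small check like the following could be run inside it. This is only a sketch: the pins are the two versions mentioned in this thread, `find_mismatches` is a hypothetical helper, and `importlib.metadata` requires Python 3.8 or later (on the Python 3.7 environment shown in the traceback, the `importlib_metadata` backport would be needed instead).

```python
from importlib import metadata

# Pins taken from this thread; extend as needed.
EXPECTED = {"gym": "0.18.0", "numpy": "1.16.2"}


def find_mismatches(expected, get_version=metadata.version):
    """Return {package: (wanted, installed)} for every package off its pin.

    installed is None when the package is not installed at all.
    get_version is injectable so the check can be exercised without
    touching the real environment.
    """
    mismatches = {}
    for name, wanted in expected.items():
        try:
            installed = get_version(name)
        except metadata.PackageNotFoundError:
            installed = None
        if installed != wanted:
            mismatches[name] = (wanted, installed)
    return mismatches


for name, (wanted, installed) in find_mismatches(EXPECTED).items():
    print(f"{name}: expected {wanted}, found {installed}")
```

The script prints nothing when every pin matches, so it can be used as a quick sanity check before starting a training run.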

slinlee commented 2 years ago

Training and uploading are confirmed to work with the new conda environment.