Hi @18goldr
Are you using a custom trainer, policy architecture, or env_manager? The LLAPI expects a base_env.ActionTuple as the input to set_actions/set_action_for_agent, which is new to this release.
@andrewcoh Hi, thank you for your advice. I have encountered the same problem. Following your advice, I looked it up in the API description and found this:
"Set Actions :env.set_actions(behavior_name: str, action: ActionTuple) Sets the actions for a whole agent group. action is an ActionTuple, which is made up of a 2D np.array of dtype=np.int32 for discrete actions, and dtype=np.float32 for continuous actions. The first dimension of np.array in the tuple is the number of agents that requested a decision since the last call to env.step(). The second dimension is the number of discrete or continuous actions for the corresponding array."
However, there is no more information. Could you share an example of how to define the tuple? I cannot manage it myself.
I have some code that worked with the previous version of the ML-Agents toolkit:
if i < 10:
    action = np.array([[1.0, 0.0]], dtype=np.float32)
    env.set_actions(single, action)
if i == 15:
    action = np.array([[0.0, 1.0]], dtype=np.float32)
    env.set_actions(single, action)
env.step()
Could you help me modify it so that it works?
@andrewcoh OK, I checked the source code and figured it out. Thank you for the information.
@18goldr Hi, based on my code above, here is what I changed to fix it.
It should be:
import numpy as np
from mlagents_envs.base_env import ActionTuple

if i < 10:
    # Wrap the continuous action array in an ActionTuple; 'single' is the behavior name
    action = ActionTuple(np.array([[1.0, 0.0]], dtype=np.float32))
    env.set_actions(single, action)
if i == 15:
    action = ActionTuple(np.array([[0.0, 1.0]], dtype=np.float32))
    env.set_actions(single, action)
env.step()
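For reference, a slightly fuller sketch of the same idea (env, behavior_name, and the action sizes below are placeholders, not values from this thread): the first dimension of each array is the number of agents that requested a decision, and the dtype is float32 for continuous actions and int32 for discrete ones.
import numpy as np
from mlagents_envs.base_env import ActionTuple

decision_steps, terminal_steps = env.get_steps(behavior_name)
n_agents = len(decision_steps)

# Continuous behavior: float32 array of shape (n_agents, continuous_action_size)
cont_action = ActionTuple(continuous=np.zeros((n_agents, 2), dtype=np.float32))

# Discrete behavior: int32 array of shape (n_agents, number_of_discrete_branches)
disc_action = ActionTuple(discrete=np.zeros((n_agents, 1), dtype=np.int32))

# Pass whichever tuple matches the behavior's action spec
env.set_actions(behavior_name, cont_action)
env.step()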
Same problem when I run unity3d_env_local.py (the RLlib example for the Unity3D environment).
Describe the bug
If I run unity3d_env_local.py (the RLlib example for the Unity3D environment), it returns the error below:
Failure # 1 (occurred at 2020-12-28_15-57-15)
Traceback (most recent call last):
File "/home/jinprelude/anaconda3/envs/rllib/lib/python3.7/site-packages/ray/tune/trial_runner.py", line 519, in _process_trial
result = self.trial_executor.fetch_result(trial)
File "/home/jinprelude/anaconda3/envs/rllib/lib/python3.7/site-packages/ray/tune/ray_trial_executor.py", line 497, in fetch_result
result = ray.get(trial_future[0], timeout=DEFAULT_GET_TIMEOUT)
File "/home/jinprelude/anaconda3/envs/rllib/lib/python3.7/site-packages/ray/_private/client_mode_hook.py", line 47, in wrapper
return func(*args, **kwargs)
File "/home/jinprelude/anaconda3/envs/rllib/lib/python3.7/site-packages/ray/worker.py", line 1391, in get
raise value.as_instanceof_cause()
ray.exceptions.RayTaskError(AttributeError): ray::PPO.train() (pid=24483, ip=192.168.0.176)
File "python/ray/_raylet.pyx", line 479, in ray._raylet.execute_task
File "python/ray/_raylet.pyx", line 431, in ray._raylet.execute_task.function_executor
File "/home/jinprelude/anaconda3/envs/rllib/lib/python3.7/site-packages/ray/rllib/agents/trainer.py", line 523, in train
raise e
File "/home/jinprelude/anaconda3/envs/rllib/lib/python3.7/site-packages/ray/rllib/agents/trainer.py", line 509, in train
result = Trainable.train(self)
File "/home/jinprelude/anaconda3/envs/rllib/lib/python3.7/site-packages/ray/tune/trainable.py", line 183, in train
result = self.step()
File "/home/jinprelude/anaconda3/envs/rllib/lib/python3.7/site-packages/ray/rllib/agents/trainer_template.py", line 148, in step
res = next(self.train_exec_impl)
File "/home/jinprelude/anaconda3/envs/rllib/lib/python3.7/site-packages/ray/util/iter.py", line 756, in __next__
return next(self.built_iterator)
File "/home/jinprelude/anaconda3/envs/rllib/lib/python3.7/site-packages/ray/util/iter.py", line 783, in apply_foreach
for item in it:
File "/home/jinprelude/anaconda3/envs/rllib/lib/python3.7/site-packages/ray/util/iter.py", line 783, in apply_foreach
for item in it:
File "/home/jinprelude/anaconda3/envs/rllib/lib/python3.7/site-packages/ray/util/iter.py", line 843, in apply_filter
for item in it:
File "/home/jinprelude/anaconda3/envs/rllib/lib/python3.7/site-packages/ray/util/iter.py", line 843, in apply_filter
for item in it:
File "/home/jinprelude/anaconda3/envs/rllib/lib/python3.7/site-packages/ray/util/iter.py", line 783, in apply_foreach
for item in it:
File "/home/jinprelude/anaconda3/envs/rllib/lib/python3.7/site-packages/ray/util/iter.py", line 783, in apply_foreach
for item in it:
File "/home/jinprelude/anaconda3/envs/rllib/lib/python3.7/site-packages/ray/util/iter.py", line 783, in apply_foreach
for item in it:
[Previous line repeated 1 more time]
File "/home/jinprelude/anaconda3/envs/rllib/lib/python3.7/site-packages/ray/util/iter.py", line 876, in apply_flatten
for item in it:
File "/home/jinprelude/anaconda3/envs/rllib/lib/python3.7/site-packages/ray/util/iter.py", line 828, in add_wait_hooks
item = next(it)
File "/home/jinprelude/anaconda3/envs/rllib/lib/python3.7/site-packages/ray/util/iter.py", line 783, in apply_foreach
for item in it:
File "/home/jinprelude/anaconda3/envs/rllib/lib/python3.7/site-packages/ray/util/iter.py", line 783, in apply_foreach
for item in it:
File "/home/jinprelude/anaconda3/envs/rllib/lib/python3.7/site-packages/ray/util/iter.py", line 783, in apply_foreach
for item in it:
File "/home/jinprelude/anaconda3/envs/rllib/lib/python3.7/site-packages/ray/rllib/execution/rollout_ops.py", line 69, in sampler
yield workers.local_worker().sample()
File "/home/jinprelude/anaconda3/envs/rllib/lib/python3.7/site-packages/ray/rllib/evaluation/rollout_worker.py", line 645, in sample
batches = [self.input_reader.next()]
File "/home/jinprelude/anaconda3/envs/rllib/lib/python3.7/site-packages/ray/rllib/evaluation/sampler.py", line 94, in next
batches = [self.get_data()]
File "/home/jinprelude/anaconda3/envs/rllib/lib/python3.7/site-packages/ray/rllib/evaluation/sampler.py", line 216, in get_data
item = next(self.rollout_provider)
File "/home/jinprelude/anaconda3/envs/rllib/lib/python3.7/site-packages/ray/rllib/evaluation/sampler.py", line 663, in _env_runner
base_env.send_actions(actions_to_send)
File "/home/jinprelude/anaconda3/envs/rllib/lib/python3.7/site-packages/ray/rllib/env/base_env.py", line 399, in send_actions
obs, rewards, dones, infos = env.step(agent_dict)
File "/home/jinprelude/anaconda3/envs/rllib/lib/python3.7/site-packages/ray/rllib/env/unity3d_env.py", line 129, in step
action_dict[key])
File "/home/jinprelude/anaconda3/envs/rllib/lib/python3.7/site-packages/mlagents_envs/environment.py", line 356, in set_action_for_agent
action = action_spec._validate_action(action, None, behavior_name)
File "/home/jinprelude/anaconda3/envs/rllib/lib/python3.7/site-packages/mlagents_envs/base_env.py", line 404, in _validate_action
if actions.continuous.shape != _expected_shape:
AttributeError: 'numpy.ndarray' object has no attribute 'continuous'
To Reproduce
I followed the steps in the comments in unity3d_env_local.py. I ran the script with the torch framework, but tf returns the same error.
1) Install Unity3D and pip install mlagents.
2) Open the Unity3D Editor and load an example scene from the following ml-agents pip package location: .../ml-agents/Project/Assets/ML-Agents/Examples/
3) Change the default framework from tf to torch.
4) Run the script (3DBall).
Console logs / stack traces
cd /home/jinprelude/Documents/rllib ; /usr/bin/env /home/jinprelude/anaconda3/envs/rllib/bin/python /home/jinprelude/.vscode/extensions/ms-python.python-2020.12.424452561/pythonFiles/lib/python/debugpy/launcher 44291 -- /home/jinprelude/Documents/rllib/run_unity3d.py
WARNING:tensorflow:From /home/jinprelude/anaconda3/envs/rllib/lib/python3.7/site-packages/tensorflow/python/compat/v2_compat.py:96: disable_resource_variables (from tensorflow.python.ops.variable_scope) is deprecated and will be removed in a future version.
Instructions for updating:
non-resource variables are not supported in the long term
2020-12-28 15:56:50,417 INFO services.py:1171 -- View the Ray dashboard at http://127.0.0.1:8265
== Status ==
Memory usage on this node: 8.3/62.8 GiB
Using FIFO scheduling algorithm.
Resources requested: 1/8 CPUs, 0/1 GPUs, 0.0/33.59 GiB heap, 0.0/11.57 GiB objects (0/1.0 accelerator_type:GTX)
Result logdir: /home/jinprelude/ray_results/PPO
Number of trials: 1/1 (1 RUNNING)
(pid=24483) WARNING:tensorflow:From /home/jinprelude/anaconda3/envs/rllib/lib/python3.7/site-packages/tensorflow/python/compat/v2_compat.py:96: disable_resource_variables (from tensorflow.python.ops.variable_scope) is deprecated and will be removed in a future version.
(pid=24483) Instructions for updating:
(pid=24483) non-resource variables are not supported in the long term
(pid=24483) 2020-12-28 15:56:54,184 INFO trainer.py:633 -- Current log_level is WARN. For more information, set 'log_level': 'INFO' / 'DEBUG' or use the -v and -vv flags.
(pid=24483) No game binary provided, will use a running Unity editor instead.
(pid=24483) Make sure you are pressing the Play (|>) button in your editor to start.
(pid=24483) 2020-12-28 15:57:14,621 INFO trainable.py:102 -- Trainable.setup took 20.438 seconds. If your trainable is slow to initialize, consider setting reuse_actors=True to reduce actor creation overheads.
(pid=24483) 2020-12-28 15:57:14,621 WARNING util.py:43 -- Install gputil for GPU system monitoring.
(pid=24483) Created UnityEnvironment for port 5004
(pid=24483) 2020-12-28 15:57:14,710 WARNING deprecation.py:30 -- DeprecationWarning: `env_index` has been deprecated. Use `episode.env_id` instead. This will raise an error in the future!
2020-12-28 15:57:15,231 ERROR trial_runner.py:607 -- Trial PPO_unity3d_dd8d1_00000: Error processing event.
Traceback (most recent call last):
File "/home/jinprelude/anaconda3/envs/rllib/lib/python3.7/site-packages/ray/tune/trial_runner.py", line 519, in _process_trial
result = self.trial_executor.fetch_result(trial)
File "/home/jinprelude/anaconda3/envs/rllib/lib/python3.7/site-packages/ray/tune/ray_trial_executor.py", line 497, in fetch_result
result = ray.get(trial_future[0], timeout=DEFAULT_GET_TIMEOUT)
File "/home/jinprelude/anaconda3/envs/rllib/lib/python3.7/site-packages/ray/_private/client_mode_hook.py", line 47, in wrapper
return func(*args, **kwargs)
File "/home/jinprelude/anaconda3/envs/rllib/lib/python3.7/site-packages/ray/worker.py", line 1391, in get
raise value.as_instanceof_cause()
ray.exceptions.RayTaskError(AttributeError): ray::PPO.train() (pid=24483, ip=192.168.0.176)
File "python/ray/_raylet.pyx", line 479, in ray._raylet.execute_task
File "python/ray/_raylet.pyx", line 431, in ray._raylet.execute_task.function_executor
File "/home/jinprelude/anaconda3/envs/rllib/lib/python3.7/site-packages/ray/rllib/agents/trainer.py", line 523, in train
raise e
File "/home/jinprelude/anaconda3/envs/rllib/lib/python3.7/site-packages/ray/rllib/agents/trainer.py", line 509, in train
result = Trainable.train(self)
File "/home/jinprelude/anaconda3/envs/rllib/lib/python3.7/site-packages/ray/tune/trainable.py", line 183, in train
result = self.step()
File "/home/jinprelude/anaconda3/envs/rllib/lib/python3.7/site-packages/ray/rllib/agents/trainer_template.py", line 148, in step
res = next(self.train_exec_impl)
File "/home/jinprelude/anaconda3/envs/rllib/lib/python3.7/site-packages/ray/util/iter.py", line 756, in __next__
return next(self.built_iterator)
File "/home/jinprelude/anaconda3/envs/rllib/lib/python3.7/site-packages/ray/util/iter.py", line 783, in apply_foreach
for item in it:
File "/home/jinprelude/anaconda3/envs/rllib/lib/python3.7/site-packages/ray/util/iter.py", line 783, in apply_foreach
for item in it:
File "/home/jinprelude/anaconda3/envs/rllib/lib/python3.7/site-packages/ray/util/iter.py", line 843, in apply_filter
for item in it:
File "/home/jinprelude/anaconda3/envs/rllib/lib/python3.7/site-packages/ray/util/iter.py", line 843, in apply_filter
for item in it:
File "/home/jinprelude/anaconda3/envs/rllib/lib/python3.7/site-packages/ray/util/iter.py", line 783, in apply_foreach
for item in it:
File "/home/jinprelude/anaconda3/envs/rllib/lib/python3.7/site-packages/ray/util/iter.py", line 783, in apply_foreach
for item in it:
File "/home/jinprelude/anaconda3/envs/rllib/lib/python3.7/site-packages/ray/util/iter.py", line 783, in apply_foreach
for item in it:
[Previous line repeated 1 more time]
File "/home/jinprelude/anaconda3/envs/rllib/lib/python3.7/site-packages/ray/util/iter.py", line 876, in apply_flatten
for item in it:
File "/home/jinprelude/anaconda3/envs/rllib/lib/python3.7/site-packages/ray/util/iter.py", line 828, in add_wait_hooks
item = next(it)
File "/home/jinprelude/anaconda3/envs/rllib/lib/python3.7/site-packages/ray/util/iter.py", line 783, in apply_foreach
for item in it:
File "/home/jinprelude/anaconda3/envs/rllib/lib/python3.7/site-packages/ray/util/iter.py", line 783, in apply_foreach
for item in it:
File "/home/jinprelude/anaconda3/envs/rllib/lib/python3.7/site-packages/ray/util/iter.py", line 783, in apply_foreach
for item in it:
File "/home/jinprelude/anaconda3/envs/rllib/lib/python3.7/site-packages/ray/rllib/execution/rollout_ops.py", line 69, in sampler
yield workers.local_worker().sample()
File "/home/jinprelude/anaconda3/envs/rllib/lib/python3.7/site-packages/ray/rllib/evaluation/rollout_worker.py", line 645, in sample
batches = [self.input_reader.next()]
File "/home/jinprelude/anaconda3/envs/rllib/lib/python3.7/site-packages/ray/rllib/evaluation/sampler.py", line 94, in next
batches = [self.get_data()]
File "/home/jinprelude/anaconda3/envs/rllib/lib/python3.7/site-packages/ray/rllib/evaluation/sampler.py", line 216, in get_data
item = next(self.rollout_provider)
File "/home/jinprelude/anaconda3/envs/rllib/lib/python3.7/site-packages/ray/rllib/evaluation/sampler.py", line 663, in _env_runner
base_env.send_actions(actions_to_send)
File "/home/jinprelude/anaconda3/envs/rllib/lib/python3.7/site-packages/ray/rllib/env/base_env.py", line 399, in send_actions
obs, rewards, dones, infos = env.step(agent_dict)
File "/home/jinprelude/anaconda3/envs/rllib/lib/python3.7/site-packages/ray/rllib/env/unity3d_env.py", line 129, in step
action_dict[key])
File "/home/jinprelude/anaconda3/envs/rllib/lib/python3.7/site-packages/mlagents_envs/environment.py", line 356, in set_action_for_agent
action = action_spec._validate_action(action, None, behavior_name)
File "/home/jinprelude/anaconda3/envs/rllib/lib/python3.7/site-packages/mlagents_envs/base_env.py", line 404, in _validate_action
if actions.continuous.shape != _expected_shape:
AttributeError: 'numpy.ndarray' object has no attribute 'continuous'
== Status ==
Memory usage on this node: 8.5/62.8 GiB
Using FIFO scheduling algorithm.
Resources requested: 0/8 CPUs, 0/1 GPUs, 0.0/33.59 GiB heap, 0.0/11.57 GiB objects (0/1.0 accelerator_type:GTX)
Result logdir: /home/jinprelude/ray_results/PPO
Number of trials: 1/1 (1 ERROR)
Number of errored trials: 1
+-------------------------+--------------+------------------------------------------------------------------------------------------+
| Trial name | # failures | error file |
|-------------------------+--------------+------------------------------------------------------------------------------------------|
| PPO_unity3d_dd8d1_00000 | 1 | /home/jinprelude/ray_results/PPO/PPO_unity3d_dd8d1_00000_0_2020-12-28_15-56-52/error.txt |
+-------------------------+--------------+------------------------------------------------------------------------------------------+
Thanks for your hard work!
I have the same issue as @jinPrelude when trying to run 3DBall and Tennis.
Ok, I can reproduce the error on the Ray RLlib side now (yes, it's a change in the ML-Agents API for set_action_for_agent calls). I'll provide a fix in RLlib.
@budbreaker @jinPrelude
Fix for RLlib (will check ML-Agents API version; backward-compatible): https://github.com/ray-project/ray/pull/14569
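Not the actual PR code, but roughly the kind of backward-compatible shim such a fix needs: only wrap the raw numpy action in an ActionTuple when the installed mlagents_envs exposes that class. The helper name and the continuous-only assumption (which matches 3DBall) are illustrative.
import numpy as np

def send_action(unity_env, behavior_name, agent_id, action):
    # Hypothetical helper for illustration; the real change lives in RLlib's
    # Unity3DEnv wrapper (see the PR linked above).
    try:
        from mlagents_envs.base_env import ActionTuple  # newer ML-Agents API
        continuous = np.asarray(action, dtype=np.float32).reshape(1, -1)
        action = ActionTuple(continuous=continuous)
    except ImportError:
        pass  # older releases accept the bare numpy array
    unity_env.set_action_for_agent(behavior_name, agent_id, action)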
Thank you SO MUCH for your hard work, @sven1977. You and RLlib are awesome.
This issue has been automatically marked as stale because it has not had activity in the last 28 days. It will be closed in the next 14 days if no further activity occurs. Thank you for your contributions.
This issue has been automatically closed because it has not had activity in the last 42 days. If this issue is still valid, please ping a maintainer. Thank you for your contributions.
This thread has been automatically locked since there has not been any recent activity after it was closed. Please open a new issue for related bugs.
Describe the bug
When I call env.set_action_for_agent(behavior_name, mlagent_id, action), where action is of type numpy.ndarray, I get this AttributeError, which results from base_env._validate_action. It looks as though action should be converted to a base_env.ActionTuple at some point, yet it isn't. Rolling back to release 7 fixes the problem.
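As a workaround without rolling back, the array can be wrapped manually before the call. A minimal sketch, assuming a continuous action space and placeholder names behavior_name and agent_id:
import numpy as np
from mlagents_envs.base_env import ActionTuple

raw_action = np.array([[0.5, -0.2]], dtype=np.float32)  # shape (1, continuous_action_size)
env.set_action_for_agent(behavior_name, agent_id, ActionTuple(continuous=raw_action))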
To Reproduce
Call env.set_action_for_agent with the appropriate parameters (the behavior name, the agent id, and the action in the form of a numpy.ndarray).
NOTE: I do not currently have time to try this in an example environment. If/when I have time, I will do so, but it seems pretty obvious from the code what's happening and that it would happen in all environments.