Closed efang96 closed 5 years ago
@efang96 can you post a reproduction script that can be run standalone without FluidsEnv?
Btw, it might be normal to see a requested output of 1 when GAE is enabled, since PPO will attempt to construct a value function with the same model configuration but with 1 output.
This shouldn't be raising an error though. Can you post the full stack trace? I can't find the error you mentioned.
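To illustrate why num_outputs can alternate between the action dimension and 1, here is a minimal sketch. It is not RLlib code: build_layers is a hypothetical stand-in for a custom model's _build_layers, and the only thing assumed is what is described above, namely that the same model configuration is built once for the policy and once more with a single output for the value function.

```python
# Hypothetical stand-in for a custom model's _build_layers; not RLlib code.
calls = []

def build_layers(num_outputs):
    """Record and print the requested output size, then return a dummy output."""
    calls.append(num_outputs)
    print("num_outputs:", num_outputs)
    return [0.0] * num_outputs

action_dim = 2  # assumed size of the policy output, as printed in this thread

policy_out = build_layers(action_dim)  # first build: policy head
value_out = build_layers(1)            # second build: value-function head
```

Under that assumption, seeing both values in the console is expected and is not by itself a sign of a misconfigured action space.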
Thanks Eric, that makes sense. Unfortunately it does still error out though. I will work on a reproduction script without the FluidsEnv and post it later today.
The stack trace below was run with A3C, but the error is the same. The print statements for action_dim, action_space, and num_outputs
all print 2 (expected and correct). I skipped the other workers for brevity. Please let me know if there's anything else you need! Thanks again.
(fluids) [edward.fang@steropes:/data/efang/low-res-planning/Fluids-v0/qlidar]$ python train_rl.py
pygame 1.9.4
Hello from the pygame community. https://www.pygame.org/contribute.html
Process STDOUT and STDERR is being redirected to /tmp/ray/session_2018-10-21_23-49-13_21607/logs.
Waiting for redis server at 127.0.0.1:37210 to respond...
Waiting for redis server at 127.0.0.1:11471 to respond...
Starting the Plasma object store with 108.15 GB memory.
======================================================================
View the web UI at http://localhost:8893/notebooks/ray_ui.ipynb?token=aa910d58cd59de3b4431cc6189086b066fd35322b3178332
======================================================================
== Status ==
Using FIFO scheduling algorithm.
Created LogSyncer for /home/eecs/edward.fang/ray_results/fluids_qlidar/A3C_FluidsQLidarEnv_0_2018-10-21_23-49-141tcmw8oh ->
== Status ==
Using FIFO scheduling algorithm.
Resources requested: 3/24 CPUs, 0/8 GPUs
Result logdir: /home/eecs/edward.fang/ray_results/fluids_qlidar
RUNNING trials:
- A3C_FluidsQLidarEnv_0: RUNNING
pygame 1.9.4
Hello from the pygame community. https://www.pygame.org/contribute.html
action_dim: 2
action_space: Box(1, 2)
2018-10-21 23:49:57.199615: I tensorflow/core/platform/cpu_feature_guard.cc:141] Your CPU supports instructions that this TensorFlow binary was not compiled to use: AVX2 FMA
2018-10-21 23:49:57.348582: E tensorflow/stream_executor/cuda/cuda_driver.cc:300] failed call to cuInit: CUDA_ERROR_NO_DEVICE: no CUDA-capable device is detected
2018-10-21 23:49:57.348636: I tensorflow/stream_executor/cuda/cuda_diagnostics.cc:163] retrieving CUDA diagnostic information for host: steropes
2018-10-21 23:49:57.348650: I tensorflow/stream_executor/cuda/cuda_diagnostics.cc:170] hostname: steropes
2018-10-21 23:49:57.348696: I tensorflow/stream_executor/cuda/cuda_diagnostics.cc:194] libcuda reported version is: 410.48.0
2018-10-21 23:49:57.348742: I tensorflow/stream_executor/cuda/cuda_diagnostics.cc:198] kernel reported version is: 410.48.0
2018-10-21 23:49:57.348754: I tensorflow/stream_executor/cuda/cuda_diagnostics.cc:305] kernel version seems to match DSO: 410.48.0
Using custom model LidarConv
num_outputs: 2
[FLUIDS] Loading layout: fluids_state_city
[FLUIDS] Cached layout found
[FLUIDS] Creating objects
[FLUIDS] Generating trajectory map
[FLUIDS] Generating cars
[FLUIDS] Generating peds
[FLUIDS] State creation complete
*** WARNING ***: no episode horizon specified, assuming inf
Error fetching: [<tf.Tensor 'default/add_4:0' shape=(?, 1) dtype=float32>, {'vf_preds': <tf.Tensor 'default/Reshape:0' shape=(?,) dtype=float32>}], feed_dict={<tf.Tensor 'default/Placeholder:0' shape=(?, 1, 17) dtype=float32>: [array([[200. , 165.20196824, 77.59682208, 56.30913861,
50.18209912, 52.46046702, 65.38294434, 108.71226748,
200. , 200. , 77.59682208, 56.30913861,
50.18209912, 52.46046702, 65.38294434, 108.71226748,
200. ]])], <tf.Tensor 'default/action:0' shape=(?, 1) dtype=float32>: [array([[0., 0.]], dtype=float32)], <tf.Tensor 'default/PlaceholderWithDefault:0' shape=() dtype=bool>: True, <tf.Tensor 'default/prev_reward:0' shape=(?,) dtype=float32>: [0.0]}
Exception in thread Thread-1:
Traceback (most recent call last):
File "/data/efang/anaconda3/envs/fluids/lib/python3.5/threading.py", line 914, in _bootstrap_inner
self.run()
File "/data/efang/anaconda3/envs/fluids/lib/python3.5/site-packages/ray/rllib/evaluation/sampler.py", line 125, in run
raise e
File "/data/efang/anaconda3/envs/fluids/lib/python3.5/site-packages/ray/rllib/evaluation/sampler.py", line 122, in run
self._run()
File "/data/efang/anaconda3/envs/fluids/lib/python3.5/site-packages/ray/rllib/evaluation/sampler.py", line 136, in _run
item = next(rollout_provider)
File "/data/efang/anaconda3/envs/fluids/lib/python3.5/site-packages/ray/rllib/evaluation/sampler.py", line 378, in _env_runner
eval_results[k] = builder.get(v)
File "/data/efang/anaconda3/envs/fluids/lib/python3.5/site-packages/ray/rllib/utils/tf_run_builder.py", line 48, in get
raise e
File "/data/efang/anaconda3/envs/fluids/lib/python3.5/site-packages/ray/rllib/utils/tf_run_builder.py", line 44, in get
self.feed_dict, os.environ.get("TF_TIMELINE_DIR"))
File "/data/efang/anaconda3/envs/fluids/lib/python3.5/site-packages/ray/rllib/utils/tf_run_builder.py", line 83, in run_timeline
fetches = sess.run(ops, feed_dict=feed_dict)
File "/data/efang/anaconda3/envs/fluids/lib/python3.5/site-packages/tensorflow/python/client/session.py", line 887, in run
run_metadata_ptr)
File "/data/efang/anaconda3/envs/fluids/lib/python3.5/site-packages/tensorflow/python/client/session.py", line 1086, in _run
str(subfeed_t.get_shape())))
ValueError: Cannot feed value of shape (1, 1, 2) for Tensor 'default/action:0', which has shape '(?, 1)'
Worker ip unknown, skipping log sync for /home/eecs/edward.fang/ray_results/fluids_qlidar/A3C_FluidsQLidarEnv_0_2018-10-21_23-49-141tcmw8oh
== Status ==
Using FIFO scheduling algorithm.
Resources requested: 0/24 CPUs, 0/8 GPUs
Result logdir: /home/eecs/edward.fang/ray_results/fluids_qlidar
ERROR trials:
- A3C_FluidsQLidarEnv_0: ERROR, 1 failures: /home/eecs/edward.fang/ray_results/fluids_qlidar/A3C_FluidsQLidarEnv_0_2018-10-21_23-49-141tcmw8oh/error_2018-10-21_23-50-32.txt
== Status ==
Using FIFO scheduling algorithm.
Resources requested: 0/24 CPUs, 0/8 GPUs
Result logdir: /home/eecs/edward.fang/ray_results/fluids_qlidar
ERROR trials:
- A3C_FluidsQLidarEnv_0: ERROR, 1 failures: /home/eecs/edward.fang/ray_results/fluids_qlidar/A3C_FluidsQLidarEnv_0_2018-10-21_23-49-141tcmw8oh/error_2018-10-21_23-50-32.txt
Traceback (most recent call last):
File "train_rl.py", line 52, in <module>
tune.run_experiments(experiment_spec)
File "/data/efang/anaconda3/envs/fluids/lib/python3.5/site-packages/ray/tune/tune.py", line 124, in run_experiments
raise TuneError("Trials did not complete", errored_trials)
ray.tune.error.TuneError: ('Trials did not complete', [A3C_FluidsQLidarEnv_0])
Ok I've reconstructed this without fluids.
import ray
import argparse
import os
import numpy as np
import gym
import tensorflow as tf
from ray import tune
from model import registry
from ray.tune.registry import register_env
from tensorflow import layers
from ray.rllib.models import ModelCatalog, Model
from ray.rllib.models.misc import flatten, normc_initializer


class LidarConv(Model):
    def _build_layers(self, inputs, num_outputs, options):
        print("num_outputs: ", num_outputs)
        with tf.name_scope("1DConv"):
            last_layer = tf.transpose(inputs, [0, 2, 1])
            last_layer = tf.layers.conv1d(last_layer, 8, 2, activation=tf.nn.relu, name="conv1d_1")
            last_layer = tf.layers.conv1d(last_layer, 16, 2, activation=tf.nn.relu, name="conv1d_2")
            last_layer = flatten(last_layer)
            last_layer = tf.layers.dense(last_layer, 64, activation=tf.nn.relu, name="dense1")
            last_layer = tf.layers.dense(last_layer, 64, activation=tf.nn.relu, name="dense2")
            output = tf.layers.dense(last_layer, num_outputs, activation=None, name="dense_output")
        return output, last_layer


ModelCatalog.register_custom_model("LidarConv", LidarConv)


class CustomEnv(gym.Env):
    def __init__(self):
        self.observation_space = gym.spaces.Box(
            low=0.0,
            high=1.0,
            shape=(1, 17),
            dtype=np.float32)
        self.action_space = gym.spaces.Box(
            low=-1.0,
            high=+1.0,
            shape=(1, 2),
            dtype=np.float32)

    def reset(self):
        return np.zeros((1,17))

    def step(self, action):
        return np.zeros((1,17)), 0, False, {}


def env_creator(env_config):
    return CustomEnv()


if __name__ == "__main__":
    ray.init(use_raylet=True, redis_password=os.urandom(128).hex())
    parser = argparse.ArgumentParser()
    parser.add_argument("--checkpoint", type=str, help="Path to checkpoint")
    args = parser.parse_args()
    checkpoint = args.checkpoint
    register_env("CustomEnv", env_creator)
    experiment_spec = {
        "custom_env": {
            "run": "A3C",
            "env": "CustomEnv",
            "restore": checkpoint,
            "config": {
                "model": {
                    "custom_model": "LidarConv",
                },
            },
            # "trial_resources":{
            #     "cpu": 10,
            #     "gpu": 1,
            # },
            "checkpoint_freq": 10,
        },
    }
    tune.run_experiments(experiment_spec)
shape=(1, 2)
Is this intentional? Note that you should have a shape of (2,) if you want two actions; (1, 2) just adds an empty dimension.
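The effect of the extra dimension can be sketched with plain shape arithmetic. This is a simplified illustration, not TensorFlow's actual check: batched and can_feed are hypothetical helpers, the (None, 1) placeholder shape is taken from the 'default/action:0' tensor in the log above, and the (None, 2) placeholder for the flat case is an assumption about what a shape of (2,) would produce.

```python
def batched(shape, batch_size=1):
    """Shape of a batch holding `batch_size` values that each have `shape`."""
    return (batch_size,) + tuple(shape)

def can_feed(value_shape, placeholder_shape):
    """Simplified compatibility check: ranks must match, None matches any size."""
    if len(value_shape) != len(placeholder_shape):
        return False
    return all(p is None or p == v
               for v, p in zip(value_shape, placeholder_shape))

# Action space shape (1, 2): one batched action has rank 3.
nested = batched((1, 2))
# Action space shape (2,): one batched action has rank 2.
flat = batched((2,))

logged_placeholder = (None, 1)   # shape (?, 1) reported in the error message
assumed_placeholder = (None, 2)  # assumption: what a (2,) action space would give

print(nested, can_feed(nested, logged_placeholder))
print(flat, can_feed(flat, assumed_placeholder))
```

The nested case reproduces the mismatch reported in the traceback, where a value of shape (1, 1, 2) is fed to a tensor of shape (?, 1).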
Yeah it's intentional, in some cases we will have multiple agents so the shape will be (N, 2). Right now it's just hardcoded for 1 agent.
I see, did this use to work?
Actually, this won't really work out due to complications with the action distribution. There is no semantic meaning given to those extra dimensions, so you'd be better off using a Tuple action space in this case (or using the full-blown multi-agent API).
This patch adds a better error message suggesting this: https://github.com/ray-project/ray/pull/3119/files
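As a rough sketch of the Tuple suggestion, the (N, 2) Box could be replaced by a Tuple of N Box spaces of shape (2,), one per agent. NUM_AGENTS and make_action_space are hypothetical names introduced here for illustration, and the fallback import assumes gymnasium exposes the same spaces API as the gym package used in this thread.

```python
try:
    import gym  # package used in this thread
except ImportError:
    import gymnasium as gym  # assumed drop-in for the spaces used below

import numpy as np

NUM_AGENTS = 3  # hypothetical number of agents


def make_action_space(num_agents):
    """Build one 2-dimensional Box per agent and group them in a Tuple space."""
    per_agent = [
        gym.spaces.Box(low=-1.0, high=1.0, shape=(2,), dtype=np.float32)
        for _ in range(num_agents)
    ]
    return gym.spaces.Tuple(per_agent)


action_space = make_action_space(NUM_AGENTS)
```

With this layout each agent's action keeps its own explicit sub-space, so no meaning has to be inferred from an extra array dimension. The environment's step method would then receive a tuple of per-agent actions rather than a single (N, 2) array.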
Got it, thanks! I'll test this out and let you know.
System information
Describe the problem
Coming across an interesting bug when using ray.tune. I am defining a custom model which has a function def _build_layers(self, inputs, num_outputs, options). When printing out num_outputs to console, I am seeing that it is alternating between 2 (the correct, expected shape) and 1 (incorrect shape). I'm making sure my action_space is of shape 2. The actual error being thrown is expected tensor shape (?, 1) but got (1, 2).
Source code / logs
I am running the following command: python train_rl.py. Below are the important files involved.
train_rl.py
custom_models.py