hubbs5 / or-gym

Environments for OR and RL Research
MIT License

ValueError: Invalid dtype tf.int16 #6

Closed bazok100 closed 3 years ago

bazok100 commented 3 years ago

Hi, I am trying to reproduce the work @ https://www.datahubbs.com/how-to-use-deep-reinforcement-learning-to-improve-your-supply-chain/ but I keep running into ValueError: Invalid dtype tf.int16 on line agent = agents.ppo.PPOTrainer(env=env_name, config=rl_config). Appreciate any help on resolving this.
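For reference, the failing call comes from a setup roughly like the one below. This is a minimal sketch rather than my exact script; the environment name 'InvManagement-v1', the rl_config contents, and the use of ray.tune.register_env are placeholders for whatever the tutorial or your own notebook actually uses.

    import ray
    from ray import tune
    from ray.rllib import agents
    import or_gym

    # Placeholder environment name from or-gym's inventory management suite.
    env_name = 'InvManagement-v1'

    # Make the or-gym environment visible to RLlib under env_name.
    tune.register_env(env_name, lambda env_config: or_gym.make(env_name))

    # Placeholder config; the real rl_config comes from the tutorial.
    rl_config = {
        'env_config': {},
        'num_workers': 2,
    }

    ray.init(ignore_reinit_error=True)
    # This is the call that raises ValueError: Invalid dtype tf.int16.
    agent = agents.ppo.PPOTrainer(env=env_name, config=rl_config)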

Thanks

hdavid16 commented 3 years ago

Hi @bazok100. Thanks for reaching out and reporting your issue. Can you provide details on the operating system, python version, and relevant package versions?
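If it helps, a quick way to gather most of this from the environment where it fails (or-gym itself can be checked with pip show or-gym):

    import platform
    import sys

    import ray
    import tensorflow as tf

    print("OS:", platform.platform())
    print("Python:", sys.version.split()[0])
    print("ray:", ray.__version__)
    print("tensorflow:", tf.__version__)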

FelipeMaldonado commented 3 years ago

Dear David, I recently tried the same and got a similar error (I tried it both on Linux Manjaro GNOME, latest version, and on macOS Big Sur). Find below the full log of my attempt (apparently the error is related to Ray and the latest version of TensorFlow).


WARNING:tensorflow:From ~/anaconda3/envs/experiments/lib/python3.8/site-packages/tensorflow/python/compat/v2_compat.py:96: disable_resource_variables (from tensorflow.python.ops.variable_scope) is deprecated and will be removed in a future version.
Instructions for updating:
non-resource variables are not supported in the long term
2021-01-25 00:21:52,933 INFO services.py:1171 -- View the Ray dashboard at http://127.0.0.1:8265
2021-01-25 00:21:54,553 INFO trainer.py:591 -- Tip: set framework=tfe or the --eager flag to enable TensorFlow eager execution
2021-01-25 00:21:54,553 INFO trainer.py:616 -- Current log_level is WARN. For more information, set 'log_level': 'INFO' / 'DEBUG' or use the -v and -vv flags.
(pid=273617) WARNING:tensorflow:From ~/anaconda3/envs/experiments/lib/python3.8/site-packages/tensorflow/python/compat/v2_compat.py:96: disable_resource_variables (from tensorflow.python.ops.variable_scope) is deprecated and will be removed in a future version.
(pid=273617) Instructions for updating:
(pid=273617) non-resource variables are not supported in the long term
(pid=273622) WARNING:tensorflow:From ~/anaconda3/envs/experiments/lib/python3.8/site-packages/tensorflow/python/compat/v2_compat.py:96: disable_resource_variables (from tensorflow.python.ops.variable_scope) is deprecated and will be removed in a future version.
(pid=273622) Instructions for updating:
(pid=273622) non-resource variables are not supported in the long term
Traceback (most recent call last):
  File "~/github_things/or-gym/experiments/logistics.py", line 30, in <module>
    agent = agents.ppo.PPOTrainer(env=env_name,
  File "~/anaconda3/envs/experiments/lib/python3.8/site-packages/ray/rllib/agents/trainer_template.py", line 106, in __init__
    Trainer.__init__(self, config, env, logger_creator)
  File "~/anaconda3/envs/experiments/lib/python3.8/site-packages/ray/rllib/agents/trainer.py", line 465, in __init__
    super().__init__(config, logger_creator)
  File "~/anaconda3/envs/experiments/lib/python3.8/site-packages/ray/tune/trainable.py", line 96, in __init__
    self.setup(copy.deepcopy(self.config))
  File "~/anaconda3/envs/experiments/lib/python3.8/site-packages/ray/rllib/agents/trainer.py", line 629, in setup
    self._init(self.config, self.env_creator)
  File "~/anaconda3/envs/experiments/lib/python3.8/site-packages/ray/rllib/agents/trainer_template.py", line 133, in _init
    self.workers = self._make_workers(
  File "~/anaconda3/envs/experiments/lib/python3.8/site-packages/ray/rllib/agents/trainer.py", line 700, in _make_workers
    return WorkerSet(
  File "~/anaconda3/envs/experiments/lib/python3.8/site-packages/ray/rllib/evaluation/worker_set.py", line 79, in __init__
    remote_spaces = ray.get(self.remote_workers(
  File "~/anaconda3/envs/experiments/lib/python3.8/site-packages/ray/worker.py", line 1379, in get
    raise value.as_instanceof_cause()
ray.exceptions.RayTaskError(ValueError): ray::RolloutWorker.foreach_policy() (pid=273622, ip=192.168.179.22)
  File "python/ray/_raylet.pyx", line 422, in ray._raylet.execute_task
  File "python/ray/_raylet.pyx", line 456, in ray._raylet.execute_task
  File "python/ray/_raylet.pyx", line 459, in ray._raylet.execute_task
  File "python/ray/_raylet.pyx", line 463, in ray._raylet.execute_task
  File "python/ray/_raylet.pyx", line 415, in ray._raylet.execute_task.function_executor
  File "~/anaconda3/envs/experiments/lib/python3.8/site-packages/ray/rllib/evaluation/rollout_worker.py", line 460, in __init__
    self._build_policy_map(policy_dict, policy_config)
  File "~/anaconda3/envs/experiments/lib/python3.8/site-packages/ray/rllib/evaluation/rollout_worker.py", line 1076, in _build_policy_map
    policy_map[name] = cls(obs_space, act_space, merged_conf)
  File "~/anaconda3/envs/experiments/lib/python3.8/site-packages/ray/rllib/policy/tf_policy_template.py", line 217, in __init__
    DynamicTFPolicy.__init__(
  File "~/anaconda3/envs/experiments/lib/python3.8/site-packages/ray/rllib/policy/dynamic_tf_policy.py", line 289, in __init__
    self.exploration.get_exploration_action(
  File "~/anaconda3/envs/experiments/lib/python3.8/site-packages/ray/rllib/utils/exploration/stochastic_sampling.py", line 72, in get_exploration_action
    return self._get_tf_exploration_action_op(action_distribution,
  File "~/anaconda3/envs/experiments/lib/python3.8/site-packages/ray/rllib/utils/exploration/stochastic_sampling.py", line 78, in _get_tf_exploration_action_op
    stochastic_actions = tf.cond(
  File "~/anaconda3/envs/experiments/lib/python3.8/site-packages/tensorflow/python/util/dispatch.py", line 201, in wrapper
    return target(*args, **kwargs)
  File "~/anaconda3/envs/experiments/lib/python3.8/site-packages/tensorflow/python/ops/control_flow_ops.py", line 1396, in cond_for_tf_v2
    return cond(pred, true_fn=true_fn, false_fn=false_fn, strict=True, name=name)
  File "~/anaconda3/envs/experiments/lib/python3.8/site-packages/tensorflow/python/util/dispatch.py", line 201, in wrapper
    return target(*args, **kwargs)
  File "~/anaconda3/envs/experiments/lib/python3.8/site-packages/tensorflow/python/util/deprecation.py", line 507, in new_func
    return func(*args, **kwargs)
  File "~/anaconda3/envs/experiments/lib/python3.8/site-packages/tensorflow/python/ops/control_flow_ops.py", line 1230, in cond
    orig_res_t, res_t = context_t.BuildCondBranch(true_fn)
  File "~/anaconda3/envs/experiments/lib/python3.8/site-packages/tensorflow/python/ops/control_flow_ops.py", line 1066, in BuildCondBranch
    original_result = fn()
  File "~/anaconda3/envs/experiments/lib/python3.8/site-packages/ray/rllib/utils/exploration/stochastic_sampling.py", line 81, in <lambda>
    self.random_exploration.get_tf_exploration_action_op(
  File "~/anaconda3/envs/experiments/lib/python3.8/site-packages/ray/rllib/utils/exploration/random.py", line 118, in get_tf_exploration_action_op
    action = tf.cond(
  File "~/anaconda3/envs/experiments/lib/python3.8/site-packages/tensorflow/python/util/dispatch.py", line 201, in wrapper
    return target(*args, **kwargs)
  File "~/anaconda3/envs/experiments/lib/python3.8/site-packages/tensorflow/python/ops/control_flow_ops.py", line 1396, in cond_for_tf_v2
    return cond(pred, true_fn=true_fn, false_fn=false_fn, strict=True, name=name)
  File "~/anaconda3/envs/experiments/lib/python3.8/site-packages/tensorflow/python/util/dispatch.py", line 201, in wrapper
    return target(*args, **kwargs)
  File "~/anaconda3/envs/experiments/lib/python3.8/site-packages/tensorflow/python/util/deprecation.py", line 507, in new_func
    return func(*args, **kwargs)
  File "~/anaconda3/envs/experiments/lib/python3.8/site-packages/tensorflow/python/ops/control_flow_ops.py", line 1230, in cond
    orig_res_t, res_t = context_t.BuildCondBranch(true_fn)
  File "~/anaconda3/envs/experiments/lib/python3.8/site-packages/tensorflow/python/ops/control_flow_ops.py", line 1066, in BuildCondBranch
    original_result = fn()
  File "~/anaconda3/envs/experiments/lib/python3.8/site-packages/ray/rllib/utils/exploration/random.py", line 111, in true_fn
    actions = tree.map_structure(random_component,
  File "~/anaconda3/envs/experiments/lib/python3.8/site-packages/tree/__init__.py", line 516, in map_structure
    [func(*args) for args in zip(*map(flatten, structures))])
  File "~/anaconda3/envs/experiments/lib/python3.8/site-packages/tree/__init__.py", line 516, in <listcomp>
    [func(*args) for args in zip(*map(flatten, structures))])
  File "~/anaconda3/envs/experiments/lib/python3.8/site-packages/ray/rllib/utils/exploration/random.py", line 91, in random_component
    return tf.random.uniform(
  File "~/anaconda3/envs/experiments/lib/python3.8/site-packages/tensorflow/python/util/dispatch.py", line 201, in wrapper
    return target(*args, **kwargs)
  File "~/anaconda3/envs/experiments/lib/python3.8/site-packages/tensorflow/python/ops/random_ops.py", line 282, in random_uniform
    raise ValueError("Invalid dtype %r" % dtype)
ValueError: Invalid dtype tf.int16
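For what it's worth, the last frame points at the immediate trigger: tf.random.uniform only accepts a restricted set of dtypes (float16/bfloat16/float32/float64/int32/int64), so RLlib's random-exploration sampling fails for an action space declared with an int16 dtype. A minimal reproduction of just that error:

    import tensorflow as tf

    # tf.random.uniform rejects int16, which is exactly the error in the traceback above.
    try:
        tf.random.uniform(shape=(3,), minval=0, maxval=10, dtype=tf.int16)
    except ValueError as err:
        print(err)  # -> Invalid dtype tf.int16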

hdavid16 commented 3 years ago

@hubbs5 can you check this issue?

hubbs5 commented 3 years ago

What version of Ray and Tensorflow are you using? I ran it with Ray 1.0.0 and Tensorflow 2.3.1 on Ubuntu 16.04 without any issues.

As @FelipeMaldonado pointed out, this seems to be stemming from some Ray and Tensorflow compatibility issue.

FelipeMaldonado commented 3 years ago

Apparently that was the problem. I downgraded the version of Ray to 1.0.0 and it works fine now (maybe you should mention this in the README). Cheers, Felipe

hdavid16 commented 3 years ago

Glad it worked. I've added the Ray version dependency to the README.
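For anyone landing on this later, a small guard at the top of the script can catch the mismatch early. The pinned version below is just the combination reported in this thread (Ray 1.0.0 with TensorFlow 2.3.1), not a hard requirement:

    from distutils.version import LooseVersion

    import ray

    # The thread reports Ray 1.0.0 (with TensorFlow 2.3.1) as a working combination;
    # a newer Ray release was hitting "ValueError: Invalid dtype tf.int16" here.
    if LooseVersion(ray.__version__) != LooseVersion("1.0.0"):
        raise RuntimeError(
            "Verified with ray==1.0.0, found %s; "
            "consider: pip install 'ray[rllib]==1.0.0'" % ray.__version__)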