Closed: kk-55 closed this issue 3 years ago
This seems to be tf-version related. I can confirm the above for tf==2.4.1, but not for tf==2.0.x.
Hmm, I'm seeing a couple of bugs on our end. One also has to do with the "simple_optimizer" setting and is unrelated to this issue. Either way, the main problem is the tf version, which no longer seems to allow calling tf.enable_eager_execution() "in the middle" of a run; it has to happen before TensorFlow does anything else.
Quick workaround for now: Could you add this to the very top of your script?
from ray.rllib.utils.framework import try_import_tf
tf1, tf, tfv = try_import_tf()
tf1.enable_eager_execution()
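For context: try_import_tf() returns the tf.compat.v1 module as tf1, the tensorflow module itself as tf, and the major version as tfv, so the call above amounts to tf.compat.v1.enable_eager_execution(). If you want, a minimal sanity check (my own sketch, not part of RLlib) right after those three lines:

# Eager execution can only be switched on before TensorFlow builds any
# graphs or sessions, which is why this has to sit at the very top.
assert tf.executing_eagerly(), "enable_eager_execution() was called too late"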
Also, could you set the "simple_optimizer" arg (in the RLlib config) to True when using tf2? There is a validation bug that allows this to slip through the cracks; tf-eager should always use the "simple_optimizer" option automatically.
PR with a fix for the above issue:
Next Monday I'll be back at work and will check it out.
@sven1977 I can confirm that adding tf1.enable_eager_execution() to the very top of shared_weights_model.py (and also to the custom script I'm actually working on) has fixed the error. As far as I can see, the simple_optimizer arg has no impact anyway under my ray/RLlib release (version 2.0.0.dev0).
Btw: fixing the above bug led to another one (ValueError: Attempt to convert a value (RepeatedValues(...)) with an unsupported type (<class 'ray.rllib.models.repeated_values.RepeatedValues'>) to a Tensor.). In my use case, input['obs'] is a dict that also contains ray.rllib.models.repeated_values.RepeatedValues objects, and the function _convert_to_tf in ray.rllib.policy.eager_tf_policy.py can only handle RepeatedValues in an "outer structure", not in an "inner structure" like a dict.
As a first workaround, I fixed it this way:
x = tf.nest.map_structure(
    lambda f: _convert_to_tf(f, d) if isinstance(f, RepeatedValues)
    else tf.convert_to_tensor(f, d) if f is not None else None, x)
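To make the nested-dict case concrete, here is a small self-contained sketch of the idea (toy stand-ins for RepeatedValues and _convert_to_tf, not the actual RLlib code; the real classes live in the modules named above):

import numpy as np
import tensorflow as tf

class RepeatedValues:
    # Toy stand-in for ray.rllib.models.repeated_values.RepeatedValues.
    def __init__(self, values):
        self.values = values

def _convert_to_tf(x, d=None):
    # Toy version of the converter with the fix above applied.
    if isinstance(x, RepeatedValues):
        return RepeatedValues(_convert_to_tf(x.values, d))
    # tf.nest.map_structure recurses into dicts/lists/tuples, so a
    # RepeatedValues sitting inside obs (an "inner structure") now gets
    # caught by the isinstance check instead of hitting convert_to_tensor.
    return tf.nest.map_structure(
        lambda f: _convert_to_tf(f, d) if isinstance(f, RepeatedValues)
        else tf.convert_to_tensor(f, d) if f is not None else None, x)

obs = {
    "pos": np.array([1.0, 2.0], dtype=np.float32),
    "items": RepeatedValues(np.zeros((2, 3), dtype=np.float32)),
}
converted = _convert_to_tf(obs)
assert isinstance(converted["pos"], tf.Tensor)
assert isinstance(converted["items"].values, tf.Tensor)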
Awesome @kk-55 ! Thanks for the suggested fix for the map_structure problem in tf-eager. Will PR this now.
Popping in to say that this issue still persists. In my case, tf1.enable_eager_execution was being called in evaluation/rollout_worker.py despite using "tf2" as the framework. The workaround (putting tf1.enable_eager_execution() at the top of every file) fixed the issue, but it took a hot minute to find this thread.
Ray version: 1.11.0, Python version: 3.9.12, TF version: 2.7.0
What is the problem?
Ray version and other system information (Python version, TensorFlow version, OS):
Running the multi agent cartpole example, but w/o using tune, I get a ValueError. This occurs when I manually set up a PPO trainer and choose framework='tf2'. That is, TF2SharedWeightsModel is to be used for "variable sharing" between models/policies. By default, running it w/ tune works. What is the problem causing this value error?
Reproduction (REQUIRED)
Multi agent cartpole example (w/o tune but w/ framework='tf2' and trainer = PPOTrainer(config=config), result = trainer.train()):
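For completeness, a minimal sketch of that setup (assuming the Ray ~1.x agents API and the example-env path of that era; the actual example additionally registers TF2SharedWeightsModel as a custom model per policy, which is omitted here for brevity):

# Workaround from above: enable eager before anything else touches TF.
from ray.rllib.utils.framework import try_import_tf
tf1, tf, tfv = try_import_tf()
tf1.enable_eager_execution()

import ray
from ray.rllib.agents.ppo import PPOTrainer
from ray.rllib.examples.env.multi_agent import MultiAgentCartPole

ray.init()
config = {
    "env": MultiAgentCartPole,
    "env_config": {"num_agents": 2},
    "framework": "tf2",
    "simple_optimizer": True,  # see the discussion above
}
trainer = PPOTrainer(config=config)
result = trainer.train()
print(result["episode_reward_mean"])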