Closed Eisfoehniks closed 3 years ago
Thanks for reporting! This looks like a bug. Let me try rolling out a fix and update here.
Please let me know if the issue is fixed for you. Thanks!
I just tested the fix and it works now. Thank you for your effort!
Hi,
I'm currently training a REINFORCE agent using a value network as a baseline. I'm trying to get separate loss values for the actor network and the value network so I can evaluate their stability separately. However, the tensor returned by tf_agent.train(experience).extra seems to contain only the variable names as strings. I attached an example based on the REINFORCE tutorial below.
The example has the following output for me:
ReinforceAgentLossInfo(
    policy_gradient_loss=<tf.Tensor: shape=(), dtype=string, numpy=b'policy_gradient_loss'>,
    policy_network_regularization_loss=<tf.Tensor: shape=(), dtype=string, numpy=b'policy_network_regularization_loss'>,
    entropy_regularization_loss=<tf.Tensor: shape=(), dtype=string, numpy=b'entropy_regularization_loss'>,
    value_estimation_loss=<tf.Tensor: shape=(), dtype=string, numpy=b'value_estimation_loss'>,
    value_network_regularization_loss=<tf.Tensor: shape=(), dtype=string, numpy=b'value_network_regularization_loss'>)
How do I get actual loss values?
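For reference, the expected behavior is that each field of the LossInfo named tuple in `.extra` holds a scalar loss value, not the field's name as a string. Below is a minimal, hedged sketch of how the per-network losses would then be read off; `ReinforceAgentLossInfo` here is a hypothetical stand-in namedtuple with only two of the fields, and plain floats stand in for the scalar tensors:

```python
from collections import namedtuple

# Hypothetical stand-in for the LossInfo named tuple that
# tf_agent.train(experience).extra returns (only two fields shown).
ReinforceAgentLossInfo = namedtuple(
    "ReinforceAgentLossInfo",
    ["policy_gradient_loss", "value_estimation_loss"],
)

# With the bug fixed, each field should carry a scalar loss value
# (a scalar tf.Tensor in the real API; plain floats used here).
extra = ReinforceAgentLossInfo(
    policy_gradient_loss=0.42,
    value_estimation_loss=1.30,
)

# The actor and value-network losses can then be tracked separately:
actor_loss = float(extra.policy_gradient_loss)
critic_loss = float(extra.value_estimation_loss)
print(actor_loss, critic_loss)
```

With the real API, `float(...)` would be replaced by calling `.numpy()` on the scalar tensors (in eager mode) before logging them.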