This uses the same variables in both MLPs because, the first time the function is called (at construction), the tf.get_variable reuse semantics dictate that a variable that does not yet exist should be created. After the class is initialized, a different user who wants to evaluate the MLP inside the class at its current variable values, but on a different input tensor, would query via tf_output(their_input).
This is a lazy use of tf.AUTO_REUSE and should be replaced with the explicit reuse semantics that we expect.
A tf-using class (Actor, Critic, NNDynamicsMode) usually follows this pattern:
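A minimal sketch of that pattern, assuming a TF1-style graph-mode class (the class name `MLP`, the method name `tf_output`, and the single linear layer are illustrative, not the original code): the constructor builds once with `reuse=False`, and every later evaluation rebuilds the graph under `reuse=True`, so the variables must already exist.

```python
import tensorflow.compat.v1 as tf  # TF1-style graph-mode API

tf.disable_eager_execution()


class MLP:
    """Builds a network once, then re-evaluates it on new inputs via reuse=True."""

    def __init__(self, name, input_ph):
        self.name = name
        # First build: reuse=False (the default), so tf.get_variable CREATES
        # the variables. AUTO_REUSE would happen to do the same thing here,
        # but stating reuse=False documents the intent explicitly.
        with tf.variable_scope(self.name, reuse=False):
            self.output = self._build(input_ph)

    def _build(self, x):
        # One linear layer for brevity; a real Actor/Critic stacks more layers.
        w = tf.get_variable("w", shape=[x.shape[1], 1],
                            initializer=tf.glorot_uniform_initializer())
        b = tf.get_variable("b", shape=[1],
                            initializer=tf.zeros_initializer())
        return tf.matmul(x, w) + b

    def tf_output(self, their_input):
        # Later builds: reuse=True, so tf.get_variable fails loudly unless the
        # variables already exist -- exactly the semantics we expect.
        with tf.variable_scope(self.name, reuse=True):
            return self._build(their_input)
```

With `reuse=True`, a mismatched scope or variable name raises a ValueError instead of silently creating a second set of weights, which is precisely the failure mode that a blanket tf.AUTO_REUSE hides.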