rll / rllab

rllab is a framework for developing and evaluating reinforcement learning algorithms, fully compatible with OpenAI Gym.

Can't find where tensor value gets set #231

Closed artcg closed 6 years ago

artcg commented 6 years ago

Hello,

I am looking at this variable in npo.py

    old_dist_info_vars = dict(
        (k, ext.new_tensor(
            u'old_%s' % k,
            ndim=2 + is_recurrent,
            dtype=theano.config.floatX
        ))
        for k in dist.dist_info_keys
    )

It is later used to compute the KL divergence during a policy update:

    dist_info_vars = self.policy.dist_info_sym(obs_var, state_info_vars)
    kl = dist.kl_sym(old_dist_info_vars, dist_info_vars)

However, I have looked all over and can't seem to find where the former tensor (e.g. 'old_prob' in the case of a categorical distribution) gets its value set.

If anyone more familiar with the codebase could point me to it, it would be greatly appreciated.

artcg commented 6 years ago

Never mind, I found it. The tensor is only a placeholder: it gets passed to the 'inputs' arg in update_policy in the optimizer, and its value is then supplied from optimize_policy in npo.py as part of the 'all_inputs_values' arg.
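
For anyone landing here with the same question, the mechanism described above can be sketched in plain Python. This is only a toy illustration and not rllab or Theano code: `Placeholder`, `compile_function`, and `categorical_kl` are hypothetical names made up for this example, and the new-policy probabilities are passed in directly for simplicity, whereas in npo.py they are computed symbolically from the observations. The point is that a symbolic input never has a value "set" on it; values are bound by position when the compiled function is called.

```python
import math


class Placeholder:
    """Hypothetical stand-in for a symbolic tensor: a name with no value."""

    def __init__(self, name):
        self.name = name


def compile_function(inputs, expr):
    """Toy 'compile' step (an assumption, not Theano's API).

    Returns a callable that binds positional values to `inputs`
    and then evaluates `expr` on a {name: value} dict.
    """
    names = [p.name for p in inputs]

    def run(*values):
        if len(values) != len(names):
            raise ValueError(
                "expected %d input values, got %d" % (len(names), len(values))
            )
        return expr(dict(zip(names, values)))

    return run


def categorical_kl(old_prob, new_prob):
    """KL(old || new) for a single categorical distribution."""
    return sum(p * math.log(p / q) for p, q in zip(old_prob, new_prob) if p > 0)


# Graph-building time: placeholders are declared and listed as inputs.
# Nothing holds a value yet.
old_prob_var = Placeholder("old_prob")
new_prob_var = Placeholder("prob")
kl_fn = compile_function(
    inputs=[old_prob_var, new_prob_var],
    expr=lambda env: categorical_kl(env["old_prob"], env["prob"]),
)

# Optimization time: values are passed in the same order as the inputs list.
all_input_values = ([0.5, 0.5], [0.25, 0.75])
kl_value = kl_fn(*all_input_values)
print(kl_value)
```

The order of the values has to match the order of the inputs list, which is why the placeholder list and the value tuple are built in the same key order.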