RicardoDominguez / PyCREPS

Contextual Relative Entropy Policy Search for Reinforcement Learning in Python

Explore using shared variables with theano #16

Closed RicardoDominguez closed 5 years ago

RicardoDominguez commented 5 years ago

Store the final symbolic expression of the dual function in an object (as self.f). Also store R, F and eps in self.

At each policy update, call a function that updates R, F and eps (shared variables), so that compilation only takes place once. Compile the function on the first call, using None as a "not yet compiled" flag.
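A rough sketch of the control flow described above, for illustration only. It deliberately avoids Theano so that it runs anywhere: `Shared` is a hypothetical stand-in for a Theano shared variable (it only mimics `set_value` / `get_value`), and `_compile` stands in for the expensive `theano.function` call. The class name `DualFunction`, the counter `n_compilations` and the dual expression itself (a commonly cited contextual REPS form) are assumptions and may not match what the repository does.

```python
import numpy as np


class Shared:
    """Hypothetical stand-in for a Theano shared variable."""

    def __init__(self, value):
        self.value = np.asarray(value, dtype=float)

    def set_value(self, value):
        self.value = np.asarray(value, dtype=float)

    def get_value(self):
        return self.value


class DualFunction:
    def __init__(self):
        self.f = None  # None flag: nothing has been compiled yet
        self.R = self.F = self.eps = None
        self.n_compilations = 0  # only here to show compilation happens once

    def _compile(self):
        """Stand-in for the one-off compilation step."""
        self.n_compilations += 1
        R, F, eps = self.R, self.F, self.eps

        def f(eta, theta):
            # Values are read from the shared holders at call time,
            # so updating them does not require rebuilding f.
            r, phi = R.get_value(), F.get_value()
            adv = (r - phi @ theta) / eta
            m = adv.max()  # shift for a numerically stable log-mean-exp
            log_mean_exp = m + np.log(np.mean(np.exp(adv - m)))
            value = (eta * float(eps.get_value())
                     + phi.mean(axis=0) @ theta
                     + eta * log_mean_exp)
            return float(value)

        return f

    def update(self, R, F, eps):
        """Called at each policy update; compiles only on the first call."""
        if self.f is None:
            self.R, self.F, self.eps = Shared(R), Shared(F), Shared(eps)
            self.f = self._compile()
        else:
            self.R.set_value(R)
            self.F.set_value(F)
            self.eps.set_value(eps)
        return self.f


dual = DualFunction()
theta0 = np.zeros(1)

# First policy update: triggers the (stand-in) compilation.
f1 = dual.update(R=[0.0, 0.0], F=[[1.0], [1.0]], eps=1.0)
first_value = f1(1.0, theta0)

# Second policy update: only the shared values change.
f2 = dual.update(R=[1.0, 1.0], F=[[1.0], [1.0]], eps=0.5)
second_value = f2(2.0, theta0)
```

With real Theano the holders would presumably be created with `theano.shared(...)` and the closure replaced by a single `theano.function(...)` call over the symbolic dual; the None check and the `set_value` updates would stay the same.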

RicardoDominguez commented 5 years ago

As tested with commit 065113ae9678b1614bfb4c3b97d037c9e971c6b2, better performance is to be expected.