Features include:
- The `E` vector, or prior over policies, can be passed to the `Agent()` class constructor.
- The `E` vector stores probabilities. Fixed the `q_pi` (posterior over policies) calculation so that the log of `E` is stored in a new variable, `lnE`, before being used in the `q_pi` calculation: `q_pi = softmax(neg_efe * gamma - F + lnE)`.
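
A minimal sketch of the `q_pi` calculation described above, written in plain Python rather than taken from the library source. The helper names `softmax` and `policy_posterior`, the small `eps` added before taking the log, and the example values (including `gamma=16.0`) are assumptions made for illustration only.

```python
import math


def softmax(x):
    """Numerically stable softmax over a list of floats."""
    m = max(x)
    exps = [math.exp(v - m) for v in x]
    total = sum(exps)
    return [e / total for e in exps]


def policy_posterior(neg_efe, F, E, gamma=16.0, eps=1e-16):
    """Hypothetical helper: q_pi = softmax(neg_efe * gamma - F + lnE).

    E holds probabilities, so it is converted to log space (lnE) first.
    eps guards against log(0) and is an assumption of this sketch.
    """
    lnE = [math.log(p + eps) for p in E]
    scores = [g * gamma - f + l for g, f, l in zip(neg_efe, F, lnE)]
    return softmax(scores)


# Illustrative values: three policies with identical expected free energy
# (neg_efe) and variational free energy (F), so only the prior E differs.
neg_efe = [0.0, 0.0, 0.0]
F = [0.0, 0.0, 0.0]
E = [0.5, 0.25, 0.25]

q_pi = policy_posterior(neg_efe, F, E)
print(q_pi)
```

When `neg_efe` and `F` are the same for every policy, the posterior reduces to the prior `E`, which is a quick way to check that `E` enters the calculation in log space.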