The EUPG algorithm has been updated to apply the discount factor (gamma) when computing accrued and future rewards. The current implementation, before this change, does not use gamma at all. In addition, the scalarization function in `eupg_fishwood.py` has been revised so that it handles both episodic rewards and the sum of accrued and future rewards.
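
To make the change described above concrete, here is a minimal sketch of how gamma could enter the accrued and future rewards, and how one scalarization function could serve both cases. It is not the code from this change: the names `discounted_accrued_and_future` and `scalarize` are hypothetical, the utility `min(fish, wood / 2)` is only an illustrative choice, and discounting both terms from the start of the episode is an assumption (discounting the future term from the current step is an equally plausible convention).

```python
import numpy as np


def discounted_accrued_and_future(rewards, gamma):
    """Split an episode's vector rewards into accrued and future parts.

    rewards: array-like of shape (T, d), one reward vector per step.
    Assumption: both parts are discounted from the start of the episode,
    so accrued[t] + future[t] equals the full discounted return for all t.
    """
    rewards = np.asarray(rewards, dtype=float)
    steps = rewards.shape[0]
    # gamma^t for every time step, broadcast over the objectives.
    discounts = gamma ** np.arange(steps)
    discounted = rewards * discounts[:, None]
    # accrued[t] = sum of discounted rewards strictly before step t.
    cumulative = np.cumsum(discounted, axis=0)
    accrued = np.vstack([np.zeros((1, rewards.shape[1])), cumulative[:-1]])
    # future[t] = sum of discounted rewards from step t onward.
    future = np.cumsum(discounted[::-1], axis=0)[::-1]
    return accrued, future


def scalarize(vector_return):
    """Hypothetical utility; accepts shape (d,) or (T, d).

    Indexing the last axis lets the same function scalarize a single
    episodic return and a per-step batch of accrued + future returns.
    """
    vector_return = np.asarray(vector_return, dtype=float)
    return np.minimum(vector_return[..., 0], vector_return[..., 1] / 2.0)


if __name__ == "__main__":
    episode = [[1.0, 0.0], [0.0, 2.0], [1.0, 2.0]]
    accrued, future = discounted_accrued_and_future(episode, gamma=0.5)
    print(scalarize(accrued + future))    # per-step utilities
    print(scalarize(future[0]))           # utility of the episodic return
```

With `gamma=1.0` the sketch reduces to the undiscounted behaviour, which gives a quick way to compare against the implementation that ignores gamma.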