Open chang100 opened 5 years ago
[ ] One-step look ahead policy given by:
The approximate value function may be represented by alpha vectors computed offline using strategies such as QMDP, FIB, or point-based VI, or estimated by other means such as rollouts.
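
For reference, a minimal sketch of what this item could look like, assuming the usual textbook form of the lookahead, $\pi(b) = \arg\max_a \left[ R(b,a) + \gamma \sum_o P(o \mid b,a)\, U(b') \right]$, where $b'$ is the belief after taking $a$ and observing $o$, and $U(b) = \max_i \alpha_i^\top b$. The array layouts (`T[a, s, s']`, `O[a, s', o]`, `R[s, a]`) and all function names are assumptions made for illustration, not an existing API in this repo. QMDP is used here only as one example of an offline alpha-vector source.

```python
import numpy as np


def qmdp_alphas(T, R, gamma, iters=200):
    """QMDP alpha vectors: one per action, from value iteration on the underlying MDP.

    Assumed layouts: T[a, s, s'] transition probabilities, R[s, a] rewards.
    Returns an array of shape (n_actions, n_states).
    """
    n_states, n_actions = R.shape
    Q = np.zeros((n_states, n_actions))
    for _ in range(iters):
        V = Q.max(axis=1)
        # Q(s, a) = R(s, a) + gamma * sum_s' T(s' | s, a) * V(s')
        Q = R + gamma * np.einsum("ast,t->sa", T, V)
    return Q.T


def belief_update(b, a, o, T, O):
    """Bayes filter step. Returns (next belief or None, P(o | b, a))."""
    predicted = b @ T[a]                 # distribution over next states
    unnormalized = O[a][:, o] * predicted
    p_o = float(unnormalized.sum())
    if p_o == 0.0:
        return None, 0.0                 # observation impossible under (b, a)
    return unnormalized / p_o, p_o


def utility(b, alphas):
    """Approximate value U(b) = max_i alpha_i . b."""
    return float(np.max(alphas @ b))


def lookahead_action(b, T, O, R, gamma, alphas):
    """One-step lookahead: returns (greedy action, per-action lookahead values)."""
    n_actions = R.shape[1]
    n_obs = O.shape[2]
    q = np.zeros(n_actions)
    for a in range(n_actions):
        q[a] = b @ R[:, a]               # expected immediate reward R(b, a)
        for o in range(n_obs):
            b_next, p_o = belief_update(b, a, o, T, O)
            if p_o > 0.0:
                q[a] += gamma * p_o * utility(b_next, alphas)
    return int(np.argmax(q)), q
```

Any other alpha-vector solver (FIB, point-based VI) could be swapped in for `qmdp_alphas`, since `lookahead_action` only needs the resulting `(k, n_states)` array. A rollout-based estimate would replace `utility` instead.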