Open Bahador-Bakhshi opened 3 years ago
Does "Expected SARSA" do better than QL?
Except for the small additional computational cost, Expected Sarsa may completely dominate both of the other more-well-known TD control algorithms.
Does this mean Expected SARSA actually does better than Q-learning (QL)?
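
For reference, here is a minimal sketch of the three TD targets being compared, so the difference is concrete. It assumes a tabular setting and an epsilon-greedy policy; the function names (`epsilon_greedy_probs`, `sarsa_target`, `q_learning_target`, `expected_sarsa_target`, `td_update`) are made up for illustration and do not come from this repo.

```python
def epsilon_greedy_probs(q_values, epsilon):
    """Action probabilities of an epsilon-greedy policy (assumed policy)."""
    n = len(q_values)
    # Ties are broken in favour of the first maximal action.
    greedy = max(range(n), key=lambda a: q_values[a])
    probs = [epsilon / n] * n
    probs[greedy] += 1.0 - epsilon
    return probs


def sarsa_target(reward, gamma, q_next, next_action):
    """Sarsa: bootstrap from the action actually sampled in the next state."""
    return reward + gamma * q_next[next_action]


def q_learning_target(reward, gamma, q_next):
    """Q-learning: bootstrap from the maximal next-state action value."""
    return reward + gamma * max(q_next)


def expected_sarsa_target(reward, gamma, q_next, epsilon):
    """Expected Sarsa: bootstrap from the policy-weighted average value."""
    probs = epsilon_greedy_probs(q_next, epsilon)
    expected_q = sum(p * q for p, q in zip(probs, q_next))
    return reward + gamma * expected_q


def td_update(q_sa, target, alpha):
    """Shared update rule: move Q(s, a) a step of size alpha toward the target."""
    return q_sa + alpha * (target - q_sa)


if __name__ == "__main__":
    q_next = [1.0, 3.0]  # hypothetical Q(s', .) values
    print(sarsa_target(0.0, 1.0, q_next, next_action=0))
    print(q_learning_target(0.0, 1.0, q_next))
    print(expected_sarsa_target(0.0, 1.0, q_next, epsilon=0.5))
```

The only difference between the three is the target. Expected Sarsa averages over the next action instead of sampling it, which removes the variance Sarsa gets from the random choice of next action, at the cost of a sum over all actions per update. With `epsilon=0` (a greedy target policy) the Expected Sarsa target is the same as the Q-learning target.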