Bahador-Bakhshi / 5G-Federation

Monte Carlo vs TD #13

Open Bahador-Bakhshi opened 3 years ago

Bahador-Bakhshi commented 3 years ago

"The next most obvious advantage of TD methods over Monte Carlo methods is that they are naturally implemented in an online, fully incremental fashion. With Monte Carlo methods one must wait until the end of an episode, because only then is the return known, whereas with TD methods one need wait only one time step."

But in our application, we don't need to wait until the end of an episode: each action yields its reward immediately.

Maybe it is possible to apply Monte Carlo methods to this problem.
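
To make the difference described in the quote concrete, here is a minimal Python sketch of tabular value estimation. It is not taken from this repository: the function names `td0_update` and `mc_update`, the state labels, the rewards, and the values of `alpha` and `gamma` are all hypothetical and chosen only for illustration.

```python
from typing import Dict, List, Tuple

# One transition: (state, reward, next_state). All values here are made up.
Step = Tuple[str, float, str]


def td0_update(V: Dict[str, float], state: str, reward: float,
               next_state: str, alpha: float = 0.1, gamma: float = 0.9) -> None:
    """TD(0): update V[state] right after one transition is observed."""
    target = reward + gamma * V.get(next_state, 0.0)
    V[state] = V.get(state, 0.0) + alpha * (target - V.get(state, 0.0))


def mc_update(V: Dict[str, float], episode: List[Step],
              alpha: float = 0.1, gamma: float = 0.9) -> None:
    """Every-visit Monte Carlo: needs the whole episode to compute returns."""
    G = 0.0
    # Walk backwards so that G accumulates the discounted future rewards.
    for state, reward, _ in reversed(episode):
        G = reward + gamma * G
        V[state] = V.get(state, 0.0) + alpha * (G - V.get(state, 0.0))


# Hypothetical three-step episode; "end" is a terminal state with value 0.
episode: List[Step] = [("s0", 1.0, "s1"), ("s1", 0.0, "s2"), ("s2", 2.0, "end")]

# TD(0) can learn online, one transition at a time.
V_td: Dict[str, float] = {}
for s, r, s_next in episode:
    td0_update(V_td, s, r, s_next)

# Monte Carlo is applied once, after the episode has finished.
V_mc: Dict[str, float] = {}
mc_update(V_mc, episode)

print("TD(0):", V_td)
print("MC:   ", V_mc)
```

In this sketch a reward arrives at every step, yet `mc_update` still needs the later rewards to form the return `G` for the earlier states, while `td0_update` uses only the current reward and the current estimate of the next state's value.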