Open JChunX opened 3 years ago
This is my attempt at Exercise 7.2. For the experiment, I used the Markov reward process from Example 7.1 to compare the RMS errors of the original n-step TD method and the sum-of-TD-errors method.
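For readers who want to reproduce this kind of comparison, here is a minimal sketch. It is not the code from this post. It assumes the setup usually used for Example 7.1: a 19-state random walk starting in the center, reward -1 on the left exit and +1 on the right exit, and gamma = 1. The names `run_episode`, `rms_error`, and `average_rms` are hypothetical, and the TD errors in the second method are assumed to be computed with the value estimates as they stood at each time step.

```python
import math
import random

N_STATES = 19                      # assumed: non-terminal states 1..19
START = 10                         # assumed: episodes start in the center
# True values under the assumed rewards (-1 left exit, +1 right exit, gamma = 1).
TRUE_V = [(s - START) / 10 for s in range(N_STATES + 2)]


def run_episode(V, n, alpha, rng, use_td_error_sum, gamma=1.0):
    """Run one episode and update V in place (V[0] and V[-1] are terminals)."""
    states = [START]
    rewards = [0.0]                # rewards[t] is R_t; index 0 is unused
    deltas = []                    # deltas[t] uses V as it stood at time t
    T = math.inf
    t = 0
    while True:
        if t < T:
            s = states[t]
            s_next = s + rng.choice((-1, 1))
            if s_next == 0:
                r = -1.0
            elif s_next == N_STATES + 1:
                r = 1.0
            else:
                r = 0.0
            states.append(s_next)
            rewards.append(r)
            deltas.append(r + gamma * V[s_next] - V[s])
            if s_next in (0, N_STATES + 1):
                T = t + 1
        tau = t - n + 1            # time whose state estimate is updated
        if tau >= 0:
            end = min(tau + n, T)
            if use_td_error_sum:
                # Sum of stored TD errors over the n-step window.
                update = sum(gamma ** (k - tau) * deltas[k]
                             for k in range(tau, end))
            else:
                # Standard n-step return, bootstrapping from the current V.
                G = sum(gamma ** (i - tau - 1) * rewards[i]
                        for i in range(tau + 1, end + 1))
                if tau + n < T:
                    G += gamma ** n * V[states[tau + n]]
                update = G - V[states[tau]]
            V[states[tau]] += alpha * update
        if tau == T - 1:
            break
        t += 1


def rms_error(V):
    """RMS error over the non-terminal states against TRUE_V."""
    sq = sum((V[s] - TRUE_V[s]) ** 2 for s in range(1, N_STATES + 1))
    return math.sqrt(sq / N_STATES)


def average_rms(n, alpha, use_td_error_sum, episodes=10, runs=20, seed=0):
    """Average the per-episode RMS error over several independent runs."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(runs):
        V = [0.0] * (N_STATES + 2)
        for _ in range(episodes):
            run_episode(V, n, alpha, rng, use_td_error_sum)
            total += rms_error(V)
    return total / (runs * episodes)


if __name__ == "__main__":
    for label, flag in (("n-step TD", False), ("sum of TD errors", True)):
        print(label, average_rms(n=4, alpha=0.4, use_td_error_sum=flag))
```

With `n=1` the two update rules coincide, which gives a quick sanity check before comparing larger `n` and different step sizes.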