algorithmsbooks / decisionmaking

Algorithms for Decision Making textbook

Example 17.4 Missing Value #88

Closed: dylan-asmar closed this issue 3 years ago

dylan-asmar commented 3 years ago

Example 17.4 has:

capacity = 100 # maximum size of the replay buffer
ExperienceTuple = Tuple{Float64,Float64,Float64,Float64}
M = CircularBuffer{ExperienceTuple}(capacity) # replay buffer
m_grad = 20 # batch size
model = ReplayGradientQLearning(𝒫.𝒜, 𝒫.γ, Q, ∇Q, θ, α, M, m, m_grad)

The value of m is not defined anywhere in the example, and it doesn't appear to have a default value. This isn't a big issue, since the example only shows how to apply experience replay.
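For anyone who wants to run the snippet as written, here is a minimal sketch of what binding m first could look like. Everything in it beyond the names quoted above is an assumption: ReplaySettings is a hypothetical stand-in for the book's ReplayGradientQLearning type (only the last three positional arguments are mirrored), a plain Vector stands in for CircularBuffer so that no package is needed, the reading of m as "steps between gradient updates" is a guess, and the value 4 is an arbitrary placeholder, not a value from the book.

```julia
# Hypothetical stand-in for the book's type; it mirrors only the last three
# positional arguments of the constructor call quoted above.
struct ReplaySettings
    M       # replay buffer
    m       # ASSUMED meaning: number of steps between gradient updates
    m_grad  # batch size
end

capacity = 100 # maximum size of the replay buffer
ExperienceTuple = Tuple{Float64,Float64,Float64,Float64}

# A plain Vector is used here instead of CircularBuffer (assumption, to stay
# package-free); sizehint! only reserves space, it does not cap the length.
M = Vector{ExperienceTuple}()
sizehint!(M, capacity)

m_grad = 20 # batch size

# At this point m has no binding, which is what the example runs into.
m_was_defined = @isdefined m

m = 4 # placeholder chosen for illustration only, not taken from the book

settings = ReplaySettings(M, m, m_grad)
```

With m bound before the constructor call, the last line goes through; without it, Julia would throw an UndefVarError when it evaluates the arguments.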

dylan-asmar commented 3 years ago

I realized m was left undefined on purpose, based on the graph below.