JuliaReinforcementLearning / ReinforcementLearningAnIntroduction.jl

Julia code for the book Reinforcement Learning An Introduction
https://juliareinforcementlearning.org/ReinforcementLearningAnIntroduction.jl/
MIT License

How to implement a non greedy action selection in Chapter01 #82

Closed ll7 closed 2 years ago

ll7 commented 2 years ago

I am currently going through the notebooks and I don't know how to modify the select_action function in Chapter01_Tic_Tac_Toe to use greedy action selection.

This is my first approach:

explorer = EpsilonGreedyExplorer(0.0)

function select_action(env, V)
    A = legal_action_space(env)
    values = map(A) do a
        V(child(env, a))
    end
    A[explorer(values)]
end

policies2 = policies
policies2.agents[x].policy.policy.mapping = select_action

which results in:

setfield!: immutable struct of type VBasedPolicy cannot be changed

Stacktrace:
 [1] setproperty!(x::VBasedPolicy{MonteCarloLearner{TabularVApproximator{Vector{Float64}, InvDecay}, ReinforcementLearningZoo.FirstVisit, ReinforcementLearningZoo.NoSampling}, typeof(select_action)}, f::Symbol, v::Function)
   @ Base ./Base.jl:43
 [2] top-level scope
   @ /workspaces/ReinforcementLearningAnIntroduction.jl/jupyter-notebook/Chapter01_Tic_Tac_Toe.ipynb:2

findmyway commented 2 years ago

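VBasedPolicy is an immutable struct, so its fields cannot be assigned to. Instead, you can create a new instance with the changed field using Setfield.jl: https://github.com/jw3126/Setfield.jl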

That policy will be made mutable in the next release.
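
To illustrate the "create a new instance" idea, here is a minimal sketch in plain Julia. It does not use the real VBasedPolicy: `ToyPolicy`, `with_mapping`, `greedy`, and `default_mapping` are hypothetical names invented for this example, and the assumption is only that the policy is an immutable struct with `learner` and `mapping` fields, as the error message suggests.

```julia
# Hypothetical stand-in for an immutable policy with `learner` and `mapping`
# fields. This is NOT the real VBasedPolicy, only an illustration.
struct ToyPolicy{L,M}
    learner::L
    mapping::M
end

# Build a copy of `p` whose `mapping` field is replaced by `f`.
# Assigning `p.mapping = f` directly would throw, because the struct is immutable.
with_mapping(p::ToyPolicy, f) = ToyPolicy(p.learner, f)

# Two toy mappings over a vector of values (hypothetical helpers).
default_mapping(values) = first(eachindex(values))  # always picks the first index
greedy(values) = argmax(values)                     # picks the index of the largest value

old_policy = ToyPolicy([0.1, 0.9, 0.5], default_mapping)
new_policy = with_mapping(old_policy, greedy)

# The original instance is unchanged; the copy uses the new mapping.
old_choice = old_policy.mapping(old_policy.learner)
new_choice = new_policy.mapping(new_policy.learner)
```

With Setfield.jl installed, the same copy can be written as `new_policy = @set old_policy.mapping = greedy`, which returns a modified copy rather than mutating the original. Either way, the new instance still has to be put back wherever the old one was stored.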