jonathan-laurent / AlphaZero.jl

A generic, simple and fast implementation of Deepmind's AlphaZero algorithm.
https://jonathan-laurent.github.io/AlphaZero.jl/stable/
MIT License

TicTacToe MCTS example script freezes #148

Open viktmar opened 1 year ago

viktmar commented 1 year ago

When I run the example script for MCTS, it seems to freeze after the first output.

using AlphaZero

gspec = Examples.games["tictactoe"]
mcts = MCTS.Env(gspec, MCTS.RolloutOracle(gspec))
computer = MctsPlayer(mcts, niters=1, timeout=1.0, τ=ConstSchedule(0.5))

# interactive!(gspec, computer, Human())
explore(computer, gspec)

When running the above, I get the following output and nothing else:

Red plays:

       | A B C 
       | D E F 
       | G H I 

 Nmcts
39,980

          Pmcts    UCT  Qmcts
E  92.7%  59.3%  +0.03  +0.03
B   3.5%  11.6%  +0.03  +0.03
C   1.7%   8.1%  +0.03  +0.03
H   0.5%   4.4%  +0.03  +0.02
A   0.5%   4.4%  +0.03  +0.02
I   0.3%   3.5%  +0.03  +0.02
G   0.3%   3.2%  +0.03  +0.02
D   0.2%   2.9%  +0.03  +0.01
F   0.2%   2.6%  +0.03  +0.01

> 

I am working on an M1 MacBook. Could this be the issue?

The same thing happens with my adapted use case, which is why I tried running the example, but it behaves the same way.

jonathan-laurent commented 1 year ago

Is this still an issue? I cannot replicate on my own machine. If this is still a problem, can you please give me all standard details (Julia and package versions, type of machine...)?

viktmar commented 1 year ago

Yes, unfortunately. I just tried it again to be sure. It continues to run without any progress. I get the following error if I interrupt:

ERROR: LoadError: InterruptException:
Stacktrace:
  [1] try_yieldto(undo::typeof(Base.ensure_rescheduled))
    @ Base ./task.jl:871
  [2] wait()
    @ Base ./task.jl:931
  [3] wait(c::Base.GenericCondition{Base.Threads.SpinLock})
    @ Base ./condition.jl:124
  [4] readuntil(x::Base.TTY, c::UInt8; keep::Bool)
    @ Base ./stream.jl:1012
  [5] readline(s::Base.TTY; keep::Bool)
    @ Base ./io.jl:543
  [6] readline(s::Base.TTY) (repeats 2 times)
    @ Base ./io.jl:542
  [7] start_explorer(exp::AlphaZero.UserInterface.Explorer)
    @ AlphaZero.UserInterface ~/.julia/packages/AlphaZero/p8fyV/src/ui/explorer.jl:260
  [8] #explore#40
    @ ~/.julia/packages/AlphaZero/p8fyV/src/ui/explorer.jl:303 [inlined]
  [9] explore
    @ ~/.julia/packages/AlphaZero/p8fyV/src/ui/explorer.jl:299 [inlined]
 [10] #explore#41
    @ ~/.julia/packages/AlphaZero/p8fyV/src/ui/explorer.jl:307 [inlined]
 [11] explore(player::MctsPlayer{AlphaZero.MCTS.Env{NamedTuple{(:board, :curplayer), Tuple{StaticArraysCore.SVector{9, Union{Nothing, Bool}}, Bool}}, AlphaZero.MCTS.RolloutOracle{AlphaZero.Examples.Tictactoe.GameSpec}}}, gspec::AlphaZero.Examples.Tictactoe.GameSpec)
    @ AlphaZero.UserInterface ~/.julia/packages/AlphaZero/p8fyV/src/ui/explorer.jl:306
 [12] top-level scope
    @ ~/projects/2022_12_AlphaZeroSR/AlphaZero.jl/main_AlphaZeroSR.jl:54
in expression starting at /Users/viktormartinek/projects/2022_12_AlphaZeroSR/AlphaZero.jl/main_AlphaZeroSR.jl:54

I think the following should contain all the version information, as suggested by the ReinforcementLearning.jl docs:

julia> using Pkg, Dates

julia> today()
2022-10-31

julia> versioninfo()
Julia Version 1.8.2
Commit 36034abf260 (2022-09-29 15:21 UTC)
Platform Info:
  OS: macOS (arm64-apple-darwin21.3.0)
  CPU: 8 × Apple M1
  WORD_SIZE: 64
  LIBM: libopenlibm
  LLVM: libLLVM-13.0.1 (ORCJIT, apple-m1)
  Threads: 8 on 4 virtual cores
Environment:
  JULIA_EDITOR = code
  JULIA_NUM_THREADS = 8

julia> buff = IOBuffer();Pkg.status(io=buff);println(String(take!(buff)))
Status `~/.julia/environments/v1.8/Project.toml`
  [8ed9eb0b] AlphaZero v0.5.3
  [6e4b80f9] BenchmarkTools v1.3.1
⌅ [336ed68f] CSV v0.9.11
⌃ [7c7805af] Clapeyron v0.2.9
  [aaaa29a8] Clustering v0.14.3
⌅ [31c24e10] Distributions v0.24.18
⌃ [26cc04aa] FiniteDifferences v0.12.24
⌅ [587475ba] Flux v0.12.10
  [f6369f11] ForwardDiff v0.10.32
  [14197337] GenericLinearAlgebra v0.3.3
  [abb756f7] InformationDistances v0.1.0
  [d1acc4aa] IntervalArithmetic v0.20.7
⌅ [8197267c] IntervalSets v0.5.4
  [23fbe1c1] Latexify v0.15.17
  [2fda8390] LsqFit v0.13.0
  [dde4c033] Metal v0.1.2
  [d41bc354] NLSolversBase v7.8.2
  [429524aa] Optim v1.7.3
  [87e2bd06] OptimBase v2.0.2
  [bac558e1] OrderedCollections v1.4.1
  [189a3867] Reexport v1.2.2
  [158674fc] ReinforcementLearning v0.10.1
⌅ [f2b01f46] Roots v1.4.1
  [3f865c0f] ScatteredInterpolation v0.3.6
  [90137ffa] StaticArrays v1.5.9
  [2913bbd2] StatsBase v0.33.21
  [d1185830] SymbolicUtils v0.19.11
⌃ [0c5d862f] Symbolics v4.11.1
⌃ [bd369af6] Tables v1.9.0
  [22787eb5] Term v1.0.4
  [ac1d9e8a] ThreadsX v0.1.11
⌅ [b8865327] UnicodePlots v2.12.4
  [fdbf4ff8] XLSX v0.8.4
  [e88e6eb3] Zygote v0.6.49
Info Packages marked with ⌃ and ⌅ have new versions available, but those with ⌅ are restricted by compatibility constraints from upgrading. To see why use `status --outdated`

I implemented an MCTS version specialized to my use case and intended to build it up to AlphaZero, but I would love to use this library in the long run. What is the intended output of the example script?

Thank you very much for your help and for the library in general. I have already learned a lot by trying to understand it :)

jonathan-laurent commented 1 year ago

I did not manage to replicate this on my own macOS machine. Are you running the #master version of AlphaZero.jl in a fresh environment?

One thing that may be causing the freeze is an exception being silently raised in a sub-task. This used to be a problem with previous Julia versions, but I thought it had been fixed in 1.7.
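
To illustrate what a silently failing sub-task looks like in plain Julia, here is a minimal sketch. It is a generic example and not AlphaZero.jl code: the task body and the variable name `t` are made up for illustration. The exception is stored inside the task and nothing is printed until something waits on it.

```julia
# Generic Julia illustration (hypothetical, not taken from AlphaZero.jl).

# Start a task that throws. Nothing is printed here:
# the exception is stored inside the task object.
t = @async error("failure inside sub-task")

# Waiting on the task surfaces the failure as a TaskFailedException.
try
    wait(t)
catch err
    err isa TaskFailedException || rethrow()
    # The original exception is kept in the failed task's `result` field.
    println("sub-task failed: ", err.task.result)
end

# The task is now marked as failed.
println(istaskfailed(t))
```

If some code path starts a task like this and never calls `wait` or `fetch` on it, the failure can go unnoticed. Wrapping the task in `errormonitor` (available since Julia 1.7, as far as I know) is one way to have such errors logged.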