jonathan-laurent / AlphaZero.jl

A generic, simple and fast implementation of Deepmind's AlphaZero algorithm.
https://jonathan-laurent.github.io/AlphaZero.jl/stable/
MIT License

StackOverflowError (during training) #116

Closed StepHaze closed 2 years ago

StepHaze commented 2 years ago

I am trying to implement a new game. Scripts.test_game and Scripts.dummy_run both passed.

After I run julia --project -e 'using AlphaZero; Scripts.train("best")', I get this error message:

Initializing a new AlphaZero environment

Initial report

Number of network parameters: 1,544,262
Number of regularized network parameters: 1,539,712
Memory footprint per MCTS node: 240 bytes

Running benchmark: AlphaZero against MCTS (1000 rollouts)

Progress:   2%|█▌                                                                           |  ETA: 1:25:28
StackOverflowError:
Stacktrace:
 [1] Array @ ./boot.jl:457 [inlined]
 [2] Array @ ./boot.jl:466 [inlined]
 [3] similar @ ./array.jl:378 [inlined]
 [4] similar @ ./abstractarray.jl:783 [inlined]
 [5] _unsafe_getindex(#unused#::IndexLinear, A::Vector{Int64}, I::Base.LogicalIndex{Int64, StaticArrays.SVector{5, Bool}}) @ Base ./multidimensional.jl:851
 [6] _getindex @ ./multidimensional.jl:839 [inlined]
 [7] getindex @ ./abstractarray.jl:1218 [inlined]
 [8] available_actions(game::AlphaZero.Examples.Best.GameEnv) @ AlphaZero.GameInterface ~/Downloads/Ju/AlphaZero.jl/src/game.jl:320
 [9] run_simulation!(env::AlphaZero.MCTS.Env{NamedTuple{(:board, :curplayer), Tuple{AlphaZero.Examples.Best.Board, Int64}}, AlphaZero.MCTS.RolloutOracle{AlphaZero.Examples.Best.GameSpec}}, game::AlphaZero.Examples.Best.GameEnv; η::Vector{Float64}, root::Bool) @ AlphaZero.MCTS ~/Downloads/Ju/AlphaZero.jl/src/mcts.jl:204
 [10] run_simulation!(env::AlphaZero.MCTS.Env{NamedTuple{(:board, :curplayer), Tuple{AlphaZero.Examples.Best.Board, Int64}}, AlphaZero.MCTS.RolloutOracle{AlphaZero.Examples.Best.GameSpec}}, game::AlphaZero.Examples.Best.GameEnv; η::Vector{Float64}, root::Bool) (repeats 11814 times) @ AlphaZero.MCTS ~/Downloads/Ju/AlphaZero.jl/src/mcts.jl:218
 [11] explore!(env::AlphaZero.MCTS.Env{NamedTuple{(:board, :curplayer), Tuple{AlphaZero.Examples.Best.Board, Int64}}, AlphaZero.MCTS.RolloutOracle{AlphaZero.Examples.Best.GameSpec}}, game::AlphaZero.Examples.Best.GameEnv, nsims::Int64) @ AlphaZero.MCTS ~/Downloads/Ju/AlphaZero.jl/src/mcts.jl:243
 [12] think(p::MctsPlayer{AlphaZero.MCTS.Env{NamedTuple{(:board, :curplayer), Tuple{AlphaZero.Examples.Best.Board, Int64}}, AlphaZero.MCTS.RolloutOracle{AlphaZero.Examples.Best.GameSpec}}}, game::AlphaZero.Examples.Best.GameEnv) @ AlphaZero ~/Downloads/Ju/AlphaZero.jl/src/play.jl:198
 [13] think @ ~/Downloads/Ju/AlphaZero.jl/src/play.jl:259 [inlined]
 [14] play_game(gspec::AlphaZero.Examples.Best.GameSpec, player::TwoPlayers{MctsPlayer{AlphaZero.MCTS.Env{NamedTuple{(:board, :curplayer), Tuple{AlphaZero.Examples.Best.Board, Int64}}, AlphaZero.Batchifier.BatchedOracle{AlphaZero.Batchifier.var"#8#9"}}}, MctsPlayer{AlphaZero.MCTS.Env{NamedTuple{(:board, :curplayer), Tuple{AlphaZero.Examples.Best.Board, Int64}}, AlphaZero.MCTS.RolloutOracle{AlphaZero.Examples.Best.GameSpec}}}}; flip_probability::Float64) @ AlphaZero ~/Downloads/Ju/AlphaZero.jl/src/play.jl:308
 [15] (::AlphaZero.var"#simulate_game#70"{TwoPlayers{MctsPlayer{AlphaZero.MCTS.Env{NamedTuple{(:board, :curplayer), Tuple{AlphaZero.Examples.Best.Board, Int64}}, AlphaZero.Batchifier.BatchedOracle{AlphaZero.Batchifier.var"#8#9"}}}, MctsPlayer{AlphaZero.MCTS.Env{NamedTuple{(:board, :curplayer), Tuple{AlphaZero.Examples.Best.Board, Int64}}, AlphaZero.MCTS.RolloutOracle{AlphaZero.Examples.Best.GameSpec}}}}, AlphaZero.Benchmark.var"#5#9"{ProgressMeter.Progress}, Simulator{AlphaZero.Benchmark.var"#4#8"{Env{AlphaZero.Examples.Best.GameSpec, ResNet, NamedTuple{(:board, :curplayer), Tuple{AlphaZero.Examples.Best.Board, Int64}}}, AlphaZero.Benchmark.Duel}, AlphaZero.Benchmark.var"#net#6"{Env{AlphaZero.Examples.Best.GameSpec, ResNet, NamedTuple{(:board, :curplayer), Tuple{AlphaZero.Examples.Best.Board, Int64}}}, AlphaZero.Benchmark.Duel}, typeof(record_trace)}, AlphaZero.Examples.Best.GameSpec, SimParams})(sim_id::Int64) @ AlphaZero ~/Downloads/Ju/AlphaZero.jl/src/simulations.jl:232
 [16] macro expansion @ ~/Downloads/Ju/AlphaZero.jl/src/util.jl:189 [inlined]
 [17] (::AlphaZero.Util.var"#9#10"{AlphaZero.var"#68#69"{AlphaZero.Benchmark.var"#5#9"{ProgressMeter.Progress}, Simulator{AlphaZero.Benchmark.var"#4#8"{Env{AlphaZero.Examples.Best.GameSpec, ResNet, NamedTuple{(:board, :curplayer), Tuple{AlphaZero.Examples.Best.Board, Int64}}}, AlphaZero.Benchmark.Duel}, AlphaZero.Benchmark.var"#net#6"{Env{AlphaZero.Examples.Best.GameSpec, ResNet, NamedTuple{(:board, :curplayer), Tuple{AlphaZero.Examples.Best.Board, Int64}}}, AlphaZero.Benchmark.Duel}, typeof(record_trace)}, AlphaZero.Examples.Best.GameSpec, SimParams, AlphaZero.var"#48#49"{Channel{Any}}, AlphaZero.var"#make#65"{Channel{Any}}}, UnitRange{Int64}, typeof(vcat), ReentrantLock})() @ AlphaZero.Util ~/.julia/packages/ThreadPools/hwwUU/src/macros.jl:261

Please help

StepHaze commented 2 years ago

OK. I took scripts\mcts.jl, changed "tictactoe" to my game, and ran it. After a few moves against the AI, I got this error message:

ERROR: LoadError: StackOverflowError:
Stacktrace:
 [1] objectid @ ./reflection.jl:302 [inlined]
 [2] hash @ ./hashing.jl:25 [inlined]
 [3] hash(t::Tuple{AlphaZero.Examples.Best.Board, Int64}, h::UInt64) @ Base ./tuple.jl:417
 [4] hash @ ./namedtuple.jl:195 [inlined]
 [5] hash @ ./hashing.jl:20 [inlined]
 [6] hashindex @ ./dict.jl:169 [inlined]
 [7] ht_keyindex(h::Dict{NamedTuple{(:board, :curplayer), Tuple{AlphaZero.Examples.Best.Board, Int64}}, AlphaZero.MCTS.StateInfo}, key::NamedTuple{(:board, :curplayer), Tuple{AlphaZero.Examples.Best.Board, Int64}}) @ Base ./dict.jl:284
 [8] haskey @ ./dict.jl:552 [inlined]
 [9] state_info(env::AlphaZero.MCTS.Env{NamedTuple{(:board, :curplayer), Tuple{AlphaZero.Examples.Best.Board, Int64}}, AlphaZero.MCTS.RolloutOracle{AlphaZero.Examples.Best.GameSpec}}, state::NamedTuple{(:board, :curplayer), Tuple{AlphaZero.Examples.Best.Board, Int64}}) @ AlphaZero.MCTS ~/Downloads/Ju/AlphaZero.jl/src/mcts.jl:166
 [10] run_simulation!(env::AlphaZero.MCTS.Env{NamedTuple{(:board, :curplayer), Tuple{AlphaZero.Examples.Best.Board, Int64}}, AlphaZero.MCTS.RolloutOracle{AlphaZero.Examples.Best.GameSpec}}, game::AlphaZero.Examples.Best.GameEnv; η::Vector{Float64}, root::Bool) @ AlphaZero.MCTS ~/Downloads/Ju/AlphaZero.jl/src/mcts.jl:205
 [11] run_simulation!(env::AlphaZero.MCTS.Env{NamedTuple{(:board, :curplayer), Tuple{AlphaZero.Examples.Best.Board, Int64}}, AlphaZero.MCTS.RolloutOracle{AlphaZero.Examples.Best.GameSpec}}, game::AlphaZero.Examples.Best.GameEnv; η::Vector{Float64}, root::Bool) (repeats 23785 times) @ AlphaZero.MCTS ~/Downloads/Ju/AlphaZero.jl/src/mcts.jl:218
 [12] explore!(env::AlphaZero.MCTS.Env{NamedTuple{(:board, :curplayer), Tuple{AlphaZero.Examples.Best.Board, Int64}}, AlphaZero.MCTS.RolloutOracle{AlphaZero.Examples.Best.GameSpec}}, game::AlphaZero.Examples.Best.GameEnv, nsims::Int64) @ AlphaZero.MCTS ~/Downloads/Ju/AlphaZero.jl/src/mcts.jl:243
 [13] think(p::MctsPlayer{AlphaZero.MCTS.Env{NamedTuple{(:board, :curplayer), Tuple{AlphaZero.Examples.Best.Board, Int64}}, AlphaZero.MCTS.RolloutOracle{AlphaZero.Examples.Best.GameSpec}}}, game::AlphaZero.Examples.Best.GameEnv) @ AlphaZero ~/Downloads/Ju/AlphaZero.jl/src/play.jl:202
 [14] select_move @ ~/Downloads/Ju/AlphaZero.jl/src/play.jl:49 [inlined]
 [15] select_move(p::TwoPlayers{MctsPlayer{AlphaZero.MCTS.Env{NamedTuple{(:board, :curplayer), Tuple{AlphaZero.Examples.Best.Board, Int64}}, AlphaZero.MCTS.RolloutOracle{AlphaZero.Examples.Best.GameSpec}}}, Human}, game::AlphaZero.Examples.Best.GameEnv, turn::Int64) @ AlphaZero ~/Downloads/Ju/AlphaZero.jl/src/play.jl:265
 [16] interactive!(game::AlphaZero.Examples.Best.GameEnv, player::TwoPlayers{MctsPlayer{AlphaZero.MCTS.Env{NamedTuple{(:board, :curplayer), Tuple{AlphaZero.Examples.Best.Board, Int64}}, AlphaZero.MCTS.RolloutOracle{AlphaZero.Examples.Best.GameSpec}}}, Human}) @ AlphaZero ~/Downloads/Ju/AlphaZero.jl/src/play.jl:364
 [17] interactive! @ ~/Downloads/Ju/AlphaZero.jl/src/play.jl:375 [inlined]
 [18] interactive!(game::AlphaZero.Examples.Best.GameSpec, white::MctsPlayer{AlphaZero.MCTS.Env{NamedTuple{(:board, :curplayer), Tuple{AlphaZero.Examples.Best.Board, Int64}}, AlphaZero.MCTS.RolloutOracle{AlphaZero.Examples.Best.GameSpec}}}, black::Human) @ AlphaZero ~/Downloads/Ju/AlphaZero.jl/src/play.jl:377
 [19] top-level scope @ ~/Downloads/Ju/AlphaZero.jl/scripts/mcts.jl:8
 [20] include(fname::String) @ Base.MainInclude ./client.jl:451
 [21] top-level scope @ none:1
in expression starting at /home/haze/Downloads/Ju/AlphaZero.jl/scripts/mcts.jl:8

Please give me the right direction.
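
Both traces show run_simulation! calling itself thousands of times (11814 and 23785 repeats) before the stack overflows. One possible reading, offered as a guess rather than a diagnosis, is that the search keeps descending through positions it has already visited, which could happen if the new game lets states repeat. Below is a minimal sketch of a check for that in plain Julia. The function `first_repeated_state` and the two toy games are hypothetical and do not use AlphaZero.jl's GameInterface; the game is passed in as three ordinary functions.

```julia
using Random

# Hypothetical diagnostic, not part of AlphaZero.jl. Plays random moves from
# `state` and returns the number of moves after which a previously seen state
# reappears, or `nothing` if the game ends (or `max_moves` is reached) first.
# States must support `==` and `hash` so they can be stored in a Set.
function first_repeated_state(state, actions, step, is_terminal;
                              max_moves = 10_000, rng = Random.default_rng())
    seen = Set([state])
    for move in 1:max_moves
        is_terminal(state) && return nothing
        state = step(state, rand(rng, actions(state)))
        state in seen && return move
        push!(seen, state)
    end
    return nothing
end

# Toy game A: a counter that only grows and stops at 10, so no state repeats.
no_cycle = first_repeated_state(0, s -> [1, 2], (s, a) -> s + a, s -> s >= 10)

# Toy game B: a token stepping around a ring of 4 cells; it never ends,
# so some position has to come back within a few moves.
cycle = first_repeated_state(0, s -> [1, 3], (s, a) -> mod(s + a, 4), s -> false)
```

If a check like this reports a repeat for the real game, adding a move limit or a repetition rule to the game definition would be one thing to try.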

EngrStudent commented 2 years ago

I am getting GPU memory overflow errors too.

EngrStudent commented 2 years ago

I re-ran using the tic-tac-toe game and kept System Monitor and the nvidia-smi tool open to track hardware usage before the freeze. My video froze while memory usage was very high, so I suspected a memory overflow problem.

I then looked through the parameters (params.jl) and found a "memory buffer size" variable, which I reduced by an order of magnitude (80k --> 8k). The code ran without crashing, but learning was poor. I increased the size to 40k, and it both ran and learned. There is a trade-off between "small enough to not crash" and "big enough to not act like a lobotomy". Tuning that by hand is going to be a pain, but it is one way to limp forward.
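
To illustrate the trade-off: a capped buffer keeps memory bounded by discarding the oldest samples, so a smaller cap also means fewer past games to learn from. This is a minimal sketch, assuming a plain first-in-first-out policy. `BoundedBuffer` and `add_sample!` are hypothetical names, not AlphaZero.jl's memory buffer.

```julia
# Hypothetical fixed-capacity sample buffer; not AlphaZero.jl's implementation.
struct BoundedBuffer{T}
    capacity::Int
    samples::Vector{T}
end

# Convenience constructor: start with an empty sample vector.
BoundedBuffer{T}(capacity::Int) where {T} = BoundedBuffer{T}(capacity, T[])

function add_sample!(buf::BoundedBuffer, sample)
    push!(buf.samples, sample)
    # Drop the oldest samples once the cap is exceeded.
    while length(buf.samples) > buf.capacity
        popfirst!(buf.samples)
    end
    return buf
end

buf = BoundedBuffer{Int}(3)
for s in 1:5
    add_sample!(buf, s)
end
# buf.samples now holds only the three most recent samples: [3, 4, 5]
```

With a cap of 3, samples 1 and 2 are gone for good, which is the "learning was poor" side of the trade-off at small buffer sizes.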

jonathan-laurent commented 2 years ago

Thanks for the feedback. I will be adding an option to store memory buffer samples on disk.

smart-fr commented 1 year ago

> Thanks for the feedback. I will be adding an option to store memory buffer samples on disk.

Yes please! 😘