Open ngocdaothanh opened 6 years ago
The quality of the estimate is controlled by the number of iterations: the higher it is, the more accurate the estimate. Note that it may take many iterations to converge to a good solution if you are dealing with a large problem space. In that case, the algorithm can be improved significantly with domain-specific enhancements, e.g. a more realistic simulation phase that mimics real-life playouts. For that reason, `getTerminalStateByPerformingSimulationFromState` in `MctsDomainAgent` is left for the user to implement, which makes it possible to inject custom simulation strategies.
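To make the iteration budget concrete, here is a minimal sketch of how the number of iterations and a pluggable playout policy might interact. It is not this library's API: `NimMonteCarlo`, `bestMove`, `moverWins`, and `randomPolicy` are hypothetical names, the game (single-pile Nim, take 1 to 3 stones, taking the last stone wins) is chosen only for brevity, and the search is a simplified flat Monte Carlo search with UCB1 at the root rather than a full tree search.

```java
import java.util.Random;
import java.util.function.IntUnaryOperator;

/** Hypothetical example, not part of the library: flat Monte Carlo search for single-pile Nim. */
final class NimMonteCarlo {

    /** Default playout policy (assumption): pick a uniformly random legal move for a non-empty pile. */
    static IntUnaryOperator randomPolicy(Random rng) {
        return pile -> 1 + rng.nextInt(Math.min(3, pile));
    }

    /** Plays the game out with the given policy; true if the player to move at {@code pile} wins. */
    static boolean moverWins(int pile, IntUnaryOperator policy) {
        boolean moverToPlay = true;
        while (pile > 0) {
            pile -= policy.applyAsInt(pile);
            if (pile == 0) {
                return moverToPlay; // whoever took the last stone wins
            }
            moverToPlay = !moverToPlay;
        }
        return false; // empty pile: the player to move has already lost
    }

    /** UCB1 over the root moves: try every move once, then balance win rate against uncertainty. */
    private static int selectByUcb1(int[] visits, double[] wins, int totalVisits) {
        int best = 0;
        double bestScore = Double.NEGATIVE_INFINITY;
        for (int m = 0; m < visits.length; m++) {
            if (visits[m] == 0) {
                return m;
            }
            double score = wins[m] / visits[m]
                    + Math.sqrt(2.0 * Math.log(totalVisits) / visits[m]);
            if (score > bestScore) {
                bestScore = score;
                best = m;
            }
        }
        return best;
    }

    /**
     * Returns the number of stones to take. {@code iterations} is the strength knob:
     * each iteration runs one playout, so a larger budget gives a more stable estimate.
     */
    static int bestMove(int pile, int iterations, IntUnaryOperator playoutPolicy) {
        if (pile < 1) {
            throw new IllegalArgumentException("pile must be at least 1");
        }
        int moves = Math.min(3, pile);
        int[] visits = new int[moves];
        double[] wins = new double[moves];
        for (int i = 0; i < iterations; i++) {
            int chosen = selectByUcb1(visits, wins, i);
            int remaining = pile - (chosen + 1);
            // After our move the opponent is to move; we win exactly when they do not.
            boolean weWin = !moverWins(remaining, playoutPolicy);
            visits[chosen]++;
            if (weWin) {
                wins[chosen]++;
            }
        }
        int best = 0;
        for (int m = 1; m < moves; m++) {
            if (visits[m] > visits[best]) {
                best = m; // the most-visited move is the final answer
            }
        }
        return best + 1;
    }
}
```

A call such as `NimMonteCarlo.bestMove(5, 20_000, NimMonteCarlo.randomPolicy(new Random(42)))` spends 20,000 playouts on the decision. With a budget that large it should settle on taking 1 stone, whose random-playout win rate is 2/3 against 1/2 for the alternatives. Lowering `iterations` makes the choice noisier, and passing a smarter `playoutPolicy` corresponds to the custom simulation strategy described in the quoted passage.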
With algorithms like alpha-beta or minimax, we can set the search depth to control playing strength.
How can the playing strength of MCTS be specified?