Sarah111-AHM opened 1 year ago
Topic | Summary |
---|---|
Competitive Environments | This chapter focuses on competitive environments and adversarial search problems in multi-agent scenarios. Games like chess, Go, and poker are used as examples of competitive environments. Physical games have more complex descriptions and rules compared to board games, but they have received less attention in the AI community. |
Approaches to Multi-Agent Environments in Game Theory | Three approaches to multi-agent environments are discussed: treating the aggregate of agents as an economy, treating adversarial agents as simply part of the environment, and explicitly modeling the adversary with adversarial game-tree search. |
Minimax Search Algorithm | The chapter covers the minimax search algorithm, which is used to determine optimal moves in adversarial game-tree search. Pruning is introduced as a technique to make the search more efficient by disregarding parts of the search tree that do not affect the optimal move. Evaluation of stopping states involves determining who is winning using a heuristic evaluation function or averaging outcomes from game simulations. |
Games with Chance and Imperfect Information | Games involving chance and imperfect information are discussed. The chapter explains the concepts of game states, legal moves, transition models, terminal tests, and utility functions. The game tree and search tree concepts are introduced, along with their relevance to determining optimal moves and strategies. |
Alpha-Beta Pruning | The minimax search algorithm is explained in detail, including the functions MINIMAX-SEARCH, MAX-VALUE, and MIN-VALUE. The time and space complexity of the minimax algorithm are discussed. Alpha-beta pruning is introduced as a technique to reduce the number of evaluated states in the minimax algorithm. The effectiveness of alpha-beta pruning depends on the order in which states are examined. Move ordering schemes and transposition tables are discussed as ways to improve alpha-beta pruning. |
Type A and Type B Strategies | Type A and Type B strategies are proposed to address the complexity of games with a large number of states. Cutoff tests and heuristic evaluation functions are introduced to handle limited computation time in alpha-beta pruning. The evaluation function should estimate the expected utility of a state and be strongly correlated with the chances of winning. |
Evaluation Functions | Evaluation functions are used in game-playing programs to estimate the likelihood of winning, losing, or drawing in a given game state. They combine weighted features derived from human experience or machine learning techniques. The values of the evaluation function (not the weights) should be scaled to the range 0 to 1, from a certain loss to a certain win. Nonlinear combinations of features are often used to capture feature interdependencies and contextual factors. |
Enhancements to Alpha-Beta Search | The ALPHA-BETA-SEARCH algorithm is modified to incorporate the evaluation function and cutoffs based on a fixed depth limit or iterative deepening. Quiescence search ensures the evaluation function is applied only to quiescent positions, with no pending moves that would swing the evaluation. The horizon effect refers to delaying tactics that temporarily push a strong opponent move beyond the search horizon. Singular extensions and forward pruning are techniques to mitigate this effect. Beam search is a form of forward pruning where only the n best moves are considered at each ply, but it carries the risk of pruning the best move. PROBCUT is a forward-pruning algorithm that estimates the probability of pruning away the best move, based on shallow searches and past experience. Late move reduction reduces the search depth for moves appearing later in the move ordering, saving computation time. A combination of these techniques can yield competitive chess programs that play at expert or even grandmaster level. |
Table Lookup and Search-Based Methods for Chess | Table lookup is often used for opening and endgame phases of chess, while search-based methods are used for the midgame. Computer analysis of endgames involves retrograde minimax search to generate extensive lookup tables for optimal play in endgame scenarios. |
Monte Carlo Tree Search (MCTS) | Monte Carlo tree search (MCTS) is an alternative to alpha-beta search, commonly used in Go programs; it estimates state values through simulated playouts and balances exploration against exploitation using a selection policy such as UCT. The UCT formula scores each node by combining its average utility, its playout count, and an exploration term. The algorithm repeats four steps: selection, expansion, simulation, and back-propagation. MCTS is particularly useful for games with high branching factors or where defining a good evaluation function is difficult, and it can be combined with evaluation functions or alpha-beta techniques for improved performance. Because it relies only on the game rules, MCTS can be applied to new games without a pre-defined evaluation function. Its limitations include potentially overlooking critical moves and requiring long playouts to verify obvious wins. Monte Carlo search is closely related to reinforcement learning: both simulate moves, observe outcomes, and use them to determine good moves. |
Stochastic Games and Backgammon | Stochastic games incorporate a random element, such as dice rolls, to simulate unpredictability. Backgammon is an example of a stochastic game that combines luck and skill, where players move their pieces based on dice rolls. Constructing a game tree for backgammon requires including chance nodes for representing dice rolls and their probabilities. The expectiminimax value is used to make decisions in games with chance nodes, calculating the expected value by averaging over all possible outcomes. Evaluation functions in games of chance, like backgammon, need to consider the relationship between values and probabilities. The computational cost of expectiminimax makes it challenging to search far ahead in most games of chance. |
Partially Observable Games and Kriegspiel | Partially observable games introduce additional challenges due to the element of partial observability, similar to the "fog of war" in real wars. Deterministic partially observable games involve uncertainty about the opponent's choices rather than random events. Kriegspiel is a partially observable variant of chess where each player can only see their own pieces, and a referee announces the outcome of moves. The belief state in Kriegspiel represents all logically possible board states given the history of percepts so far. Solving Kriegspiel involves updating the belief state based on the predictable outcome of the player's own moves and the unpredictable outcome of the opponent's replies. The AND-OR search algorithm and the incremental belief-state algorithm are used to find winning strategies in Kriegspiel. |
Card Games with Random Dealing | Card games with random dealing, like bridge and poker, can be solved by treating the start of the game as a chance node and using expectiminimax. Averaging over clairvoyance, which assumes full observability after the actual deal, can be effective in some games, but it ignores the belief state and information-gathering actions. |
Limitations and Optimization in Game Search Algorithms | Game search algorithms have limitations and approximations due to the complexity of calculating optimal decisions. Metareasoning can optimize the allocation of computational resources in game search algorithms. Higher-level planning and abstract representations can improve the performance of game search algorithms. Humans still excel in games of imperfect information. |
Historical Milestones | Historical milestones include early discussions by Babbage, the development of the minimax theorem by Zermelo, breakthroughs by Shannon, McCarthy, and Samuel, and more recent achievements by AlphaGo, AlphaZero, and MuZero. |
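The minimax search described in the table above can be sketched as follows. The nested-dict game tree is an illustrative assumption, not the book's representation: internal nodes map move names to subtrees, and leaves are terminal utilities from MAX's point of view.

```python
# Minimal minimax sketch over a nested-dict game tree (illustrative only).
def minimax(node, maximizing=True):
    if not isinstance(node, dict):          # leaf: terminal utility for MAX
        return node
    values = (minimax(child, not maximizing) for child in node.values())
    return max(values) if maximizing else min(values)

# A 2-ply example: MAX chooses a move, then MIN replies.
tree = {"a": {"a1": 3, "a2": 12}, "b": {"b1": 2, "b2": 4}}
best = minimax(tree)   # MIN yields 3 under "a" and 2 under "b"; MAX picks 3
```

MAX prefers branch "a" because MIN can hold branch "b" to a utility of 2.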
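Alpha-beta pruning, summarized in the table above, can be sketched on the same nested-dict representation (still an illustrative assumption). A branch is abandoned as soon as its value can no longer affect the decision at an ancestor.

```python
# Alpha-beta pruning sketch; leaves are utilities from MAX's point of view.
def alphabeta(node, alpha=float("-inf"), beta=float("inf"), maximizing=True):
    if not isinstance(node, dict):
        return node
    if maximizing:
        value = float("-inf")
        for child in node.values():
            value = max(value, alphabeta(child, alpha, beta, False))
            alpha = max(alpha, value)
            if alpha >= beta:       # MIN will never let play reach here
                break               # prune the remaining children
        return value
    value = float("inf")
    for child in node.values():
        value = min(value, alphabeta(child, alpha, beta, True))
        beta = min(beta, value)
        if alpha >= beta:           # MAX already has a better option elsewhere
            break
    return value

tree = {"a": {"a1": 3, "a2": 12}, "b": {"b1": 2, "b2": 4}}
```

On this tree, after seeing `b1 = 2` the second child of "b" is never evaluated, because MAX already has 3 guaranteed from "a" — exactly the table's point that move ordering determines how much gets pruned.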
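The cutoff test from the Type A/Type B discussion can be sketched as a depth-limited variant: below a fixed depth limit the recursion stops and a heuristic evaluation is returned instead of searching to terminal states. The representation and the trivial default evaluation are illustrative assumptions.

```python
# Depth-limited heuristic minimax sketch: the cutoff test replaces the
# terminal test once the depth limit is reached.
def h_minimax(node, depth, limit, maximizing=True, evaluate=lambda n: 0.0):
    if not isinstance(node, dict):      # true terminal state
        return node
    if depth >= limit:                  # cutoff fires: estimate, don't search
        return evaluate(node)
    values = (h_minimax(c, depth + 1, limit, not maximizing, evaluate)
              for c in node.values())
    return max(values) if maximizing else min(values)

tree = {"a": {"a1": 3, "a2": 12}, "b": {"b1": 2, "b2": 4}}
```

With `limit=2` the leaves are reached and the result matches full minimax; with `limit=0` the placeholder evaluation is returned at the root.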
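A weighted linear evaluation function of the kind described above might be sketched like this, using the classic chess material weights (pawn = 1, knight = bishop = 3, rook = 5, queen = 9). The feature format and the logistic squashing into the 0-to-1 range are illustrative assumptions.

```python
import math

# Classic material weights; real programs add many more features.
WEIGHTS = {"pawn": 1, "knight": 3, "bishop": 3, "rook": 5, "queen": 9}

def material_eval(counts):
    """counts: {piece: (our_count, their_count)} -> estimated win chance in [0, 1]."""
    score = sum(w * (ours - theirs)
                for piece, w in WEIGHTS.items()
                for ours, theirs in [counts.get(piece, (0, 0))])
    # Squash the raw material balance into (0, 1): 0.5 means a dead-even game.
    return 1 / (1 + math.exp(-0.5 * score))

even = {p: (1, 1) for p in WEIGHTS}
up_a_rook = dict(even, rook=(2, 1))   # one extra rook for our side
```

An even position evaluates to exactly 0.5, while being a rook up pushes the estimate well above it — consistent with the table's point that the value should correlate with the chance of winning.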
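The UCT selection score mentioned in the MCTS row is the UCB1 formula: average utility plus an exploration bonus that shrinks as a node accumulates playouts. Using C = sqrt(2) as the default is a common convention (an assumption here; in practice C is tuned per game).

```python
import math

def uct_score(total_utility, playouts, parent_playouts, c=math.sqrt(2)):
    """UCB1 score used by UCT to pick which child to descend into."""
    if playouts == 0:
        return float("inf")             # unvisited children are tried first
    exploitation = total_utility / playouts
    exploration = c * math.sqrt(math.log(parent_playouts) / playouts)
    return exploitation + exploration
```

During the selection step, the child with the highest score is followed; high average utility (exploitation) and low visit count (exploration) both raise a child's score.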
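The expectiminimax value from the stochastic-games row can be sketched as minimax extended with chance nodes that average over outcomes weighted by probability. The nested-tuple tree representation is an illustrative assumption: `("max"|"min", [children])` or `("chance", [(prob, child)])`, with numeric leaves as utilities.

```python
# Expectiminimax sketch: chance nodes take the probability-weighted average.
def expectiminimax(node):
    if not isinstance(node, tuple):     # leaf: terminal utility
        return node
    kind, children = node
    if kind == "max":
        return max(expectiminimax(c) for c in children)
    if kind == "min":
        return min(expectiminimax(c) for c in children)
    # chance node: expected value over the possible outcomes
    return sum(p * expectiminimax(c) for p, c in children)

# MAX chooses between a sure 3 and a fair coin flip between 0 and 10 (EV = 5).
tree = ("max", [3, ("chance", [(0.5, 0), (0.5, 10)])])
```

Here the gamble's expected value of 5 beats the sure 3, so expectiminimax returns 5.0 — and, as the table notes, this averaging is what makes deep lookahead so expensive in games of chance.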
Arabic Term | English Translation |
---|---|
بيئات تنافسية | Competitive Environments |
مشاكل البحث المعادية | Adversarial Search Problems |
سيناريوهات متعددة الوكلاء | Multi-Agent Scenarios |
خوارزمية البحث Minimax | Minimax Search Algorithm |
التقليم | Pruning |
وظيفة التقييم | Evaluation Function |
حالات الإيقاف | Stopping States |
الفرصة والمعلومات غير الكاملة | Chance and Imperfect Information |
حالات اللعبة | Game States |
الحركات القانونية | Legal Moves |
نماذج التحول | Transition Models |
اختبارات النهاية | Terminal Tests |
وظائف الفائدة | Utility Functions |
شجرة اللعبة | Game Tree |
شجرة البحث | Search Tree |
تعقيد الوقت والمساحة | Time and Space Complexity |
التقليم ألفا-بيتا | Alpha-Beta Pruning |
مخططات ترتيب الحركات | Move Ordering Schemes |
جداول التبديل | Transposition Tables |
اختبارات الانقطاع | Cutoff Tests |
وظائف التقييم الاستدلالي | Heuristic Evaluation Functions |
تأثير الأفق | Horizon Effect |
البحث بالشعاع | Beam Search |
البحث الهادئ | Quiescence Search |
التمديدات المفردة | Singular Extensions |
التقليم التقدمي | Forward Pruning |
بحث شجرة مونتي كارلو | Monte Carlo Tree Search (MCTS) |
UCT (الحد الأعلى للثقة في الشجرة) | UCT (Upper Confidence Bound for Trees) |
الألعاب العشوائية | Stochastic Games |
الطاولة | Backgammon |
قيمة Expectiminimax | Expectiminimax Value |
الألعاب المحددة القابلة للملاحظة جزئيًا | Deterministic Partially Observable Games |
كريجسبيل | Kriegspiel |
خوارزمية البحث AND-OR | AND-OR Search Algorithm |
خوارزمية الحالة المعتقدة التدريجية | Incremental Belief-State Algorithm |
الاستبصار | Clairvoyance |
ما وراء الاستدلال | Metareasoning |
Summary of the chapter in the form of points