Sarah111-AHM / Semsmah


Chapter 6 — Fundamentals of Data Science and Artificial Intelligence #51

Open Sarah111-AHM opened 1 year ago

Sarah111-AHM commented 1 year ago

A point-form summary of the chapter

If the state is terminal, its minimax value is its utility value.

If it is MAX's turn to move, the minimax value is the maximum of the minimax values of its successor states.

If it is MIN's turn to move, the minimax value is the minimum of the minimax values of its successor states.
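The three cases above can be sketched in Python over a toy game tree. The tree, node names, and leaf utilities below are made up purely for illustration; leaves hold MAX's utility values directly:

```python
# Hypothetical two-ply game tree: MAX moves at "A", MIN at "B" and "C";
# integer leaves are terminal utilities for MAX.
TREE = {
    "A": ["B", "C"],
    "B": [3, 12],
    "C": [2, 4],
}

def minimax(node, is_max):
    """Minimax value, following the three cases above."""
    if isinstance(node, int):          # terminal state: utility value
        return node
    values = [minimax(c, not is_max) for c in TREE[node]]
    return max(values) if is_max else min(values)

print(minimax("A", True))  # MIN yields 3 under B and 2 under C; MAX picks 3
```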

For terminal states, EVAL(s, p) should equal UTILITY(s, p).

For nonterminal states, the evaluation should fall between the utility values of a loss and a win: UTILITY(loss, p) ≤ EVAL(s, p) ≤ UTILITY(win, p).

Evaluation functions typically calculate features of the state, such as the number of pawns, queens, etc., to categorize states based on their feature values.
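A minimal sketch of such a feature-based evaluation, assuming a weighted linear combination of material differences (the piece weights are the conventional chess material values; the feature encoding is illustrative, not from the chapter):

```python
# Conventional material weights per piece type (illustrative).
WEIGHTS = {"pawn": 1, "knight": 3, "bishop": 3, "rook": 5, "queen": 9}

def evaluate(features):
    """EVAL(s, p) as a weighted sum of feature values, where each
    feature is the player's piece count minus the opponent's."""
    return sum(WEIGHTS[f] * diff for f, diff in features.items())

# One pawn up but a rook down relative to the opponent:
print(evaluate({"pawn": 1, "rook": -1}))  # -> -4
```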

Sarah111-AHM commented 1 year ago

funny summary of the summary 😄

Sarah111-AHM commented 1 year ago

Summary of the information in the form of a table ;)

| Topic | Summary |
| --- | --- |
| Competitive Environments | This chapter focuses on competitive environments and adversarial search problems in multi-agent scenarios. Games such as chess, Go, and poker are used as examples of competitive environments. Physical games have more complex descriptions and rules than board games, but they have received less attention in the AI community. |
| Approaches to Multi-Agent Environments in Game Theory | Three approaches to dealing with multi-agent environments in game theory are discussed. |
| Minimax Search Algorithm | The chapter covers the minimax search algorithm, which is used to determine optimal moves in adversarial game-tree search. Pruning is introduced as a technique to make the search more efficient by disregarding parts of the search tree that do not affect the optimal move. Evaluating stopping states involves estimating who is winning, either with a heuristic evaluation function or by averaging the outcomes of game simulations. |
| Games with Chance and Imperfect Information | Games involving chance and imperfect information are discussed. The chapter explains the concepts of game states, legal moves, transition models, terminal tests, and utility functions. The game tree and search tree concepts are introduced, along with their relevance to determining optimal moves and strategies. |
| Alpha-Beta Pruning | The minimax search algorithm is explained in detail, including the functions MINIMAX-SEARCH, MAX-VALUE, and MIN-VALUE, and its time and space complexity are discussed. Alpha-beta pruning is introduced as a technique to reduce the number of states evaluated by the minimax algorithm. Its effectiveness depends on the order in which states are examined; move ordering schemes and transposition tables are discussed as ways to improve it. |
| Type A and Type B Strategies | Type A and Type B strategies are proposed to address the complexity of games with a large number of states. Cutoff tests and heuristic evaluation functions are introduced to handle limited computation time in alpha-beta pruning. The evaluation function should estimate the expected utility of a state and be strongly correlated with the chances of winning. |
| Evaluation Functions | Evaluation functions are used in game-playing programs to estimate the likelihood of winning, losing, or drawing in a given game state. They combine weighted features derived from human experience or machine learning techniques. The weights in the evaluation function should be normalized to fall within the range of 0 to 1, representing a loss to a win, respectively. Nonlinear combinations of features are often used to capture feature interdependencies and contextual factors. |
| Enhancements to Alpha-Beta Search | The ALPHA-BETA-SEARCH algorithm is modified to incorporate the evaluation function and cutoffs based on a fixed depth limit or iterative deepening. Quiescence search ensures the evaluation function is applied only to quiescent positions, with no pending moves that would significantly change the evaluation. The horizon effect refers to delaying tactics that temporarily push strong opponent moves beyond the search horizon. Singular extensions and forward pruning are techniques to mitigate this effect. Beam search is a form of forward pruning in which only the n best moves are considered at each ply, at the risk of pruning away the best move. PROBCUT is a forward-pruning algorithm that estimates the probability of pruning the best move based on shallow searches and past experience. Late move reduction reduces the search depth for moves appearing later in the move ordering, saving computation time. Combining these techniques can yield chess programs that play at expert or even grandmaster level. |
| Table Lookup and Search-Based Methods for Chess | Table lookup is often used for the opening and endgame phases of chess, while search-based methods are used for the midgame. Computer analysis of endgames involves retrograde minimax search to generate extensive lookup tables for optimal endgame play. |
| Monte Carlo Tree Search (MCTS) | Monte Carlo tree search (MCTS) is an alternative to alpha-beta search, commonly used in Go programs. It estimates state values through simulations and balances exploration and exploitation using selection policies such as UCT. The UCT formula scores each node by combining its average utility, its number of playouts, and an exploration term. The MCTS algorithm consists of four steps: selection, expansion, simulation, and back-propagation. MCTS is particularly useful for games with high branching factors or where defining a good evaluation function is difficult, and it can be combined with evaluation functions or alpha-beta search for improved performance. MCTS can be applied to new games without predefined evaluation functions by relying solely on the game rules. Its limitations include potentially overlooking critical moves and requiring longer playouts to verify obvious wins. Monte Carlo search resembles reinforcement learning: it simulates moves, observes outcomes, and uses them to identify good moves. |
| Stochastic Games and Backgammon | Stochastic games incorporate a random element, such as dice rolls, to model unpredictability. Backgammon is an example of a stochastic game that combines luck and skill, with players moving their pieces according to dice rolls. Constructing a game tree for backgammon requires chance nodes representing the dice rolls and their probabilities. The expectiminimax value is used to make decisions in games with chance nodes, computing the expected value by averaging over all possible outcomes. Evaluation functions in games of chance, like backgammon, must consider the relationship between values and probabilities. The computational cost of expectiminimax makes it hard to search far ahead in most games of chance. |
| Partially Observable Games and Kriegspiel | Partially observable games introduce additional challenges due to partial observability, similar to the "fog of war" in real wars. Deterministic partially observable games involve uncertainty about the opponent's choices rather than random events. Kriegspiel is a partially observable variant of chess in which each player can see only their own pieces, and a referee announces the outcome of moves. The belief state in Kriegspiel represents all logically possible board states given the history of percepts so far. Solving Kriegspiel involves updating the belief state based on the predictable outcome of the player's own moves and the unpredictable outcome of the opponent's replies. The AND-OR search algorithm and the incremental belief-state algorithm are used to find winning strategies in Kriegspiel. |
| Card Games with Random Dealing | Card games with random dealing, like bridge and poker, can be approached by treating the deal as a chance node at the start of the game and applying expectiminimax. Averaging over clairvoyance, which assumes full observability after the actual deal, can be effective in some games, but it ignores the belief state and information-gathering actions. |
| Limitations and Optimization in Game Search Algorithms | Game search algorithms rely on limitations and approximations because calculating optimal decisions is intractable. Metareasoning can optimize the allocation of computational resources in game search. Higher-level planning and abstract representations can further improve performance. Humans still excel at games of imperfect information. |
| Historical Milestones | Historical milestones include early discussions by Babbage, development of the minimax algorithm by Zermelo, breakthroughs by Shannon, McCarthy, and Samuel, and more recent achievements by AlphaGo, AlphaZero, and MuZero. |
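The alpha-beta cutoff described above can be sketched over the same kind of toy dict-based game tree used for plain minimax (the tree and utilities are made up for illustration):

```python
# Hypothetical game tree: MAX moves at "A", MIN at "B" and "C";
# integer leaves are terminal utilities for MAX.
TREE = {"A": ["B", "C"], "B": [3, 12], "C": [2, 4]}

def alphabeta(node, is_max, alpha=float("-inf"), beta=float("inf")):
    """Minimax with alpha-beta pruning: stop expanding successors
    once the current branch can no longer affect the optimal move."""
    if isinstance(node, int):          # terminal state
        return node
    if is_max:
        value = float("-inf")
        for child in TREE[node]:
            value = max(value, alphabeta(child, False, alpha, beta))
            alpha = max(alpha, value)
            if alpha >= beta:          # beta cutoff
                break
        return value
    value = float("inf")
    for child in TREE[node]:
        value = min(value, alphabeta(child, True, alpha, beta))
        beta = min(beta, value)
        if alpha >= beta:              # alpha cutoff
            break
    return value

print(alphabeta("A", True))  # -> 3 (the leaf 4 under "C" is never examined)
```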
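The UCT selection score mentioned in the MCTS row can be sketched as the standard UCB1 formula; the exploration constant `c = 1.4` (roughly √2) and the playout counts below are illustrative assumptions, not values from the chapter:

```python
import math

def uct_score(total_utility, playouts, parent_playouts, c=1.4):
    """UCB1-style score used at the MCTS selection step: average
    utility (exploitation) plus an exploration bonus that shrinks
    as a node accumulates playouts."""
    exploitation = total_utility / playouts
    exploration = c * math.sqrt(math.log(parent_playouts) / playouts)
    return exploitation + exploration

# A node with a single playout gets a large exploration bonus, so it
# can outrank a well-explored sibling with a better average:
print(uct_score(0.5, 1, 100) > uct_score(60, 79, 100))  # -> True
```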
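The expectiminimax value used for stochastic games like backgammon can be sketched by adding chance nodes that average over their outcomes; the tiny tree and its probabilities below are invented for illustration:

```python
def expectiminimax(node):
    """Nodes are tuples: ("leaf", value), ("max", children),
    ("min", children), or ("chance", [(prob, child), ...])."""
    kind = node[0]
    if kind == "leaf":
        return node[1]
    if kind == "max":
        return max(expectiminimax(c) for c in node[1])
    if kind == "min":
        return min(expectiminimax(c) for c in node[1])
    # Chance node: probability-weighted average over outcomes.
    return sum(p * expectiminimax(c) for p, c in node[1])

tree = ("max", [
    ("chance", [(0.5, ("leaf", 2)), (0.5, ("leaf", 4))]),   # expectation 3.0
    ("chance", [(0.9, ("leaf", 1)), (0.1, ("leaf", 11))]),  # expectation 2.0
])
print(expectiminimax(tree))  # -> 3.0
```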
Sarah111-AHM commented 1 year ago

Important terms and their English translations ;)

| Arabic Term | English Translation |
| --- | --- |
| محيطات تنافسية | Competitive Environments |
| مشاكل البحث المعادية | Adversarial Search Problems |
| سيناريوهات متعددة الوكلاء | Multi-Agent Scenarios |
| خوارزمية البحث Minimax | Minimax Search Algorithm |
| التقليم | Pruning |
| وظيفة التقييم | Evaluation Function |
| حالات الإيقاف | Stopping States |
| الفرصة والمعلومات غير الكاملة | Chance and Imperfect Information |
| حالات اللعبة | Game States |
| الحركات القانونية | Legal Moves |
| نماذج التحول | Transition Models |
| اختبارات النهاية | Terminal Tests |
| وظائف الفائدة | Utility Functions |
| شجرة اللعبة | Game Tree |
| شجرة البحث | Search Tree |
| تعقيد الوقت والمساحة | Time and Space Complexity |
| التقليم ألفا-بيتا | Alpha-Beta Pruning |
| مخططات ترتيب الحركات | Move Ordering Schemes |
| جداول التبديل | Transposition Tables |
| اختبارات الانقطاع | Cutoff Tests |
| وظائف التقييم الاستدلالي | Heuristic Evaluation Functions |
| تأثير الأفق | Horizon Effect |
| البحث بالشعاع | Beam Search |
| البحث الهادئ | Quiescence Search |
| تمديدات غير عادية | Singular Extensions |
| التقليم التقدمي | Forward Pruning |
| بحث شجرة مونتي كارلو | Monte Carlo Tree Search (MCTS) |
| UCT (الحد الأعلى للثقة في الشجرة) | UCT (Upper Confidence Bound for Trees) |
| الألعاب العشوائية | Stochastic Games |
| الطاولة | Backgammon |
| قيمة توقعات الحد الأعلى | Expectiminimax Value |
| الألعاب القابلة للمشاهدة جزئيًا المحددة | Deterministic Partially Observable Games |
| كريجسبيل | Kriegspiel |
| خوارزمية البحث AND-OR | AND-OR Search Algorithm |
| خوارزمية الحالة المعتقدة التدريجية | Incremental Belief-State Algorithm |
| الوعي النبهاني | Clairvoyance |
| المنطق العليا | Metareasoning |