Suppose we have an agent that can be in one of several states.
In each state, it can perform a finite number of actions. Each state may have a different set of valid actions.
We want to be able to train this agent to behave optimally.
Is this necessary?
A basic Deep Reinforcement Learning algorithm should, in principle, be able to learn to behave like a state machine on its own. However, I believe that modelling the problem explicitly as a state machine could lead to faster training. I will need to prove this.
Example: Ant bot
Suppose we have an ant which can be in one of two states:
Searching for food
Returning home with food
In each state, the ant can perform movement actions, e.g. LEFT or RIGHT.
However, if the ant has food and is hungry, it can also perform the EAT FOOD action. It can't do this if it has no food.
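The ant's state-dependent action sets can be sketched in Python as follows. This is only a minimal illustration, not code from the StateMachine repository. The names AntState, Action and valid_actions, and the hungry flag, are hypothetical and introduced here. The sketch also assumes that the ant carries food exactly when it is in the "Returning home with food" state.

```python
from __future__ import annotations

from enum import Enum, auto


class AntState(Enum):
    SEARCHING_FOR_FOOD = auto()
    RETURNING_HOME_WITH_FOOD = auto()


class Action(Enum):
    LEFT = auto()
    RIGHT = auto()
    EAT_FOOD = auto()


def valid_actions(state: AntState, hungry: bool) -> set[Action]:
    """Return the set of actions the ant may take in the given state."""
    # Movement is allowed in every state.
    actions = {Action.LEFT, Action.RIGHT}
    # Assumption: the ant has food only while returning home,
    # so EAT_FOOD is valid only in that state, and only when hungry.
    if state is AntState.RETURNING_HOME_WITH_FOOD and hungry:
        actions.add(Action.EAT_FOOD)
    return actions
```

For example, valid_actions(AntState.SEARCHING_FOR_FOOD, hungry=True) contains only LEFT and RIGHT, because the ant has no food to eat in that state.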
State Overload
It should be noted that the term State has become overloaded.
In reinforcement learning, State refers to the observations fed into the agent. From here on, we will call this the World State for clarity.
In UML State Machines, State is a succinct representation of the relevant aspects of the agent's history. For example, if an ant has left its nest, it is searching for food. If it has found food, it is returning home. From here on, we will call this the Internal State for clarity.
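The distinction between the two kinds of state could be sketched in Python as below. This is a rough illustration under stated assumptions, not the project's implementation. The WorldState fields (position, food_here, at_nest) and the function next_internal_state are hypothetical. The transition back to searching once the ant reaches the nest is also an assumption, since the text above only describes when the ant starts searching and when it starts returning.

```python
from dataclasses import dataclass
from enum import Enum, auto


class InternalState(Enum):
    """State in the UML State Machine sense: a summary of the ant's history."""
    SEARCHING_FOR_FOOD = auto()
    RETURNING_HOME_WITH_FOOD = auto()


@dataclass(frozen=True)
class WorldState:
    """State in the reinforcement learning sense: the current observations.

    All fields are hypothetical and chosen only for illustration.
    """
    position: int
    food_here: bool
    at_nest: bool


def next_internal_state(internal: InternalState, world: WorldState) -> InternalState:
    """Update the Internal State from the latest World State."""
    # Finding food switches the ant from searching to returning home.
    if internal is InternalState.SEARCHING_FOR_FOOD and world.food_here:
        return InternalState.RETURNING_HOME_WITH_FOOD
    # Assumption: reaching the nest with food sends the ant out searching again.
    if internal is InternalState.RETURNING_HOME_WITH_FOOD and world.at_nest:
        return InternalState.SEARCHING_FOR_FOOD
    # Otherwise the Internal State is unchanged.
    return internal
```

The World State changes on every step, while the Internal State changes only when a transition condition is met, which is why the two are kept as separate types here.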
Requirements:
Application of reinforcement learning to UML State Machines: https://github.com/jsphon/StateMachine
Tasks
Create an Ant Bot