GridWorld and ObjectWorld are both tabular environments: their states are discrete and finite, so we can easily write down the feature matrix by listing every possible state.
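To make "listing all possible states" concrete, here is a minimal sketch, assuming a square GridWorld with one-hot (identity) features. The names `one_hot_feature_matrix` and `state_index` are hypothetical helpers for illustration only, not functions from any particular IRL codebase.

```python
def state_index(row, col, grid_size):
    """Map a (row, col) grid cell to a flat state index (assumed row-major order)."""
    return row * grid_size + col


def one_hot_feature_matrix(grid_size):
    """Build the feature matrix by enumerating every state of the grid.

    Row i is the feature vector of state i. With one-hot features the
    result is simply the identity matrix of size n_states x n_states.
    """
    n_states = grid_size * grid_size
    return [
        [1.0 if i == j else 0.0 for j in range(n_states)]
        for i in range(n_states)
    ]


# A 5x5 grid has 25 states, so the matrix is 25 x 25.
features = one_hot_feature_matrix(5)
print(len(features), len(features[0]))
```

This only works because the number of states is small and known in advance: the matrix has one row per state.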
However, in more complicated non-tabular environments (such as a Super Mario game), it is impossible to build the feature matrix by explicitly listing all possible states, since the states are continuous (e.g. any frame of the game at time t) and there are infinitely many of them.
So, how can inverse reinforcement learning be implemented for a non-tabular environment like Super Mario? Does anyone have any ideas about this?