To determine which states are reachable from the current state, we use a grid of sample points which then fall into grid squares representing various states. We want to sample densely enough to capture most of the future states reachable from our current state, but not so densely that we waste computational resources. This will likely be the most computationally expensive part of the algorithm, so speed matters significantly. To manage this cost, we can create a supervised ML model that efficiently predicts how densely to sample points given our current state. The right density might be a constant, or it might depend on the current state (e.g. when we are moving fast we might need to sample more densely).
To create the training data, we pick an arbitrary state and sample very densely, which hits nearly 100% of the possible future states. We then try various sparser sampling densities and determine how sparsely we can sample while still hitting an acceptable percentage of those states. Repeating this many times yields training data mapping an input state to the density at which we should sample. The ML model interpolates between these states, telling us how densely to sample when running the real algorithm. The type of model has not yet been decided, but it can be something relatively simple such as a decision tree.
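As a rough illustration of the core measurement, the sketch below computes the coverage achieved at a given sampling density by comparing the grid squares hit against a very dense baseline. The `simulate_step` function, the grid resolution, and the 2D control space are placeholder assumptions standing in for the project's actual dynamics and discretization.

```python
# Sketch of measuring coverage for one state at one sampling density.
# `simulate_step`, the grid resolution, and the control-space bounds are
# hypothetical placeholders for the real system.
import numpy as np

GRID_RESOLUTION = 0.1  # side length of a grid square in state space (assumed)

def to_grid_square(state):
    """Map a continuous state to the discrete grid square it falls in."""
    return tuple(np.floor(np.asarray(state) / GRID_RESOLUTION).astype(int))

def reachable_squares(state, density, simulate_step):
    """Sample control inputs on a grid of the given density and collect
    the grid squares of the resulting next states."""
    # Assume a 2D control input in [-1, 1]^2; adjust to the real system.
    controls = np.linspace(-1.0, 1.0, density)
    squares = set()
    for u1 in controls:
        for u2 in controls:
            next_state = simulate_step(state, (u1, u2))
            squares.add(to_grid_square(next_state))
    return squares

def coverage(state, density, dense_density, simulate_step):
    """Fraction of the densely-sampled reachable set hit at `density`."""
    baseline = reachable_squares(state, dense_density, simulate_step)
    hit = reachable_squares(state, density, simulate_step)
    return len(hit & baseline) / max(len(baseline), 1)
```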
The first step is to create graphs of sampling density versus the percentage of reachable states hit, so we can decide what coverage is high enough without consuming too much computation. We are looking for what is known as the knee of the graph.
The second step is to decide on the percentage of reachable states that we will aim to achieve.
The third step is to generate a large dataset of state–density pairs, where each pair records the sparsest density that achieves the target percentage for that state.
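One possible way to generate those pairs, again assuming the `coverage` helper above plus a hypothetical `sample_random_state` function: for each sampled state, scan densities from sparse to dense and record the first one that reaches the target.

```python
# Sketch of generating (state, density) training pairs. `sample_random_state`,
# `simulate_step`, and the candidate density range are placeholders.
import numpy as np

def minimal_density(state, target, densities, dense_density, simulate_step):
    """Smallest candidate density whose coverage meets the target."""
    for d in densities:  # densities assumed sorted ascending
        if coverage(state, d, dense_density, simulate_step) >= target:
            return d
    return densities[-1]  # fall back to the densest option

def build_training_data(n_states, target, densities, dense_density,
                        sample_random_state, simulate_step):
    states, labels = [], []
    for _ in range(n_states):
        s = sample_random_state()
        states.append(s)
        labels.append(minimal_density(s, target, densities,
                                      dense_density, simulate_step))
    return np.asarray(states), np.asarray(labels)
```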
The fourth step is to train a supervised learning model on the data.
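A minimal training sketch for this step, assuming scikit-learn and the decision-tree option mentioned above; the hyperparameters are arbitrary placeholders, and the prediction is rounded up so the chosen density errs on the conservative side.

```python
# Sketch of fitting a simple regressor on the (state, density) pairs; a
# scikit-learn decision tree is used purely as an illustrative choice.
import numpy as np
from sklearn.tree import DecisionTreeRegressor

def train_density_model(states, densities):
    model = DecisionTreeRegressor(max_depth=8)  # depth is an arbitrary guess
    model.fit(states, densities)
    return model

def predicted_density(model, state):
    """Density to use at planning time for the given state."""
    return int(np.ceil(model.predict([state])[0]))
```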