General graph simplification

In general there's not much purpose to having every node we may have defined in the logic, particularly when some nodes serve only for movement time calculations. In AV2, the original dream was to have the solver check every place Indra could deploy the drone and have the drone still be able to attract the portal. This is a bit too complex; we'd have to import the coordinates to the solver or add generated logic. If we remove that need and just create ad-hoc rules as necessary (e.g. Indra has to be in this area to move the portal over here), then we don't need to keep all the spots around. But they are useful for the movement calculations (which are done in python).

What we could do is programmatically build a condensed graph, where we select the nodes we care about from an area (which will be any with a Location, Action, or an Exit to another area), and for each such node, determine paths to each other node (that isn't through another one). In AV2's Lake Amagi > Main Area, for example, we'd remove around 20 nodes and leave around 15.
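As a rough sketch of the condensing idea, considering only base movement times (no movement sets or in-area exits yet): run Dijkstra's from each spot of interest and stop expanding whenever another spot of interest is reached. The `condense` function, the edge-dict shape, and the spot names are hypothetical illustrations, not the project's actual data model.

```python
import heapq

def condense(edges, interesting):
    """edges: dict[src] -> list of (dst, time); interesting: set of kept spots.

    Returns dict[(src, dst)] -> shortest time over paths whose interior
    spots are all uninteresting (hypothetical shapes, for illustration).
    """
    condensed = {}
    for root in interesting:
        dist = {root: 0.0}
        heap = [(0.0, root)]
        while heap:
            d, node = heapq.heappop(heap)
            if d > dist.get(node, float("inf")):
                continue  # stale queue entry
            if node != root and node in interesting:
                condensed[(root, node)] = d
                continue  # never path through another kept spot
            for dst, t in edges.get(node, ()):
                nd = d + t
                if nd < dist.get(dst, float("inf")):
                    dist[dst] = nd
                    heapq.heappush(heap, (nd, dst))
    return condensed

# Toy area: A, B, C are spots of interest; m1..m3 exist only for movement.
edges = {
    "A": [("m1", 1.0)],
    "m1": [("m2", 1.0), ("B", 5.0)],
    "m2": [("B", 1.0)],
    "B": [("m3", 2.0)],
    "m3": [("C", 2.0)],
}
result = condense(edges, {"A", "B", "C"})
```

Here A reaches B in 3.0 via m1 and m2, B reaches C in 4.0, and there is no A -> C edge because that path goes through B.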

What would be a little tricky about this are in-area exits; we'd have to AND the requirements together if a path went through multiple such exits, and if we have multiple parallel exits, we could see a combinatorial explosion of exits. Plus the codebase is not equipped to natively understand the requirements. Although the Rust compiler can do a great job simplifying everything if we generate these exits through the python Compiler, python will be much slower at the shortest path algorithms. On the other hand, we'd need to modify the can_access interface a little if we perform the condensing in Rust, since as it is currently, each condensed edge would need a reference to all the actual Exit objects it covers.

Here's the general algorithm idea:

For each area:

Mark spots of interest: any spot with a Location, Action, Exit to another Area, or keep: true.
For each spot, build a tree of movements to other spots in the area (basically Dijkstra's). Each edge in this tree should have:
1. a base movement time, if it has one
2. a list of pairs of (movement set, movement time), where the movement time is smaller than the time for any subset of that movement set (including the base movement).
3. a list of pairs of (requirement, movement time), corresponding to Exits that move within the same Area, where movement time is smaller than the base movement. If an Exit's destination is already a node in the tree that isn't the child of the Exit's source, a) the time must be less than the base movement time from the source to the destination, and b) this will result in a node having multiple parents.
For every spot of interest in this tree that is not a descendant of a spot of interest besides the root, generate the powerset of movement requirements in the list of edges (excepting the base movement), and for each entry in that powerset, record as a Condensed Edge the combination of requirements and the total sum of the minimum corresponding times for each edge in the tree. This requirement to exclude descendant spots of interest limits the potential combinatorial explosion of edges; we need to track all area edges in the tree anyway to be sure we cover the real shortest path.

For example, if an exit requires an item Ledge_Grab and there is no base movement for the connection, and that edge is in the path to the destination node, then one Condensed Edge will have a requirement of Ledge_Grab and use the base movement for any other edges in the path.
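A minimal sketch of the powerset step for a single path, assuming each requirement is just an item name (real requirements are rules, which this does not model). Each edge on the path is a hypothetical pair of (base time or None, list of (requirement, time)); `condensed_edges` and the `Hook` item are made up for illustration.

```python
from itertools import combinations

def condensed_edges(path):
    """path: list of (base_time_or_None, [(requirement, time), ...]).

    Returns dict[frozenset of requirements] -> total time, skipping
    combinations that leave some edge on the path impassable.
    """
    reqs = sorted({r for _, options in path for r, _ in options})
    results = {}
    for k in range(len(reqs) + 1):
        for subset in combinations(reqs, k):
            held = frozenset(subset)
            total = 0.0
            for base, options in path:
                # Times usable on this edge with the held requirements.
                times = [t for r, t in options if r in held]
                if base is not None:
                    times.append(base)
                if not times:
                    break  # edge cannot be crossed with this combination
                total += min(times)
            else:
                results[held] = total
    return results

# The middle edge has no base movement and needs Ledge_Grab.
path = [
    (2.0, []),
    (None, [("Ledge_Grab", 1.5)]),
    (3.0, [("Hook", 1.0)]),
]
edges_for_path = condensed_edges(path)
```

With this input only two combinations survive: Ledge_Grab alone (using base movement on the other edges) and Ledge_Grab plus Hook. The sketch keeps every valid combination, including ones that do not improve the time, so a real version would likely prune those.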

Using AV2's Amagi > Main Area as an example, in the yaml right now it has:

29 spots, of which 12-14 would be considered of interest (the 1-3 element comes from a central set of nodes that might be useful for deploying the drone). My current drawing has another 8 spots not added to the yaml, of which 3 would be considered of interest.
71 local movement connections (directed)
8 in-area exits (directed)

If we keep all 3 of the central spots (and ignore the spots not added yet), we would generate 50 pairs of these 14 points to generate Condensed Edges for, and perhaps up to 4 combinations of requirements for each. Obviously if we have a deeper understanding of the rule logic in question we might be able to reduce it further (rather than combine, say, $grab or $climb and $grab or $climb or $hook), or we can let the Rust compiler handle that.

This adds more structure to the graph, but it's worth comparing against the number of iterations needed to get through it.

Before condensing:

it would take about 29 iterations to generate all possible positions, without doing any actions or accessing locations, and without counting any states that leave the area. Any one action or location will multiply the number of iterations and possible positions, and we have: the Save (2x), the key combo (2x), the flask (2x but after the combo only), and global action Deploy Drone (29x). Without leaving the area once we enter it, we have 3364+ unique states to reach, and the possible number of states we could be in when we enter is even more enormous (not to mention the myriad ways to enter). (And then if we add a side effect to Recall to remember the drone's last position, that's another 29x.)
we check accessibility of each exit (8) once and look up whether we can move in the movement table for every movement edge (71) once.
we would generate 1 state for every edge we can access, which would be 79, assuming we are running Dijkstra's in a single thread (since we are not, it will be more), and most of these are discarded as duplicates with equal or worse times.

After condensing:
it will take around 14 iterations to generate all possible positions, again without doing any actions or accessing locations. We'd still have the same spot actions and locations, but the global actions diminish, leaving us with only 784 before we consider entry states, the flask, and the Recall changes. Effectively the cost is quadratic in the number of spots in the area, and entry states are going to be polynomial if not exponential in the number of spots as well. As the main bottleneck right now is storage space for all the states we haven't processed, this will be a huge win.
We'd check every edge once, potentially up to 200 if we had 4 combinations of requirements for our 50 pairs. Or we'd precalculate each individual requirement like we do with MovementState.
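The state and edge counts above can be reproduced with a few lines of arithmetic. This assumes the in-area state is just Indra's position times the drone's position times the two binary flags (the Save and the key combo); the flask, entry states, and the Recall side effect are left out, as in the text. The helper name is hypothetical.

```python
def reachable_states(spots, binary_flags=2):
    # Indra's position x deployed drone position x each binary flag
    return spots * spots * 2 ** binary_flags

before = reachable_states(29)   # every spot kept
after = reachable_states(14)    # only spots of interest kept

edge_checks_before = 8 + 71     # in-area exits + local movement connections
edge_checks_after = 50 * 4      # 50 pairs, up to 4 requirement combinations

print(before, after, round(before / after, 2))  # prints: 3364 784 4.29
```

So the edge checks per pass go up from 79 to at most 200, while the number of states to store drops by a bit more than 4x.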

Also, if we choose to build these condensed edges in Rust instead of generating code for them, then we could actually drop even more spots based on locations we know we can skip... probably.

Now, arguably, we could go to the logical extreme here and apply this graph reduction to the graph as a whole, rather than area by area. This would essentially be precalculating the shortest paths between all the points of interest. The downside would be that we can't perform this full reduction for a map randomizer until after we've determined the game is beatable. For that, though, greedy search should be fast enough, and if not, we can make an access-graph-style beatability checker.

[...]

[...] If an Exit's destination is already a node in the tree that isn't the child of the Exit's source, a) the time must be less than the base movement time from the source to the destination, and b) this will result in a node having multiple parents.

I guess this is true for movements as well. That kind of complicates it further. We can avoid having to check about multiple parents if we just do a depth-first-search on the DAG. We might generate some useless edges, though, if we're not checking mixes of possibilities, but the goal is reducing the number of spots anyway, not the number of edges.

I tagged the wrong issue with the main implementation commit... 😅

The current issue with the BFS approach is correctly choosing when to stop. We need every possible combination of requirements for travelling between nodes A and B represented in its edges, where the requirements are modelled using the movement state (i.e. the movements' requirements) and the exit ids. We can always check whenever we find an edge between A and B whether its requirements are a superset of any we've already found (in which case if it's not faster then we can discard it), but the priority queue still has to terminate to know that we've finished. And so, we need to stop adding new items to the queue by some metric:

limiting each potential edge (base movement, restricted movement, and exit) to once each in the whole DAG rooted at A makes it finish quickly but drops a lot of necessary edges (e.g. A -> C can find a path AB1-BC1 but then will exclude AB2-BC1).
limiting each potential edge to once per path seems like the right idea but results in very snake-y routes with base movement and as many exits as possible, a long runtime, and an explosion of memory usage.

I imagine the second option would be vastly improved by having a max distance per pair of nodes. Either we have that from the base movement + always-exits path, or we have some other minimum requirements... we could probably check for a requirements subset again. But in iterating on the queue, we haven't picked a destination until we reach it, so we can't look at our path and say "oh, there's already a better route to B with fewer requirements" unless we can say "oh, there's already a better route to every other node with fewer requirements".

Changing the exit ids to access ids might help a little with the deduplication but doesn't solve the underlying problem... a BFS needs to properly identify when it reaches a state we've already reached. Perhaps we have to record the requirements of each path as we retrieve them (i.e. a hashset of (dest, reqs)), rather than track the individual edges for usage, and as we extend the path we have to check against those recordings?
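One way to picture the "record (dest, reqs) as we retrieve them" idea is a priority-queue search that discards a popped path when an already-recorded path to the same node needs a subset of its requirements. This is only a sketch under simplifying assumptions: requirements are plain item names rather than movement states and exit ids, and `requirement_paths` and the edge shape are hypothetical.

```python
import heapq
from itertools import count

def requirement_paths(edges, root):
    """edges: dict[src] -> list of (dst, time, requirement_or_None).

    Returns dict[node] -> list of (frozenset of requirements, time),
    where no recorded entry is dominated by an earlier one.
    """
    found = {}
    tie = count()  # tiebreaker so the heap never compares nodes or sets
    heap = [(0.0, next(tie), root, frozenset())]
    while heap:
        time, _, node, reqs = heapq.heappop(heap)
        labels = found.setdefault(node, [])
        # Entries pop in time order, so any recorded entry is at least as
        # fast; if it also needs a subset of our requirements, we lose.
        if any(prev <= reqs for prev, _ in labels):
            continue
        labels.append((reqs, time))
        for dst, t, req in edges.get(node, ()):
            new_reqs = reqs if req is None else reqs | {req}
            heapq.heappush(heap, (time + t, next(tie), dst, new_reqs))
    return found

edges = {
    "A": [("B", 5.0, None), ("B", 2.0, "grab")],
    "B": [("C", 4.0, None), ("C", 1.0, "hook")],
}
found = requirement_paths(edges, "A")
```

The queue drains on its own because each node can hold only finitely many non-dominated requirement sets, so this avoids needing a separate per-edge usage limit. In this example C ends up with four entries: no requirements (9.0), grab (6.0), hook (6.0), and both (3.0).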

Finally working, and our results are:

Condensed into 435 total sources, 1631 total edges, 249 interesting spots (from 466), 677 interesting edges

Zannick / logic-graph

General graph simplification #71