crashkonijn / GOAP

A multi-threaded GOAP system for Unity
Apache License 2.0
1.16k stars 127 forks

Feature: Changes to Heuristics Calculation #170

Closed kmbarr closed 4 months ago

kmbarr commented 4 months ago

I have done my own incomplete implementation of GOAP in the past, but recently started digging into this library. It is a great project and a good head start.

In digging through the resolver code, I came across the heuristic calculation, which currently uses the crow-flies distance between actions. Classic GOAP solutions have used the number of remaining unmet conditions as the heuristic, to push the pathfinding algorithm toward the best solution rather than the closest one. Logically, distance should be added to the cost too, but balancing base cost against distance cost so that neither dominates takes a bit of work.

[!NOTE] Edit: In further review of the code, it does not appear distance is currently used in the cost (G), and is only used in the heuristic (H), essentially making it the tie-breaker between equal-cost paths.
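To make the two heuristic styles concrete, here is a rough Python sketch (not the package's C# code; `Node`, `dist`, and the field names are made up for the example):

```python
import math
from dataclasses import dataclass

@dataclass
class Node:
    position: tuple          # (x, y) world position of the action's target
    unmet_conditions: list   # preconditions not yet satisfied by the plan

def dist(a, b):
    # straight-line ("crow-flies") distance between two 2D points
    return math.hypot(a[0] - b[0], a[1] - b[1])

def h_distance(node, agent_pos):
    # current behavior: distance appears only in H, so it effectively
    # acts as a tie-breaker between equal-cost paths
    return dist(node.position, agent_pos)

def h_unmet(node):
    # classic GOAP heuristic: count of remaining unmet conditions,
    # which steers A* toward plans that resolve the goal soonest
    return len(node.unmet_conditions)
```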

Related to that, and the reason I was digging through the code, I wanted to see if there was a way to dynamically alter the distance multiplier based on locomotion: walking, crawling, injured, flying, etc. There currently is not. I think it would be fairly simple to add such a multiplier to the config [with a default of 1f]. I'm not sure how much demand there is for that functionality, but it could drive things like choosing between "crawlToCover" and "standUp—>walkToCover—>crouch", where the latter becomes the better solution beyond some threshold distance.

[!NOTE] Edit: This would require a bigger change than noted above, as it appears G is only accumulated from the raw action cost and doesn't include the distance/heuristic calculation. I would envision something more like newG = currentNode.G + this.RunData.Costs[neighborIndex] + this.RunData.MovementCost[neighborIndex] * distance; (where defaulting the MovementCost to 0f would reproduce the existing behavior); then I'd still lean toward the count of unresolved conditions for the heuristic rather than using the distance for that.
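The suggested G update amounts to this (a Python sketch of the formula, not the package's job code; the parameter names are illustrative):

```python
def new_g(current_g, base_cost, movement_cost, distance):
    # Accumulated cost: parent G plus the action's base cost plus the
    # traveled distance scaled by a per-action movement multiplier.
    # A movement_cost of 0.0 reproduces the existing behavior, where
    # distance never enters G at all.
    return current_g + base_cost + movement_cost * distance
```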

I also see an issue with Actions that are not grounded to a location [no target, e.g. inventory actions]. There is code in GoapSetJobRunner that sets a transform target to the player transform for any nodes that do not already have a target; but this is the player's current transform, not the expected position at that step of the planning. It would probably make little difference if distance only figures into the heuristic, but it becomes more important if there is actually a cost per distance. A solution there would be to leave the nodes without a target at the "InvalidPosition"; there is already some handling of this in the distance calculation, although I think that needs to be improved to track the last valid position and, when the new node has a target, return the distance from there, rather than just returning 0f.
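The last-valid-position idea could look something like this (an illustrative Python sketch under the assumption that un-grounded actions carry no position; it is not the library's distance code):

```python
import math

def plan_distance(positions):
    # Walk a plan's target positions in order, skipping un-grounded
    # actions (inventory actions etc. that have no target). Distance is
    # measured from the last valid position, instead of returning 0 when
    # a positionless action sits between two grounded ones.
    total = 0.0
    last = None
    for pos in positions:
        if pos is None:
            continue  # un-grounded action: keep the last valid position
        if last is not None:
            total += math.hypot(pos[0] - last[0], pos[1] - last[1])
        last = pos
    return total
```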

crashkonijn commented 4 months ago

Hi there!

Thank you for your kind words and reaching out!

Very insightful things that you've written down and all of them are solid improvements over how things work currently.

Is there any chance you could reach out to me on Discord? Perhaps we can have a call sometime to look at implementing this 😁

crashkonijn commented 4 months ago

Hi @kmbarr,

Would you be so kind as to have a look at the PR?

It should include your suggested changes, except for the change in speed per action.

Thanks!

kmbarr commented 4 months ago

Thanks for your quick response on this. Sorry I didn't get back to you sooner. I did actually go back to working on my own implementation of GOAP, with a little new inspiration, but I have also been struggling with the best way to do the costing and heuristic.

I think the presentation I watched a few years back on GOAP that mentioned using the count of unmet conditions as the heuristic did not figure distance into the cost at all, but just a set base cost defaulting to 1, so the heuristic of the unmet-goal count would be relatively on the same scale. It does feel like distance should definitely figure in there as well, but that does create the issue of how to balance the two. It's probably overthinking it; you could zero out the heuristic entirely and A* turns into Dijkstra's.

I did want to implement some things like movement penalties, action factories that would dissolve into multiple actions when multiple targets were available, and a bit of a hybrid with a Utility AI [although that one would be pretty easy within your code, since it's about goal selection, not planning, which is left to the user to implement], which is why I went back to my own implementation. That jobs-system-based planner of yours is really sweet and not something I'm good at, so I'll probably just end up with an Awaitable. I'm still really impressed at what a great implementation you have made available for free.

crashkonijn commented 4 months ago

Thank you for your reply!

Yeah, creating the best GOAP resolver is tough, and the right choices can also depend on what would work best for your game. With this package I have to steer toward what would be best in general.

I was thinking about heuristics and cost: to decide the 'best' option you would have to resolve the full tree, but that's too heavy for most use cases.

In the current resolver I only go down a single 'tree' for a condition and don't keep track of how much resolving the other conditions would cost. One option for improvement could be to let the cost of a node be the cost of its unresolved conditions/children. Or perhaps that should be the heuristic, instead of simply using the count?

kmbarr commented 4 months ago

What I decided on [currently at least] for my own planner is:

G = cumulative cost, based on a base cost for each action plus the straight-line distance between actions, multiplied by the applicable movement cost modifier (I think similar to what you're using, with the addition of movement multipliers).

H = G / {count of nodes in the plan so far} * {count of remaining unmet conditions}

That was the best I could come up with for putting H (the projected remaining cost) on roughly the same scale as G.
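The formula above can be sketched as a one-liner (illustrative Python; the guard for an empty plan is my own assumption, since the original formula leaves that case undefined):

```python
def heuristic(g, nodes_in_plan, unmet_conditions):
    # Project the remaining cost by assuming each unmet condition will
    # cost about as much as the average cost per node so far, which
    # keeps H on roughly the same scale as G.
    if nodes_in_plan == 0:
        return 0.0  # no plan yet: fall back to a zero heuristic
    return g / nodes_in_plan * unmet_conditions
```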

crashkonijn commented 4 months ago

Hi again!

I was thinking about the resolver and distance-based cost. There's still a mistake in the update I've made:

In the current version the cost of an action is the cost of the parent + its 'base cost' + the distance to its parent; however, the distance from the player to the current node should also be added.

The downside is that this added distance cost only applies if that node is the one actually being performed; as soon as it becomes a parent node it no longer applies.
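That split can be sketched like this (an illustrative Python sketch of the idea, not the PR's code; the parameter names are made up):

```python
def f_cost(parent_g, base_cost, dist_to_parent, dist_to_agent, is_leaf):
    # G accumulates down the tree as usual, but the agent-to-node
    # distance only counts while this node is the one the agent would
    # actually perform next; once the node becomes a parent, that term
    # drops out of its children's accumulated G.
    g = parent_g + base_cost + dist_to_parent
    return g + (dist_to_agent if is_leaf else 0.0)
```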

H = G / {count of nodes in the plan so far} * {count of remaining unmet conditions}

Very interesting! I will give this a go, thanks!

crashkonijn commented 4 months ago

I just did an update to the PR!

I introduced another variable, P, which is the cost without the distance between that node and the player. This now calculates better; see the Resolve_Should_UseDistanceCorrectly test.

Now that I've implemented it and was looking at your heuristic, wouldn't the effect be the same if I simply added the distance cost to the H value, since the sort is done with F = G + H?

Anyway, very fun discussion!

kmbarr commented 4 months ago

In my own code, I'd originally written the A* to solve back-to-front, because it should generally be more efficient. But I had such a growing pile of weird workarounds in there (to deal with the distance from the agent to the first node, to avoid resolving paths that eventually turned out to be beyond the agent's range, etc.) that I flipped it to solve from the player forward. I may regret that later, but I'm hoping the heuristic will quickly deprioritize all the non-productive paths, and the simpler limit on range will eliminate many immediately. Even though the pathfinding code itself is ultimately fairly simple, the whole process is much more complex than a simple path in static 3D space.

I also struggled with what defines a unique "node". Intuitively it would be an action, but that definitely doesn't work if you want to allow duplicates of the same action in a plan, and even normally the paths leading from an action change depending on the world state before that action. I changed it a few times, usually combining the action's hash with something else, but currently I'm hashing only the resulting world state: if A* finds a cheaper solution to the same world state, it will overwrite it, even if a completely different action led there.

It admittedly has only gone through minimal testing. Intellectually I believe it's right, but intuitively I was still a bit surprised when it didn't break my planner [and generated fewer unused nodes] on an intentionally complex set of preconditions.
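Keying the closed set on the resulting world state rather than on the action looks roughly like this (an illustrative Python sketch; the state representation and function names are my own assumptions):

```python
def state_key(world_state):
    # Key nodes by the resulting world state, not by the action that
    # produced it, so two different actions reaching the same state
    # compete for one slot.
    return frozenset(world_state.items())

best_g = {}  # cheapest known G per world state

def consider(world_state, g):
    # Accept the node only if this state is new or reached more cheaply;
    # a cheaper path via a completely different action overwrites the old one.
    key = state_key(world_state)
    if key in best_g and best_g[key] <= g:
        return False  # already reached this state at equal or lower cost
    best_g[key] = g
    return True
```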

crashkonijn commented 4 months ago

To truly tell which action would be the best action, you have to resolve the whole tree until you know all the costs involved in solving all the unmet conditions. You'd have to fully resolve the future, including doing an action multiple times, and all the possible positions involved.

Another fun dilemma about positions: when doing ClosestAppleTarget = apples.Closest(this.agent) you're only using the currently closest position, which might not be the closest when evaluated from the position of an earlier action in the plan that needs it.

Anyway, when using GOAP you have to find a balance between 'perfect' resolving and simply being able to resolve something at all.

I guess making an NPC seem smart is probably easier, and more important, than it actually choosing whatever is best.

Edit: I've just updated the heuristics to be the cost of all unmet conditions. I really like how that's behaving in the Unit test 😁
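A condition-cost heuristic of that shape could be sketched as (illustrative Python; the lookup table of per-condition costs is an assumption, not the package's data structure):

```python
def h_condition_cost(unmet_conditions, cheapest_cost):
    # H = summed cheapest known cost of resolving each still-unmet
    # condition, instead of the bare count of unmet conditions.
    return sum(cheapest_cost[c] for c in unmet_conditions)
```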

crashkonijn commented 4 months ago

Closing this as it's been merged in the V3 branch!

Thanks again for pointing me in the right direction @kmbarr!