GOAP - Githubissues

dt-rush commented 1 year ago

basic implementation that doesn't do list but map states (has trouble doing "get axe, get gloves, go to tree"), and also doesn't use AStar

https://github.com/jamiecollinson/go-goap

This talk is very rich:

https://www.youtube.com/watch?v=gm7K68663rA

Ideas from the talk:

periodically update world state (nearby entities, village state, personal disposition/state) rather than fetch it every time we want a plan

(for example, the state of the village crop level is unlikely to change every time we check for a new plan)
reduce plan lengths as much as possible - even cutting out one action edge (by offloading its management to another system) will decrease complexity of planning.

(for example, movement should not be a series of actions, rather a single action edge: move to ${X}, and another system will handle fetching the path, trying to follow it, etc.)
have variables that segment the valid goals and valid actions

(for example, get irrigation path will never be a valid action for a fight/flight scenario, and similarly, in fight/flight, increase crop count will never be a valid goal)
offload some work to higher-level systems

(for example in a village, if a dangerous animal comes to attack, everyone should get notified that this event is occurring. Thye can then update their current plan with a goal to solve the issue (assist or - i guess - cower, if they have trait cowardly. They don't have to start from the goal assure village success then figure out the relative weights of plant crops, bake bread, etc... the VillageThreatSystem will restrict their goals when in the underAttack state)

(for example there could be a VillageRolesSystem that assigns roles for everyone, so someone is mainly in charge of irrigation and only gets irrigation based goals, someone else is in charge of planting and harvesting, someone else is in charge of milling and baking, someone else is in charge of handicrafts, someone else is in charge of gathering wood, fishing, etc. - reducing the set of goals to potentially evaluate)
- contextual action costs
(for example walking a certain distance costs 3x if it's raining, melee attack cost scales with the melee damage output of the enemy)

dt-rush commented 1 year ago

Interesting youtube comment suggests layering plans with a broad->narrow 3 layer arch:

People watching this should check out LGOAP. It's a layered version of GOAP that allows to generate much longer plans without a massive computational overhead. Basically, the idea is generate a high level plan (build base -> build army -> attack). Then, for each actions world state in the high level plan, generate a mid level plan that satisfies those conditions. So , for build command center the mid level plan would be (build workers -> build barracks). Then, again, for each mid level actions world state, generate a low level plan that satisfies those conditions. So, for build workers it would be (build scv -> mine -> build scv -> mine -> build scv).

So you'd end up with a plan like so (when you concat all the low level plans together):

(build scv -> mine -> build scv -> mine -> build scv) -> (send worker to -> build barracks -> mine) -> (build marine -> build marine -> build marine -> build marine) -> (move army to -> attack on sight)

dt-rush commented 1 year ago

An interesting paper on GOAP, http://alumni.media.mit.edu/~jorkin/WS404OrkinJ.pdf,

suggests mixing symbolic and computational preconditions:

Mixing Symbolic and Non-Symbolic Preconditions An NPC who formulates plans to eliminate threats and stay out of danger needs to know who is alive, who is dead, and who is aiming at whom. It would be prohibitively expensive in terms of memory and processing for each NPC to keep track of the state of everyone else. By taking an agent-centric point-of-view, the NPC can dilute information down to a minimal set of symbols. Rather than maintaining symbols for everyone’s health and current target, the NPC can simply store a single symbol representing whether his current threat is alive, and another indicating whether the threat is currently aiming at the NPC. Outside of the planner, the NPC’s sensors run custom processes to select the current threat, and monitor the threat’s state. The agent-centric strategy solves some problems, but there are still some action preconditions that cannot be precomputed by sensors. An NPC who wants to look at a disturbance needs to know if the point in space of the disturbance’s origin is visible. An NPC who wants to dive into cover needs to know if there is enough room in front of him to play a dramatic animation. There are an infinite number of points in space that the NPC may be interested in, and a large number of animations that the NPC could potentially play. It would be impractical for a sensor to keep track of the visibility to every potentially interesting point in space, or the clearance available for the total translation of every animation. The purpose of representing game state symbolically is to allow the planner to make connections between goals and actions, and actions to other actions. If there are preconditions that the planner is not intended to solve, they do not need to be represented by symbols. For instance, if the planner finds that some point in space is not visible to the NPC, there exists no action that will make it visible. A strategy for handling tests that need to be performed in real-time, like visibility tests of physics collision tests, is to allow actions to contain custom preconditions with arbitrary implementations, known as Context Preconditions. These preconditions may run any piece of code to check an action’s validity in the context of the game world. Context Preconditions provide an alternative to symbolic representation of preconditions, and can be used to prune the search tree while planning. Similarly, actions may have Context Effects, which have arbitrary effects on the game world that do not concern the planner.

dt-rush commented 1 year ago

So, a contextprecondition/contexteffect implementation of atTree would be: Attempted sketch of contextual pre (it has a problem)

planner.AddContextualState(
    "atTree", 
    func get (e* Entity) bool {
        return e.TaggedEntityWithinRadius("tree", 5)
    },
)

planner.AddAction("chopTree",
    []goap.State{
        goap.StateVar{"hasGlove", true},
        goap.StateVar{"hasAxe", true},
        goap.StateVar{"atTree", true},
    },
    []goap.State{
        goap.StateVar{"hasWood", true}
    },
)

planner.AddAction("goToTree",
    []goap.State{},
    []goap.State{
        goap.StateVar{"atTree": true},
    },
)

During planning, if an action is being considered for its precondition, we would do (pseudocode)

satisfied := false
for _, pre := range(action.pres) {
    if pre.(type) == ContextualStateVar {
        satisfied = statisfied && pre.(ContextualStateVar).get(e)
    } else {
        satisfied = ws.contains(pre)
    }
}

But the problem is, we're looking at the entity's current world position according to the component table when really we want to be actually putting the entity's 2D position in the world state such that it can be changed as the effect of goToTree, rather than just storing atTree: true. This way, planning would actually model the entity moving around and be able to check if, for example, as a result of going to the tree, i'm no longer near near the glove that i need, so i should actually get the glove first, so that i could then satisfy the pre of chopWood properly.

dt-rush commented 1 year ago

Instead of looking at the entity's current world position according to the component table, we should use goap.WorldState.GetModal(entity, component) to look first if, in this branch of our search, the entity's position has been set (this would extend to any component data).

Note we also assume that the targetted tree was already set on the worldState as a *Entity, retrieved by world.TaggedEntityWithinRadius("tree", TREE_SEARCH_RADIUS, e.pos)

So we would have,

planner.AddContextualState(
    "atTree", 
    func get (e* Entity, ws goap.WorldState) bool {
        ePos := ws.GetModal(e, "Position").(Vec2D)
        return ePos.BoxDistance(ws.treeTarget.GetVec2D("Box")) < 5
    },
    func set (e *Entity, ws goap.WorldState) goap.WorldState {
        ePos := ws.GetModal(e, "Position").(Vec2D)
        nearTreePos := world.PositionNear(ws.treeTarget.GetVec2D("Position"), ePos)
        ws.setModal(e, "Position", nearTreePos)
        return ws
    }
)

By having the position be get from, and set on a copy-by-value WorldState object, in our planning graph-search, actions involving states that refer to position can keep track of where the entity would be according to the actions that have been performed.

So, if we want woodChopped, and our planning expands like this:

want: woodChopped

traversing edges: [chopTree]

[
path: chopTree
want: {hasAxe, hasGlove, atTree}
state: {woodChopped}
]

traversing edges: [goToTree, getAxe, getGlove]

[
path: goToTree -> chopTree
want: {hasAxe, hasGlove}
state: {atTree, woodChopped}
]

[
path: getAxe -> chopTree
want: {hasGlove, atTree, atAxe}
state: {hasAxe, woodChopped}
]

[
path: getGlove -> chopTree
want: {atGlove, hasAxe, atTree}
state: {hasGlove, woodChopped}
]

This first expansion,

[
path: goToTree -> chopTree
want: {hasAxe, hasGlove}
state: {atTree, woodChopped}
]

Would have been pushed to the queue because goToTree's effect is to satisfy atTree and our entity is not yet at the tree, since

func get (e* Entity, ws goap.WorldState) bool {
        ePos := ws.GetModal(e, "Position").(Vec2D)
        return ePos.BoxDistance(ws.treeTarget.GetVec2D("Box")) < 
    }

returned false.

Then, the next "goTo" action we meet in backtracking will, in its atX contextualstate, see the entity's position being at the tree, because the WorldState in that branch of search was modified by the set() function of atTree. So for example, if the search pops

[
path: goToTree -> chopTree
want: {hasAxe, hasGlove}
state: {atTree, woodChopped}
]

...and then traverses back through getAxe (satisfying hasAxe), we will add atAxe to our wants. But notice, atAxe is not fulfilled because in our WorldState modal position, we are at the tree. Thus, we need to traverse goToAxe. And so, in adding this action in our backtrack, we would call set() for atAxe, modifying in our backtrack such that the modal position is now near the axe.

We would then have satisfied hasAxe and atTree in our path, but we would still need to satisfy hasGlove. A similar process for the glove as for the axe would prepend goToAxe, getAxe. In the end, we would reach the satisfied empty state pre of the action goToGlove with want: {}. And thus we would print the path we took:

goToGlove, getGlove, goToAxe, getAxe, goToTree, chopTree.

But what about the other paths which branched out from originally? for example,

[
path: getGlove -> chopTree
want: {atGlove, hasAxe, atTree}
state: {hasGlove, woodChopped}
]

This still wants atTree. It might expand through:

goToGlove -> getGlove -> chopTree

But now, because of goToGlove setting the modal position at the glove (which for argument's sake is away from the atTree get() < 5 radius), chopTree's atTree pre is unfulfillable and the path should be abandoned. How do we detect this?

We might try to fulfill atTree by inserting goToTree in the path,

goToTree -> goToGlove -> getGlove -> chopTree

... but by playing the action eff's forward by using their set() functions and checking the pre of each action, we would end up with a world state - when we arrive at chopTree's pre - in which atTree is actually false, even though it was true earlier in the chain. So this path, in appending an action that is supposed to produce atTree could not fulfill atTree. It should be abandoned.

dt-rush commented 1 year ago

Ah, a problem. As presented in the last comment, contextual state get() for atTree will prevent the path goToTree -> chopWood from ever being visited if, at the start of planning, we are already at the tree... atTree will be satisfied so we will only expand to satisfy the hasAxe and hasGlove wants... Which will inevitably place us away from the tree, invalidating the paths.

It seems we cannot assume, since we're backtracking that the current entity position (which is where we will start from) is valid to consider for the purposes of what is ultimately the end of the action chain.

On the other hand, if a state is unchanged by the chain, it should be valid to consider it fulfilled from the start of backtracking... Hmmm.....

dt-rush commented 1 year ago

Maybe we should expand all paths backward assuming nothing about the world state, then prune them by forward validation?

At most present world state can validate the last action to be added (last action to be added = its effs fulfill the remainder of the wants)

(Eg if we had axe and had glove and are at tree, chopWood fulfills all wants (woodChopped) and current state can validate it's pre set).

dt-rush commented 1 year ago

First working implementation of the basic algorithm:

https://github.com/dt-rush/sameriver/blob/feature/goap-goap-goap-goap-goap-goap-goap-soap/v2/goap_action.go https://github.com/dt-rush/sameriver/blob/feature/goap-goap-goap-goap-goap-goap-goap-soap/v2/goap_action_set.go https://github.com/dt-rush/sameriver/blob/feature/goap-goap-goap-goap-goap-goap-goap-soap/v2/goap_planner.go https://github.com/dt-rush/sameriver/blob/feature/goap-goap-goap-goap-goap-goap-goap-soap/v2/goap_pqueue.go https://github.com/dt-rush/sameriver/blob/feature/goap-goap-goap-goap-goap-goap-goap-soap/v2/goap_state.go https://github.com/dt-rush/sameriver/blob/feature/goap-goap-goap-goap-goap-goap-goap-soap/v2/goap_world.go https://github.com/dt-rush/sameriver/blob/feature/goap-goap-goap-goap-goap-goap-goap-soap/v2/goap_test.go

dt-rush commented 1 year ago

NOTE: we didn't compute distance costs for movement actions or sort the resulting valid path list by cost

dt-rush commented 1 year ago

Directions for future work:

[x] get planner working (hasn't been tested but i just have a feeling it won't work) for {getBooze, drinkBooze, dropBooze, enterTemple}, where enterTemple.pre={hasBooze: false} and drinkBooze.pre={hasBooze: true}, using a ctx state var that modifies inventory
[ ] produce initial GOAPWorldState for entity's current situation? (periodic scan of world state? check only those modals which become relevant during planning?)
[ ] entity GOAP setup pattern
[ ] entity.IdempotentGetPlan() as a query that puts the request into a queue, to be processed as a runtime limiter by the runtime sharer, returns false if the plan is still waiting to finish or is not available, returns true once available (CAS(1, 0) thereafter)
[ ] plan execution, failure, success, and replanning
[ ] filter valid actionset by disposition (fight/flight activated? roles system (woodsman, baker, etc.)?)
[ ] goal selection algorithm (utility?)

dt-rush commented 1 year ago

[ ] get rid of spuriously "correct" paths that fail modal validation like [goToTree,goToAxe,getAxe,goToGlove,getGlove,chopTree(3)] by specifying temporal ordering if needed in the pres:

chopTree := NewGOAPAction(map[string]any{
        "name": "chopTree",
        "cost": 1,
        "pres": []any{
            map[string]int{
            "hasGlove,>": 0,
            "hasAxe,>":   0,
                    },
                    map[string]any{
            "atTree,=":   1,
            },
        },
        "effs": map[string]int{
            "treeFelled,=": 1,
        },
    })

Notice that pres is now a []map, a list of goals.

meaning, when inserting to satisfy these, insert in an order prior to here.

Main goals can also specify temporal ordering.

The way this works in the implementation side, in the existing algorithm modified, is:

keep track of an interval in the path which corresponds to the area where the satisfying actions will be inserted.

Let's say we are backtracking and reach [A B].

The pres of A are a list of 3 goals: [a1 a2 a3]

Then the actions that satisfy these goals should be inserted before A in ordered areas:

[ ...actionsFor(a1)... , ...actionsFor(a2)... , ...actionsFor(a3)... A B]

This naturally is a recursive structure since any of the actions for a's pres might have temporally-ordered pres.

I have a feeling this means that finding insertionIx for an action to be inserted in the chain is no longer simply i, but involves traversing up the tree of ordered-region start and end indexes.

dt-rush / sameriver

GOAP #31