aiplan4eu / unified-planning

The AIPlan4EU Unified Planning Library
Apache License 2.0
181 stars 39 forks source link

Confusing or buggy hash implementation of instantaneous actions #567

Closed fteicht closed 5 months ago

fteicht commented 7 months ago

Describe the bug I don't know if it only concerns instantaneous actions but I noticed that the hashes of semantically equal actions are different (and similarly for the equality test). I would expect 2 semantically equal actions to have the same hash, but it seems that the hash is rather based on the object instance ID, which is weird (for instance, 2 equal ints or floats, even if created at different times, have the same hash and are equal...). Is it really the intended the behavior?

To Reproduce

move = InstantaneousAction("move", robot=Robot, l_from=Location, l_to=Location)
robot = move.parameter("robot")
l_from = move.parameter("l_from")
l_to = move.parameter("l_to")
move.add_precondition(Equals(at(robot), l_from))
move.add_precondition(Not(Equals(l_from, l_to)))
move.add_effect(at(robot), l_to)

move_bis = deepcopy(move)
assert hash(move) == hash(move_bis)  # <= assert exception!
assert move == move_bis  # <= assert exception!

Expected behavior We should have hash(move) == hash(move_bis) and move == move_bis

Framba-Luca commented 7 months ago

Hi @fteicht, we implemented an InstantaneousAction.clone() method (and also for the other classes). I don't know if this fits your use-case.

BTW, the problem with the deepcopy method is that the base-expressions are created trough a factory (the environment.expression_manager) and the deepcopy method does not account for that.

This is why the 2 actions are different, because the basic expressions are semantically equivalent but belong to a different environment (which is a copy of the original one, but it's not the same; indeed if you use simple copy the test above passes).

I don't know if we want the deepcopy and copy methods to work on the unified_planning.model data structure, but in case we do, the clone method should basically be the deepcopy.

fteicht commented 7 months ago

Thanks @Framba-Luca for your explanation, it makes sense.

IMHO the issue is that the environment is then part of the action semantics when you say that 2 equal actions are different just because their environments are. It means that 2 processes which host different environments, and which exchange the same action via pipes for instances, will consider that 2 actions are different. It is an issue when storing, for instance, a policy in one process and asking in the other process what is the action recommending by the policy in a given state (it is excatly the issue I am encountering in the scikit-decide backend of UP btw).

Another view to this semantic issue is that the problem arises if we create the same semantic action twice, and then try to find it in a dictionary or set of actions. It is typically the case when you want to search in a dictionary : you create the key on the heap (an UP action in this case), then try to find it in the dictionary. It is actually how the bug appeared on my side, and I understand that we cannot store UP actions in sets or dictionaries, and then to search for them later on. Am I correct?

fteicht commented 5 months ago

Hi @Framba-Luca !

I have digged further into the issue and I now believe there is a bug in UP when computing the hash value and equality test of grounded actions with simulated effects. At the moment it is preventing the up-skdecide engine from working properly, especially with the scikit-decide's RL engine which allows UP problems to be solved by using Reinforcement Learning. Indeed, this engine relies on grounded actions.

Below is a minimal code example that you can run and that shows that UP simulated action effects prevent grounded actions from being properly compared:

from unified_planning.shortcuts import *
from unified_planning.environment import get_environment
from unified_planning.engines.compilers.grounder import GrounderHelper
from unified_planning.engines.sequential_simulator import UPSequentialSimulator

Location = UserType("Location")
Robot = UserType("Robot")

at = Fluent("at", Location, robot=Robot)
battery_charge = Fluent("battery_charge", IntType(0, 100), robot=Robot)

move = InstantaneousAction("move", robot=Robot, l_from=Location, l_to=Location)
robot = move.parameter("robot")
l_from = move.parameter("l_from")
l_to = move.parameter("l_to")
move.add_precondition(Equals(at(robot), l_from))
move.add_precondition(GE(battery_charge(robot), 10))
move.add_precondition(Not(Equals(l_from, l_to)))
move.add_effect(at(robot), l_to)

def fun(problem, state, actual_params):
    value = state.get_value(battery_charge(actual_params.get(robot))).constant_value()
    return [Int(value - 10)]

# COMMENT IN THE FOLLOWING LINE AND THE LAST ASSERT WILL WORK
move.set_simulated_effect(SimulatedEffect([battery_charge(robot)], fun))

l1 = Object("l1", Location)
l2 = Object("l2", Location)
r1 = Object("r1", Robot)

problem = Problem("robot_with_simulated_effects")
problem.add_fluent(at)
problem.add_fluent(battery_charge)
problem.add_action(move)
problem.add_object(l1)
problem.add_object(l2)
problem.add_object(r1)

problem.set_initial_value(at(r1), l1)
problem.set_initial_value(battery_charge(r1), 100)

simulator = UPSequentialSimulator(problem)

all_actions = set(
    [(a[0], a[1]) for a in GrounderHelper(problem).get_grounded_actions()]
)

app_actions = set(
    [a for a in simulator.get_applicable_actions(simulator.get_initial_state())]
)

# FOLLOWING ASSERT ALWAYS WORKS WITH OR WITHOUT SIMULATED EFFECT
assert app_actions.issubset(all_actions)

all_grounded_actions = set(
    [a[2] for a in GrounderHelper(problem).get_grounded_actions()]
)

app_grounded_actions = set(
    [
        simulator._ground_action(a[0], a[1])
        for a in simulator.get_applicable_actions(simulator.get_initial_state())
    ]
)

# FOLLOWING ASSERT ONLY WORKS WITHOUT SIMULATED EFFECT
assert app_grounded_actions.issubset(all_grounded_actions)

If you comment in the line move.set_simulated_effect(SimulatedEffect([battery_charge(robot)], fun)) you will see that the last assert runs fine. It shows that the issue is related to using UP simulated action effects.

Do you confirm this bug, and if so, would it be please possible to correct it as soon as you can so that I can push working up-skdecide RL-based engine in theAIPlan4EU project?

Thank you in advance for looking into this!

fteicht commented 5 months ago

I have created another issue #585 with a more accurate title since now I believe the bug is different from what I originally thought it would be. Feel free to close this issue and answer in the new one if you prefer. Thank you.

fteicht commented 5 months ago

I am closing this issue since it is now tracked in issue #585