Simulated action effects break grounded action comparison

fteicht commented 8 months ago

Hello,

This bug report originally comes from issue #567 but digging further into the issue reveals that the potential bug is different from what I originally thought. Thus this new and more accurate bug report.

So I suspect there is a bug in UP when computing the hash value and equality test of grounded actions with simulated effects. At the moment it is preventing the up-skdecide engine from working properly, especially with the scikit-decide's RL engine which allows UP problems to be solved by using Reinforcement Learning. Indeed, this engine relies on grounded actions.

Below is a minimal code example that you can run and that shows that UP simulated action effects prevent grounded actions from being properly compared:

from unified_planning.shortcuts import *
from unified_planning.environment import get_environment
from unified_planning.engines.compilers.grounder import GrounderHelper
from unified_planning.engines.sequential_simulator import UPSequentialSimulator

Location = UserType("Location")
Robot = UserType("Robot")

at = Fluent("at", Location, robot=Robot)
battery_charge = Fluent("battery_charge", IntType(0, 100), robot=Robot)

move = InstantaneousAction("move", robot=Robot, l_from=Location, l_to=Location)
robot = move.parameter("robot")
l_from = move.parameter("l_from")
l_to = move.parameter("l_to")
move.add_precondition(Equals(at(robot), l_from))
move.add_precondition(GE(battery_charge(robot), 10))
move.add_precondition(Not(Equals(l_from, l_to)))
move.add_effect(at(robot), l_to)

def fun(problem, state, actual_params):
    value = state.get_value(battery_charge(actual_params.get(robot))).constant_value()
    return [Int(value - 10)]

# COMMENT IN THE FOLLOWING LINE AND THE LAST ASSERT WILL WORK
move.set_simulated_effect(SimulatedEffect([battery_charge(robot)], fun))

l1 = Object("l1", Location)
l2 = Object("l2", Location)
r1 = Object("r1", Robot)

problem = Problem("robot_with_simulated_effects")
problem.add_fluent(at)
problem.add_fluent(battery_charge)
problem.add_action(move)
problem.add_object(l1)
problem.add_object(l2)
problem.add_object(r1)

problem.set_initial_value(at(r1), l1)
problem.set_initial_value(battery_charge(r1), 100)

simulator = UPSequentialSimulator(problem)

all_actions = set(
    [(a[0], a[1]) for a in GrounderHelper(problem).get_grounded_actions()]
)

app_actions = set(
    [a for a in simulator.get_applicable_actions(simulator.get_initial_state())]
)

# FOLLOWING ASSERT ALWAYS WORKS WITH OR WITHOUT SIMULATED EFFECT
assert app_actions.issubset(all_actions)

all_grounded_actions = set(
    [a[2] for a in GrounderHelper(problem).get_grounded_actions()]
)

app_grounded_actions = set(
    [
        simulator._ground_action(a[0], a[1])
        for a in simulator.get_applicable_actions(simulator.get_initial_state())
    ]
)

# FOLLOWING ASSERT ONLY WORKS WITHOUT SIMULATED EFFECT
assert app_grounded_actions.issubset(all_grounded_actions)

If you comment in the line move.set_simulated_effect(SimulatedEffect([battery_charge(robot)], fun)) you will see that the last assert runs fine. It shows that the issue is related to using UP simulated action effects.

Do you confirm this is a bug, and if so, would it be please possible to correct it as soon as you can so that I can push working up-skdecide RL-based engine in theAIPlan4EU project?

Thank you in advance for looking into this!

alvalentini commented 8 months ago

Hi @fteicht! The problem is that the python function associated to the grounded simulated effect generated by the GrounderHelper (used to compute the all_grounded_actions) is not the same as the one generated internally by the simulator. It is not clear to me how to implement an equality that is able to understand that two different python functions are semantically equivalent.

A way that I see to fix your code is to use the same GrounderHelper, in this way:

from unified_planning.shortcuts import *
from unified_planning.environment import get_environment
from unified_planning.engines.compilers.grounder import GrounderHelper
from unified_planning.engines.sequential_simulator import UPSequentialSimulator

Location = UserType("Location")
Robot = UserType("Robot")

at = Fluent("at", Location, robot=Robot)
battery_charge = Fluent("battery_charge", IntType(0, 100), robot=Robot)

move = InstantaneousAction("move", robot=Robot, l_from=Location, l_to=Location)
robot = move.parameter("robot")
l_from = move.parameter("l_from")
l_to = move.parameter("l_to")
move.add_precondition(Equals(at(robot), l_from))
move.add_precondition(GE(battery_charge(robot), 10))
move.add_precondition(Not(Equals(l_from, l_to)))
move.add_effect(at(robot), l_to)

def fun(problem, state, actual_params):
    value = state.get_value(battery_charge(actual_params.get(robot))).constant_value()
    return [Int(value - 10)]

# COMMENT IN THE FOLLOWING LINE AND THE LAST ASSERT WILL WORK
move.set_simulated_effect(SimulatedEffect([battery_charge(robot)], fun))

l1 = Object("l1", Location)
l2 = Object("l2", Location)
r1 = Object("r1", Robot)

problem = Problem("robot_with_simulated_effects")
problem.add_fluent(at)
problem.add_fluent(battery_charge)
problem.add_action(move)
problem.add_object(l1)
problem.add_object(l2)
problem.add_object(r1)

problem.set_initial_value(at(r1), l1)
problem.set_initial_value(battery_charge(r1), 100)

simulator = UPSequentialSimulator(problem)
grounder = GrounderHelper(problem)

all_actions = set(
    [(a[0], a[1]) for a in grounder.get_grounded_actions()]
)

app_actions = set(
    [a for a in simulator.get_applicable_actions(simulator.get_initial_state())]
)

# FOLLOWING ASSERT ALWAYS WORKS WITH OR WITHOUT SIMULATED EFFECT
assert app_actions.issubset(all_actions)

all_grounded_actions = set(
    [a[2] for a in grounder.get_grounded_actions()]
)

app_grounded_actions = set(
    [
        grounder.ground_action(a[0], a[1])
        for a in simulator.get_applicable_actions(simulator.get_initial_state())
    ]
)

# FOLLOWING ASSERT ONLY WORKS WITHOUT SIMULATED EFFECT
assert app_grounded_actions.issubset(all_grounded_actions)

fteicht commented 7 months ago

Thanks a lot @alvalentini for your support! I made your suggested changes and now the scikit-decide's unified planning wrapper works perfectly with simulated action effects.

I have updated the up-skdecide repository as well to reflect this change (please note that it requires the latest commit of scikit-decide to work, but not the latest PyPi package until we generate a new tagged version of scikit-decide).

Concerning this issue, I believe it only concerns advanced usage of the library. I am happy to close the issue, but maybe you might want to write a warning somewhere in the document to avoid advanced users to fall in the same trap?

Thanks very much again!

aiplan4eu / unified-planning

Simulated action effects break grounded action comparison #585