Refining/constraining predicates based on combination of fields

AbdallahS commented 5 months ago

Is there a mechanism to impose constraints on the predicate sub-class based on combinations of fields?

In the following example, I would like any predicate instance that unifies with the Data class to satisfy two conditions: if the info type is "height" then the value is of type int, and if the info type is "firstname" then the value is of type str. In particular, the user3 facts should not unify with Data. A suitable solution would let me drop the additional filtering condition in the print_data function, because the querying mechanism would already have filtered out the bad data.

infotype(height;firstname).
person(user1;user2;user3;user4).

data(user1,height,170).
data(user1,firstname,"Anna").
data(user2,height,180).
data(user2,firstname,"Robert").
data(user3,height,"Charlie").
data(user3,firstname,175).
data(user4,firstname,"Richard").

missing(USER,INFO) :- person(USER), infotype(INFO), not data(USER,INFO,_).

#show missing/2.

#script(python)

import clorm
import typing
from clorm.clingo import Control

InfotypeField = clorm.refine_field(clorm.ConstantField,["height","firstname"])

class Data(clorm.Predicate):
   user: clorm.ConstantStr
   type: clorm.ConstantStr = clorm.field(InfotypeField)
   value: typing.Union[int,str]

def print_data(model):
    query=model.facts(atoms=True).query(Data)
    for d in query.all():
        if (d.type == "height" and type(d.value) == int)\
        or (d.type == "firstname" and type(d.value) == str):
            print(d)

def main(ctrl_):
    ctrl = Control(control_=ctrl_, unifier=[Data])
    ctrl.ground([("base",[])])
    ctrl.solve(on_model=print_data)

#end.

I know that I could turn data into an arity-2 predicate and turn height and firstname into arity-1 function symbols, for example data(user1,height(170)). data(user1,firstname("Anna")). This would let me define a suitably constrained Data class directly, but it is undesirable for other reasons (e.g., it messes up the missing/2 rule).
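Until such a mechanism exists, the filtering condition in print_data could at least be factored into a lookup table, so that supporting a new info type means adding one entry. A minimal sketch in plain Python; the names EXPECTED_TYPES and is_consistent are hypothetical and not part of Clorm:

```python
# Hypothetical helper, not part of Clorm: maps each info type to the
# Python type that its value is expected to have.
EXPECTED_TYPES = {"height": int, "firstname": str}


def is_consistent(info_type, value):
    """Return True if value has exactly the type expected for info_type."""
    expected = EXPECTED_TYPES.get(info_type)
    # Exact type comparison, as in print_data; unknown info types are rejected.
    return expected is not None and type(value) is expected
```

Inside print_data the test would then read `if is_consistent(d.type, d.value): print(d)`.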

daveraja commented 5 months ago

Clorm currently doesn't have a way to specify conditions between fields. It could be a useful feature to add and I could imagine having a specially named class member function that checks some user-specified condition.

Unfortunately, for the moment my only thought would be to define two distinct classes:

class DataHeight(clorm.Predicate, name="data"):
   user: clorm.ConstantStr
   type: clorm.ConstantStr = clorm.refine_field(clorm.ConstantField, ["height"])
   value: int

class DataFirstName(clorm.Predicate, name="data"):
   user: clorm.ConstantStr
   type: clorm.ConstantStr = clorm.refine_field(clorm.ConstantField, ["firstname"])
   value: str

On the positive side, these classes would unify only with the facts that you require, and the user3 facts would fail to unify. The downside is that you would have to treat the two classes separately when querying.
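Querying the two classes separately can be hidden behind a small helper that runs one query per class and chains the results. This is only a sketch: query_all is a hypothetical name, and it assumes nothing about the fact base beyond the query(cls).all() interface already used in this thread.

```python
from itertools import chain


def query_all(factbase, *predicates):
    """Chain the results of one query per predicate class.

    Assumes `factbase` offers the `query(cls).all()` interface used in
    this thread; `query_all` itself is a hypothetical helper.
    """
    return chain.from_iterable(factbase.query(p).all() for p in predicates)
```

With the classes above, print_data could then loop over `query_all(model.facts(atoms=True), DataHeight, DataFirstName)`.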

AbdallahS commented 5 months ago

Thank you for the suggestion. I can reduce the pain of having two classes when querying a little by taking their union, at the cost of prefixing my predicates with meta(...).

meta(infotype(height;firstname)).
meta(person(user1;user2;user3;user4)).

meta(data(user1,height,170)).
meta(data(user1,firstname,"Anna")).
meta(data(user2,height,180)).
meta(data(user2,firstname,"Robert")).
meta(data(user3,height,"Charlie")).
meta(data(user3,firstname,175)).
meta(data(user4,firstname,"Richard")).

meta(missing(USER,INFO)) :- meta(person(USER)), meta(infotype(INFO)), not meta(data(USER,INFO,_)).

#show.
#show missing(U,I) : meta(missing(U,I)).

#script(python)

import clorm
import typing
from clorm.clingo import Control

InfotypeField = clorm.refine_field(clorm.ConstantField,["height","firstname"])

class DataHeight(clorm.Predicate, name="data"):
   user: clorm.ConstantStr
   type: clorm.ConstantStr = clorm.refine_field(clorm.ConstantField, ["height"])
   value: int

class DataFirstName(clorm.Predicate, name="data"):
   user: clorm.ConstantStr
   type: clorm.ConstantStr = clorm.refine_field(clorm.ConstantField, ["firstname"])
   value: str

class Data(clorm.Predicate, name="meta"):
   data: typing.Union[DataHeight,DataFirstName]

def print_data(model):
    query=model.facts(atoms=True).query(Data)
    for m in query.all():
        d = m.data
        if (d.type == "height" and type(d.value) == int)\
        or (d.type == "firstname" and type(d.value) == str):
            print(d)

def main(ctrl_):
    ctrl = Control(control_=ctrl_, unifier=[Data])
    ctrl.ground([("base",[])])
    ctrl.solve(on_model=print_data)

#end.

I'm not sure what the best approach would be to make this sort of thing more direct, but it feels like the union not being first class is one element that gets in the way. For example, I can replace data: DataHeight with data: typing.Union[DataHeight,DataFirstName], but I cannot replace query=model.facts(atoms=True).query(DataHeight) with query=model.facts(atoms=True).query(typing.Union[DataHeight,DataFirstName]). The latter fails with the following error (on Python 3.10):

TypeError: Invalid argument typing.Union[__main__.DataHeight, __main__.DataFirstName] (type: <class 'typing._UnionGenericAlias'>): expecting either a PredicatePath, a Predicate sub-class, or a PredicatePath.Hashable
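One possible workaround on the user side, without touching Clorm, is to expand the union by hand and issue one query per member class. The sketch below uses only typing.get_origin and typing.get_args from the standard library (Python 3.8+); union_members is a hypothetical helper, and it only covers unions written as typing.Union[...].

```python
import typing


def union_members(tp):
    """Return the member types of a typing.Union, or (tp,) for a plain type."""
    if typing.get_origin(tp) is typing.Union:
        return typing.get_args(tp)
    return (tp,)
```

Each class returned by `union_members(typing.Union[DataHeight, DataFirstName])` can then be passed to query() on its own.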

daveraja commented 5 months ago

That's a neat trick with using a meta predicate. But having to modify the encoding to get around Clorm limitations is not ideal.

Yes, union is not a first-class element in the query. There are a bunch of features that would be nice to add to the query mechanism, but they would need substantial rewriting to get around limitations of the current implementation. Adding a way to constrain the unification based on the combination of values would be a simpler and more self-contained task.

AbdallahS commented 5 months ago

Agreed. First-class union is overkill for the use case in this issue.

Drawing inspiration from the refine_field function, I'm wondering whether passing a filter when subclassing Predicate would be a good fit. The filter would be an optional callable (if none is provided, the behaviour is identical to passing lambda x: True) that would be invoked automatically as the last step of a unification attempt, once all the attributes have successfully pattern-matched. The user could then write code as follows for this issue's example.

class Data(clorm.Predicate, lambda d: (d.type == "height" and type(d.value) == int) or (d.type == "firstname" and type(d.value) == str)):
   user: clorm.ConstantStr
   type: clorm.ConstantStr = clorm.refine_field(clorm.ConstantField, ["height", "firstname"])
   value: typing.Union[int,str]

Was that the sort of thing you had in mind?
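To make the idea concrete, here is a toy sketch in plain Python of how a class-level filter could be stored and applied as the last step of unification. It is not Clorm's implementation: FilteredRecord, unify, and the filter keyword are all hypothetical. The filter is passed as a keyword argument because a bare lambda in the bases list would be treated as a base class.

```python
import typing


class FilteredRecord:
    """Toy stand-in for a predicate base class; NOT Clorm's implementation."""

    # Default filter accepts everything, like lambda x: True.
    _filter = staticmethod(lambda record: True)

    def __init_subclass__(cls, filter=None, **kwargs):
        super().__init_subclass__(**kwargs)
        if filter is not None:
            cls._filter = staticmethod(filter)

    def __init__(self, **fields):
        for name, value in fields.items():
            setattr(self, name, value)

    @classmethod
    def unify(cls, **fields):
        """Build an instance, then return it only if the filter accepts it."""
        record = cls(**fields)
        return record if cls._filter(record) else None


class Data(
    FilteredRecord,
    filter=lambda d: (d.type == "height" and type(d.value) is int)
    or (d.type == "firstname" and type(d.value) is str),
):
    user: str
    type: str
    value: typing.Union[int, str]
```

In this sketch, `Data.unify(user="user3", type="height", value="Charlie")` returns None, while the user1 and user2 facts produce instances.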

daveraja commented 4 months ago

Yes, something like this. Adding it to the class signature is one option. Something like the way dataclass has a __post_init__ hook function is another option:

from dataclasses import dataclass

@dataclass
class X:
    a: int
    b: int

    def __post_init__(self):
        # do something, e.g. check a condition over a and b
        pass
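Applied to this issue's example, a __post_init__-style check could look like the sketch below. It uses only the standard dataclasses module, not Clorm; whether a Clorm hook would raise an exception, return a flag, or use a different name is an open design question, and in a unification setting a failed check would presumably mean "does not unify" rather than an error.

```python
from dataclasses import dataclass
from typing import Union


@dataclass
class Data:
    user: str
    type: str
    value: Union[int, str]

    def __post_init__(self):
        # Reject combinations of fields that do not belong together.
        expected = {"height": int, "firstname": str}.get(self.type)
        if expected is None or not isinstance(self.value, expected):
            raise ValueError(f"bad combination: {self.type!r}, {self.value!r}")
```

Here `Data("user1", "height", 170)` constructs normally, whereas `Data("user3", "height", "Charlie")` raises ValueError.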

potassco / clorm, issue #139