karmaresearch / vlog

Apache License 2.0
55 stars 9 forks source link

Applicability of Rules Depend on their Order in the Rule File #64

Open smennicke opened 3 years ago

smennicke commented 3 years ago

I have the instance t.csv:

a,b

and the rules in file rules:

fact(X,Y,c) :- t(X,Y)

body(N1,N2) :- rule4(X,N1), derived1(X,N2)
head(N1,N2) :- rule4(X,N1), derived1(X,N2), e(N1,N), e(N, N2)

rule1(X,N) :- fact(X,Y,N)
derived1(X,N1) :- rule1(X,N)
e(N1,N), e(N,N2) :- rule1(X,N1), derived1(X,N2)

rule2(X,Y,N) :- fact(X,Y,N)
derived2(Y,X,N1) :- rule2(X,Y,N)
e(N1,N), e(N,N2) :- rule2(X,Y,N1), derived2(Y,X,N2)

rule3(X,Y,N) :- derived2(X,Y,N)
fact(Y,X,N1) :- rule3(X,Y,N)
e(N1,N), e(N,N2) :- rule3(X,Y,N1), fact(Y,X,N2)

rule4(Y,N) :- derived2(X,Y,N)
derived1(X,N1) :- rule4(X,N)
e(N1,N), e(N,N2) :- rule4(X,N1), derived1(X,N2)

The three-rule blocks (starting with rule1-4) create some facts (derived1 and derived2) and create a graph structure (predicate e). Running vlog in version 6f9d199 shows wrong behavior on the very last. Although vlog recognizes a match for the body of the rule, it does not derive any new tuples. To show this behavior, we have included the following auxiliary rules:

body(N1,N2) :- rule4(X,N1), derived1(X,N2)
head(N1,N2) :- rule4(X,N1), derived1(X,N2), e(N1,N), e(N, N2)

While vlog has a match for body and materializes it, it does not derive any tuples in head. The expected behavior can be observed when moving the last rule in file rules to an earlier position in the file, e.g., rules2:

fact(X,Y,c) :- t(X,Y)
body(N1,N2) :- rule4(X,N1), derived1(X,N2)
head(N1,N2) :- rule4(X,N1), derived1(X,N2), e(N1,N), e(N, N2)

rule1(X,N) :- fact(X,Y,N)
derived1(X,N1) :- rule1(X,N)
e(N1,N), e(N,N2) :- rule1(X,N1), derived1(X,N2)

rule2(X,Y,N) :- fact(X,Y,N)
derived2(Y,X,N1) :- rule2(X,Y,N)
e(N1,N), e(N,N2) :- rule2(X,Y,N1), derived2(Y,X,N2)

e(N1,N), e(N,N2) :- rule4(X,N1), derived1(X,N2)

rule3(X,Y,N) :- derived2(X,Y,N)
fact(Y,X,N1) :- rule3(X,Y,N)
e(N1,N), e(N,N2) :- rule3(X,Y,N1), fact(Y,X,N2)

rule4(Y,N) :- derived2(X,Y,N)
derived1(X,N1) :- rule4(X,N)

With the first rule file, there are 14 derivations and with the second one, there are 17. The auxiliary rules indicate the second one to be actually sound and complete. Unfortunately, I was not able to reduce the example any further. If one of the three-rule blocks unrelated to this last rule is missing, the error is not observable.

I used the following call string: vlog mat --storemat_path . --storemat_format csv --decompressmat 1 --rules [RULEFILENAME], where [RULEFILENAME] ranges over by rules and rules2. I've been using edb.conf:

EDB0_predname=t
EDB0_type=INMEMORY
EDB0_param0=.
EDB0_param1=t

as configuration file.

CerielJacobs commented 3 years ago

I think the problem is in the filtering of the restricted chase, which does not seem to work correctly with multihead rules that share an existential variable. For now, you can work around it by using the "--rewriteMultihead true" option for vlog.

CerielJacobs commented 1 year ago

I have "fixed" this issue by always rewriting existential multi-head rules.

larry-gonzalez commented 1 year ago

Hi. If by "always rewriting existential multi-head rules" you mean always executing this function, then please be aware that it might affect termination of some existential rule programs when executing the restricted chase

CerielJacobs commented 1 year ago

Hi Larry, can you elaborate on that? Because yes, that is what I mean. For instance, the rule

e(N1,N), e(N,N2) :- rule1(X,N1), derived1(X,N2)

would be rewritten to:

Generated_Head_0(N1, N2, N) :- rule1(X,N1), derived(X, N2) e(N1, N) :- Generated_Head_0(N1, N2, N) e(N, N2) :- __Generated_Head_0(N1, N2, N)

Do you mean that such a transformation is not correct?

On 22 Nov 2022, at 16:03, Larry González @.***> wrote:

Hi. If by "always rewriting existential multi-head rules" you mean always executing this function, then please be aware that it might affect termination of some existential rule programs when executing the restricted chase

— Reply to this email directly, view it on GitHub, or unsubscribe. You are receiving this because you modified the open/close state.

larry-gonzalez commented 1 year ago

The transformation is correct, but it affects termination of the restricted chase

Let P be a existential rule program, and P' its transformation with single-headed rules. There are some cases in which P terminates for all strategies, but P' does not.

As an example let's consider the program

A(c) .
R(y,x) :- R(x,y) .
R(x,y),A(y) :- A(x) .

The restricted chase sequence would be then

A(c)
A(c)  R(c,n1)
A(n1)
A(c)  R(c,n1)
A(n1) R(n1,c)

which terminates. On the other hand, the transformed existential rule program would be like:

A(c) .
R(y,x) :- R(x,y) .
fresh(x,y) :- A(x) .
R(x,y) :- fresh(x,y) . 
A(y) :- fresh(x,y) . 

And its restricted chase sequence would be something like:

A(c)
A(c)  fresh(c,n1)
A(c)  fresh(c,n1)  R(c,n1)
A(c)  fresh(c,n1)  R(c,n1)
                   R(n1,c)
A(c)  fresh(c,n1)  R(c,n1)
A(n1)              R(n1,c)
...

which is non-terminating.

Essentially the problem is that the freshly introduced Datalog rules propagate every null because they are not restricted in anyway (because they are Datalog). This has been mentioned by Krötzsch here and by Carral et al. here (Theorem 28). I am sure there are other references but I don't have them at the top of my head

PD. @irina-dragoste provided the example, :)

irina-dragoste commented 1 year ago

Basically, rules with multiple head atoms that would be satisfied in the restricted chase are no longer satisfied, as the fresh predicate in the rewriting is not produced elsewhere. Here is another example ruleset with a more complex multi-head that loses termination if re-writen.

e(x,z),e(y,z) : - a(x), a(y) .
a(y) :- e(x,y) .
e(y,x) :- e(x,y) .
CerielJacobs commented 1 year ago

Would also adding a Datalog rule

fresh(x,y) :- R(x,y) A(y)

resolve this?

On 22 Nov 2022, at 17:44, Larry González @.***> wrote:

The transformation is correct, but it affects termination of the restricted chase

Let P be a existential rule program, and P' its transformation with single-headed rules. There are some cases in which P terminates for all strategies, but P' does not.

As an example let's consider the program

A(c) . R(y,x) :- R(x,y) . R(x,y),A(y) :- A(x) .

The restricted chase sequence would be then

A(c)

A(c) R(c,n1) A(n1)

A(c) R(c,n1) A(n1) R(n1,c)

which terminates. On the other hand, the transformed existential rule program would be like:

A(c) . R(y,x) :- R(x,y) . fresh(x,y) :- A(x) . R(x,y) :- fresh(x,y) . A(y) :- fresh(x,y) .

And its restricted chase sequence would be something like:

A(c)

A(c) fresh(c,n1)

A(c) fresh(c,n1) R(c,n1)

A(c) fresh(c,n1) R(c,n1) R(n1,c)

A(c) fresh(c,n1) R(c,n1) A(n1) R(n1,c) ...

which is non-terminating.

Essentially the problem is that the freshly introduced Datalog rules propagate every null because they are not restricted in anyway (because they are Datalog). This has been mentioned by Markus here and by David et al. here. I am sure there are other references by I don't have them at the top of my head

PD. @irina-dragoste provided the example, :)

— Reply to this email directly, view it on GitHub, or unsubscribe. You are receiving this because you modified the open/close state.

larry-gonzalez commented 1 year ago

I don't think so (maybe I interpreted your suggestion wrongly)

The problem is that rule2 -in the following code chunk- is not blockable

Let's consider the program

rule0: A(c) .
rule1: R(y,x) :- R(x,y) .
rule2: fresh(x,y) :- A(x) .
rule3: R(x,y) :- fresh(x,y) .
rule4: A(y) :- fresh(x,y) .
rule5: fresh(x,y) :- R(x,y), A(y) .

here rule5 has a trigger only after fireing rule3 and rule4, which necessarily creates a new atom to fire rule2, repeting the cicle. Furthermore, the trigger for rule5 is not active

But this is considering only this particular example. It would still remain to discuss how this (new) transformation would be in the general case

CerielJacobs commented 1 year ago

Ok, apparently I don’t understand how the restricted chase works in the multi-headed case. How is it that fresh(x,y) :- A(x) is not blockable but R(x,y),A(y) :- A(x) is?

On 23 Nov 2022, at 12:45, Larry González @.***> wrote:

I don't think so (maybe I interpreted your suggestion wrongly)

The problem is that rule2 -in the following code chunk- is not blockable

Let's consider the program

rule0: A(c) . rule1: R(y,x) :- R(x,y) . rule2: fresh(x,y) :- A(x) . rule3: R(x,y) :- fresh(x,y) . rule4: A(y) :- fresh(x,y) . rule5: fresh(x,y) :- R(x,y), A(y) .

here rule5 has a trigger only after fireing rule3 and rule4, which necessarily creates a new atom to fire rule2, repeting the cicle. Furthermore, the trigger for rule5 is not active

But this is considering only this particular example. It would still remain to discuss how this (new) transformation would be in the general case

— Reply to this email directly, view it on GitHub, or unsubscribe. You are receiving this because you modified the open/close state.

irina-dragoste commented 1 year ago

The problem is that rule2 -in the following code chunk- is not blockable here rule5 has a trigger only after fireing rule3 and rule4, which necessarily creates a new atom to fire rule2, repeting the cicle.

This is actually not true. Rule2 can be applied only after all Datalog rules (including rule5) have been saturated, creating {A(c), fresh(c,null), R(c,null), A(null), R(null,c), fresh(null,c)}. The last fact satisfies rule2 with x=null, and the restricted chase terminates.

irina-dragoste commented 1 year ago

Would also adding a Datalog rule fresh(x,y) :- R(x,y) A(y) resolve this?

Yes, here Carral et al. here propose the 2-way atomic decomposition (Definition 32) that preserves Datalog-first restricted chase termination (Theorem 37). This 2-way atomic decomposition solution to the multi-head bug would preserve termination, but potentially slow down reasoning, as it introduces new rules. Moreover, and the multiple-atom head of the original rule becomes a multiple-atom body of a new rule, which may require an expensive join.

It would be ideal to solve the multi-head bug without this workaround or rewriting rules. But if this is too difficult or time consuming to solve, I guess the 2-way atomic decomposition is the best solution

larry-gonzalez commented 1 year ago

This is actually not true...

Thanks for pointing this out. I did not considered Datalog-first restricted chase, which VLog implements

Would also adding a Datalog rule fresh(x,y) :- R(x,y) A(y) resolve this?

Yes. It would

CerielJacobs commented 1 year ago

Thank you for the clarification! It seems I’m re-inventing the wheel... I would like to solve the multi-head bug without the workaround, but at the moment don’t know how to do that. So, for now, I’ll implement the "2-way atomic decomposition”.

On 23 Nov 2022, at 16:00, Irina Dragoste @.***> wrote:

Would also adding a Datalog rule fresh(x,y) :- R(x,y) A(y) resolve this? Yes, here Carral et al. here propose the 2-way atomic decomposition (Definition 32) that preserves Datalog-first restricted chase termination (Theorem 37). So this 2-way atomic decomposition solution to the multi-head bug would preserve termination, but potentially slow down reasoning, as it introduces new rules. It would be ideal to solve the multi-head bug without this workaround or rewriting rules. But if this is too difficult or time consuming to solve, I guess the 2-way atomic decomposition is the best solution

— Reply to this email directly, view it on GitHub, or unsubscribe. You are receiving this because you modified the open/close state.

irina-dragoste commented 1 year ago

FYI, with the above commits, reasoning is now correct.