karmaresearch / vlog

Apache License 2.0
55 stars 9 forks source link

Wrong RMFA classification 2 #85

Closed irina-dragoste closed 2 years ago

irina-dragoste commented 3 years ago

R(x,y),D(y) :- C(x) . S(x,y), E(y) :- D(x) . V(x,y), C(y) :- E(x) . R(z,x) :- S(x,y), V(y,z) .

This rule set should be classified as RMFA, but the current code does not.

CerielJacobs commented 2 years ago

@irina-dragoste are you sure that this rule-set is RMFA? I'm looking at the code for days now, and I've been trying it by hand, but get a cyclic term. I'm probably mis-understanding something.

CerielJacobs commented 2 years ago

Here is a summary of my by-hand attempt.

0: R(X,Y), D(Y) :- C(X) 1: S(X,Y), E(Y) :- D(X) 2: V(X,Y), C(Y) :- E(X) 3: R(Z,X) :- S(X,Y), V(Y,Z)

Rules 0, 1, and 2 are generating.

Critical instance: { R(*,*), D(*), C(*), S(*,*), E(*), V(*,*)}

Materialization on the critical instance:

Applying rule 3 gives nothing new.

Applying rule 0: Since we have C(*), is it blocked with X->*? Let's see. Using a fresh constant c1: X->c1, compute R_datalog(C(c1)). Not blocking. Gives additional facts R(*,f0(*)) and D(f0(*)) where f0 is the skolem function for variable Y in rule 0.

Applying rule 3 gives nothing new.

Applying rule 1: Since we have D(*): blocked with X->*? No. Gives additional facts S(*,f1(*)) and E(f1(*)) where f1 is the skolem function for variable Y in rule 1.

Since we have D(f0(*)): blocked with X->f0(*)? Compute R_datalog({D(f0(c1)), R(c1,f0(c1)), C(c1)}) Not blocking. Gives additional facts S(f0(*),f1(f0(*))) and E(f1(f0(*)))

Applying rule 3 gives nothing new.

Applying rule 2: Since we have E(*), is rule 2 blocked with X->*? No. Gives additional facts V(*,f2(*)) and C(f2(*)) where f2 is the skolem function for variable Y in rule 2.

Since we have E(f1(*)): is rule 2 blocked with X->f1(*)? Compute R_datalog({E(f1(c1), S(c1,f1(c1)), D(c1)}) Not blocking. Gives additional facts V(f1(*),f2(f1(*))), C(f2(f1(*)))

Since we have E(f1(f0(*))): is rule 2 blocked with X->f1(f0(*))? Compute R_datalog({E(f1(f0(c1))), S(f0(c1),f1(f0(c1))), D(f0(c1)), R(c1,f0(c1)), C(c1)}). Not blocking. Gives additional facts V(f1(f0(*)),f2(f1(f0(*)))) and C(f2(f1(f0(*)))).

Applying rule 3: we now have a.o. S(*,*), S(*,f1(*)), S(f0(*),f1(f0(*))), V(*,f2(*)), V(f1(*),f2(f1(*))), V(f1(f0(*)),f2(f1(f0(*)))) Giving: R(f2(f1(*)),*), R(f2(f1(f0(*))),f0(*)), R(f2(*),*)

At this stage, we have the following facts:

R(*,*), R(*,f0(*)), R(f2(f1(*)),*), R(f2(f1(f0(*))),f0(*)), R(f2(*),*) D(*), D(f0(*)) C(*), C(f2(*)), C(f2(f1(*))), C(f2(f1(f0(*)))) S(*,*), S(*,f1(*)), S(f0(*),f1(f0(*))) E(*), E(f1(*)), E(f1(f0(*))) V(*,*), V(*,f2(*)), V(f1(*),f2(f1(*))), V(f1(f0(*)),f2(f1(f0(*))))

Applying rule 0 again: Since we have C(f2(*)), is rule 0 blocked with X->f2(*)? Compute R_datalog({C(f2(c1)), V(c1, f2(c1)), E(c1)}) Not blocking. Gives additional facts R(f2(*),f0(f2(*))), D(f0(f2(*)))

....

Applying rule 3 gives nothing new.

Applying rule 1 again: Since we have D(f0(f2(*))), is rule 1 blocked with X->f0(f2(*))? Compute R_datalog({D(f0(f2(c1))), C(f2(c1)) R(f2(c1),f0(f2(c1))),E(c1),V(2,f2(c1))}) Not blocking. Gives additional facts: S(f0(f2(*)),f1(f0(f2(*)))), E(f1(f0(f2(*))))

...

Applying rule 2 again: Since we have E(f1(f0(f2(*)))), is rule 2 blocked with X->f1(f0(f2(*)))? Compute R_datalog({E(f1(f0(f2(c1)))), S(f0(f2(c1)),f1(f0(f2(c1)))), D(f0(f2(c1))), R(f2(c1),f0(f2(c1))), C(f2(c1)), V(c1,f2(c1)), E(c1)} Not blocking. Gives additional facts: V(f1(f0(f2(*))),f2(f1(f0(f2(*))))), C(f2(f1(f0(f2(*))))) Circular terms.

So, if the rule-set is RMFA, I have a mistake somewhere.

irina-dragoste commented 2 years ago

I am sorry, you are right, and your by-hand attempt is correct. This example is not RMFA. (This is an example where the restricted chase would actually terminate on any given instance, because it will not produce cycles with depth more than 1, but no cycles are accepted by the RFMA definition.)

An observation that might impact performance, as I see that in your hand-written example, you usually attempt to apply rule 3 before others: when checking acyclicity, we can attempt any application order (as long as the rule and substitution pair satisfy the conditions imposed by the acyclicity notions), and would obtain the same graph/fact set. So we don't need to always saturate Datalog rules first.