vmware / differential-datalog

DDlog is a programming language for incremental computation. It is well suited for writing programs that continuously update their output in response to input changes. A DDlog programmer does not write incremental algorithms; instead they specify the desired input-output mapping in a declarative manner.
MIT License
1.37k stars 117 forks source link

Programs should not be always evaluated in the starting state #221

Open mihaibudiu opened 5 years ago

mihaibudiu commented 5 years ago

At least not in a decent time.

import intern
import souffle_lib
import souffle_types
relation Rdummy(x: Tnumber)
Rdummy(0).
typedef TVar = IString
relation Rconf(_num:Tnumber)
Rconf((_x + (1:bit<32>))) :- Rconf(_x), not Rlim(_x).
Rconf((0:bit<32>)).
relation Rlim(_x:Tnumber)
input relation Rlim_shadow(_x:Tnumber)
Rlim(_x) :- Rlim_shadow(_x).
relation Rvar_index(_var:TVar, _index:Tnumber)
Rvar_index(string_intern("a"), (0:bit<32>)).
Rvar_index(string_intern("b"), (1:bit<32>)).
Rvar_index(string_intern("c"), (2:bit<32>)).
relation Rtranslation(_conf:Tnumber, _var:TVar, _truthy:Tnumber)
Rtranslation(_n, _var, (1:bit<32>)) :- Rvar_index(_var, _k), Rconf(_n), ((((pow32((2:bit<32>), _k)) & _n)) != (0:bit<32>)).
Rtranslation(_n, _var, (0:bit<32>)) :- Rvar_index(_var, _k), Rconf(_n), ((((pow32((2:bit<32>), _k)) & _n)) == (0:bit<32>)).
relation Rcircuit_and(_result:TVar, _op1:TVar, _op2:TVar)
input relation Rcircuit_and_shadow(_result:TVar, _op1:TVar, _op2:TVar)
Rcircuit_and(_result, _op1, _op2) :- Rcircuit_and_shadow(_result, _op1, _op2).
relation Rcircuit_or(_result:TVar, _op1:TVar, _op2:TVar)
input relation Rcircuit_or_shadow(_result:TVar, _op1:TVar, _op2:TVar)
Rcircuit_or(_result, _op1, _op2) :- Rcircuit_or_shadow(_result, _op1, _op2).
relation Rcircuit_not(_result:TVar, _op1:TVar)
input relation Rcircuit_not_shadow(_result:TVar, _op1:TVar)
Rcircuit_not(_result, _op1) :- Rcircuit_not_shadow(_result, _op1).
relation Rcircuit_result(_result:TVar)
input relation Rcircuit_result_shadow(_result:TVar)
Rcircuit_result(_result) :- Rcircuit_result_shadow(_result).
relation Revaluate(_conf:Tnumber, _var:TVar, _truthy:Tnumber)
Revaluate(_conf, _var, _value) :- Rtranslation(_conf, _var, _value).
Revaluate(_conf, _var, (_v1 & _v2)) :- Rcircuit_and(_var, _op1, _op2), Revaluate(_conf, _op1, _v1), Revaluate(_conf, _op2, _v2).
Revaluate(_conf, _var, (_v1 | _v2)) :- Rcircuit_or(_var, _op1, _op2), Revaluate(_conf, _op1, _v1), Revaluate(_conf, _op2, _v2).
Revaluate(_conf, _var, lnot(_v1)) :- Rcircuit_not(_var, _op1), Revaluate(_conf, _op1, _v1).
output relation Rsatisfy(_conf:Tnumber, _var:TVar)
Rsatisfy(_conf, _inputvar) :- Rcircuit_result(_var), Revaluate(_conf, _var, (1:bit<32>)), Rvar_index(_inputvar, _), Rtranslation(_conf, _inputvar, (1:bit<32>)).

for this input file:

start;
insert Rlim_shadow(8),
insert Rcircuit_and_shadow("x","a","b"),
insert Rcircuit_or_shadow("y","x","c"),
insert Rcircuit_not_shadow("z","y"),
insert Rcircuit_result_shadow("z"),
commit;
dump Rsatisfy;
exit;
ryzhyk commented 5 years ago

In the initial state rconf will contain 2 ^32 values which is probably why it takes so long

mihaibudiu commented 5 years ago

Lim is 8

ryzhyk commented 5 years ago

Only after the first transaction

mihaibudiu commented 5 years ago

This is bad. I wonder whether you can start evaluation only at the first commit.

ryzhyk commented 5 years ago

Then tests that don't make any commits will fail.

mihaibudiu commented 5 years ago

We could add a flag to ddlog to differentiate between these two cases. Frankly, for an incremental engine that case is rather extreme

mihaibudiu commented 5 years ago

I have renamed this issue; we should give users the option not to evaluate programs in the initial state.

convolvatron commented 3 years ago

this has interactions with d3log also 1) every instance performs this evaluation whether or not its started late 2) initial evaluations that contain locality may have additional correctness problems

i dont know that we have to change this, just that its making me uncomfortable

Kixiron commented 3 years ago

I don't think it matters, each instance has the same starting state and as long as timestamps are obeyed the state of all nodes will become consistent