Closed mudbri closed 1 year ago
def enhancedMinimization(Program P):
signatureBuckets = {}
minimizedProgram = empty program
for rule in P:
signature = getSignature(rule)
signatureBuckets[signature].add(rule)
for bucket in signatureBuckets:
for all pair of rules (r_1, r_2) in bucket:
unify(r_1,r_2)
add all unified rules in bucket to minimizedProgram
return minimizedProgram
Implemented in commit 7bff052fbfba1b2274888712b1f986407b5642fa
We have the standard datalog minimization from the "Optimizing Datalog Programs" paper, Faure-log minimization, and our own rule unification algorithm to merge two rules into one rule. Now, we want to figure out how to integrate Rule Unification with Datalog Minimization. The questions are: (1) When to execute rule unification, (2) How to optimally execute it (to avoid redundant computation) and (3) When to end the minimization algorithm. Current ideas:
(1) We should execute rule unification after the Faurelog minimization. This is because during rule unification, we make all conditions global. After doing so, it is unclear how we can execute Faurelog minimization (2) We need to test all rules that can be unified. To do so, we can assign a signature to each rule, where the same signature means that it might be possible to combine the two rules. The signature should depend on (1) The head having the same table and the same constants in the same place and (2) The tables in the body being the same. Once we have the signature for all rules (O(n) process, where we have n number of rules), we can place all rules with the same signature in the same bucket. Thereafter, all rules in the same bucket are checked against each other for possible unification. (3) The algorithm should stop once for every bucket, all rules in the bucket have been considered for unification for all other rules in the bucket.