Open GoogleCodeExporter opened 9 years ago
Original comment by rodrigob...@gmail.com
on 7 Feb 2014 at 1:31
Hi Rodrigo,
I've gone over the current implementation in detail and have noted a couple of
discrepancies in
your pseudo-code description. I believe the following pseudo-code more
accurately reflects
what is currently going on:
ExhaustiveRewriter and RewriteOnce:
while there are changes
for each rewriter R in list
if R(E) != E then replace E by R(E) end loop
for each sub-expression E' in E (in depth-first)
if R(E') != E' then replace E' by R(E'), end both "for each" loops
NOTE: That is it rewrites the top expression first (see AbstractExpression's
main replace() logic).
TotalRewriter:
function rewrite(E)
while there are changes
E = totalRewrite(E, Rewriters)
function totalRewrite(E, Rewriters) :
E = R(E) for R in Rewriters until no R makes a change // i.e. rewriters are exhaustively applied under the current context
if E changed
return E // NOTE: This is to do with how the current AbstractExpression's main
// replace() logic works with respect to handling top expressions
otherwise
for each sub-expression E' in E // This is the main loop in AbstractExpression.replace()
E' = totalRewrite(E', Rewriters)
replace old E' in E with new E'
NOTE: I believe this already close to what you were proposing and possibly more
efficient
due to exhaustively applying all rewriters on an expression within a context
before returning. Also, all approaches are depth first if I understand
your pseudo-code correctly.
Let me know what you think.
Original comment by ctjoreilly@gmail.com
on 7 Feb 2014 at 9:09
Hi Ciaran,
Thanks for checking. I think the main difference you found between my
pseudo-code and the existing code, for both the Exhaustive/RewriteOnce
combination and the existing TotalRewriter, is that they process the current
expression before descending. However, when I wrote "for every sub-expression E
(in depth-first)", I meant an expression is a sub-expression of itself (in the
way a set is a subset of itself). Taking this interpretation into account, I
think things coincide again, no? In any case, your description is a lot less
ambiguous.
(I notice too that I wrote "while there are no changes" when it should have
been "while there are changes" in the top two chunks of pseudo-code.)
Yes, what I am proposing is pretty close to the current TotalRewriter. And you
are right too that it's important to do an exhaustive rewrite of the top
expression before descending; this was an overlook on my part. Taking this into
account we get a proposal:
function newTotalRewriter(E, Rewriters) : // using a name now for recursion
previous = E
while E == previous
while there are changes
E = R(E) for R in Rewriters the first such that R(E) != E.
replace each *immediate* sub-expression E' of E by newTotalRewriter(E', Rewriters) (under extended context, of course)
Actually, to make things more uniform, we can drop the explicit "previous"
variable and write:
function newTotalRewriter(E, Rewriters) : // using a name now for recursion
while there are changes
while there are changes
E = R(E) for R in Rewriters the first such that R(E) != E.
replace each *immediate* sub-expression E' of E by newTotalRewriter(E', Rewriters) (under extended context, of course)
The only difference between my proposal and the current TotalRewriter is that
in the latter, if we rewrite E, we go back to the top and descend again,
extending the contexts all over again. This is the part that I think we can get
an improvement on.
You are right that this is due to the replace() logic.
We want a rewriter that invokes replace() but only goes down to the immediate
subexpressions (and then calls reduce() recursively there, which will of course
go to the immediate sub-expression's immediate sub-expressions, but the
function itself only accesses one level down of sub-expressions). However,
replace, by default, goes down the entire expression tree.
We have two options. First, we can use the pruning mechanism to make sure the
replacement is pruned on the first level. A second option is to write a special
purpose version of replace() that does the same thing but with less overhead. I
think we should try the first option first and only start writing special
purpose code (which will duplicate a lot of replace() functionality,
particularly the part that controls replaced subExpressionAndContext objects,
and will need to be checked whenever replace() is updated) if we have to.
Here's a beginning of how we can do it:
public static class NewTotalRewriter extends AbstractRewriter {
private Rewriter base;
public NewTotalRewriter(List<Rewriter> rewriters) {
base = new ExhaustiveRewriter(new RewriteOnce(rewriters));
// in the above, we want to replace RewriteOnce by a version that uses the rewriter condition decision tree we wrote a while back;
// actually it would be nice to have a DecisionTreeRewriteOnce class encapsulating that functionality for easy re-use.
}
private PruningPredicateMaker truePruningPredicateMaker = new PruningPredicateMaker() {
public PruningPredicate apply(Expression expression, PruningPredicate pruningPredicate, ExpressionAndContext subExpressionAndContext) {
return AbstractExpression.TRUE_PRUNING_PREDICATE;
}
};
private rewriteReplacementFunction = new RewriterReplacementFunction(this);
public Expression rewrite(Expression expression, RewritingProcess process) {
expression = base.rewrite(expression, process);
// Now, recurses into immediate sub-expressions with instructions to ignore top expression,
// which has already been taken care of, and a false (null) pruning predicate, so they do get processed, BUT
// with a pruner predicate *maker* that will send a *true* pruner to the next level, thus not going beyond the first level.
Expression result = expression.replace(
rewriteReplacementFunction,
null, // ReplacementFunctionMaker makeSpecificSubExpressionAndContextReplacementFunction,
null, // PruningPredicate prunePredicate,
truePruningPredicateMaker, // PruningPredicateMaker makeSpecificSubExpressionAndContextPrunePredicate,
false, // boolean onlyTheFirstOne,
true, // boolean ignoreTopExpression, ***** IMPORTANT, ALREADY TOOK CARE OF TOP EXPRESSION *****
false, // boolean replaceOnChildrenBeforeTopExpression,
null, // TernaryProcedure<Expression, Expression, RewritingProcess> listener,
process);
return result;
}
}
Admittedly weird-looking but I think it's about right. Please take a look
yourself and see if it makes sense and works. Let me know if you have any
questions.
Thanks!
Rodrigo
Original comment by rodrigob...@gmail.com
on 7 Feb 2014 at 11:06
Just realized it needs a small correction, an exhaustive loop around rewrite:
public Expression rewrite(Expression expression, RewritingProcess process) {
Expression original = expression;
while (expression != original) {
expression = base.rewrite(expression, process);
// Now, recurses into immediate sub-expressions with instructions to ignore top expression,
// which has already been taken care of, and a false (null) pruning predicate, so they do get processed, BUT
// with a pruner predicate *maker* that will send a *true* pruner to the next level, thus not going beyond the first level.
Expression result = expression.replace(
rewriteReplacementFunction,
null, // ReplacementFunctionMaker makeSpecificSubExpressionAndContextReplacementFunction,
null, // PruningPredicate prunePredicate,
truePruningPredicateMaker, // PruningPredicateMaker makeSpecificSubExpressionAndContextPrunePredicate,
false, // boolean onlyTheFirstOne,
true, // boolean ignoreTopExpression, ***** IMPORTANT, ALREADY TOOK CARE OF TOP EXPRESSION *****
false, // boolean replaceOnChildrenBeforeTopExpression,
null, // TernaryProcedure<Expression, Expression, RewritingProcess> listener,
process);
}
return result;
}
}
Original comment by rodrigob...@gmail.com
on 7 Feb 2014 at 11:10
More corrections: I had forgotten RewriteOnce goes down the tree. I meant a
rewriter that goes over a list of rewriters and I applies the first one to an
expression. I guess we don't have that, it could be called
TopExpressionOnlyRewriteOnce, with a DecisionTreeTopExpressionOnlyRewriteOnce
version.
Original comment by rodrigob...@gmail.com
on 7 Feb 2014 at 11:18
Also, last line should be return expression, not return result. :)
Original comment by rodrigob...@gmail.com
on 7 Feb 2014 at 11:18
Argh, more corrections:
public Expression rewrite(Expression expression, RewritingProcess process) {
Expression previous = null;
while (expression != previous) {
previous = expression;
expression = base.rewrite(expression, process);
// Now, recurses into immediate sub-expressions with instructions to ignore top expression,
// which has already been taken care of, and a false (null) pruning predicate, so they do get processed, BUT
// with a pruner predicate *maker* that will send a *true* pruner to the next level, thus not going beyond the first level.
Expression result = expression.replace(
rewriteReplacementFunction,
null, // ReplacementFunctionMaker makeSpecificSubExpressionAndContextReplacementFunction,
null, // PruningPredicate prunePredicate,
truePruningPredicateMaker, // PruningPredicateMaker makeSpecificSubExpressionAndContextPrunePredicate,
false, // boolean onlyTheFirstOne,
true, // boolean ignoreTopExpression, ***** IMPORTANT, ALREADY TOOK CARE OF TOP EXPRESSION *****
false, // boolean replaceOnChildrenBeforeTopExpression,
null, // TernaryProcedure<Expression, Expression, RewritingProcess> listener,
process);
}
return expression;
}
Original comment by rodrigob...@gmail.com
on 7 Feb 2014 at 11:20
Hi Rodrigo,
As regards:
"More corrections: I had forgotten RewriteOnce goes down the tree. I meant a
rewriter that goes over a list of rewriters and I applies the first one to an
expression. I guess we don't have that, it could be called
TopExpressionOnlyRewriteOnce, with a DecisionTreeTopExpressionOnlyRewriteOnce
version."
TotalRewriter already uses CallRewriterDecisionTree which I believe does as you
describe, i.e. uses the tree and returns the first rewritten expression.
Original comment by ctjoreilly@gmail.com
on 8 Feb 2014 at 12:33
Rodrigo,
A couple of additional observations/questions on your proposed solution:
1. As its stands the pruning mechanism proposed to only visit immediate
sub-expressions will actually prune those as well, this is due to this logic in
AbstractExpression:
...
Iterator<ExpressionAndContext> subExpressionsAndContextsIterator =
result.getImmediateSubExpressionsAndContextsIterator(process);
while (subExpressionsAndContextsIterator.hasNext()) {
...
PruningPredicate prunePredicateForThisSubExpressionAndContext = prunePredicate;
if (makeSpecificSubExpressionAndContextPrunePredicate != null) {
prunePredicateForThisSubExpressionAndContext = makeSpecificSubExpressionAndContextPrunePredicate.apply(
this, prunePredicate, subExpressionAndContext);
if (prunePredicateForThisSubExpressionAndContext == TRUE_PRUNING_PREDICATE) {
continue;
}
}
though this could be handled by different pruner maker that takes into account
the expressions immediate subexpressions.
2. This looks like it will only visit the top and immediate sub-expressions,
how are descendant expressions of these intended to be visited as well?
...
Original comment by ctjoreilly@gmail.com
on 8 Feb 2014 at 12:58
Actually, ignore point 2, I see the recursion now :)
Original comment by ctjoreilly@gmail.com
on 8 Feb 2014 at 1:00
Hi Ciaran,
At gym at cell phone but quick reply:
Yes, i know TotalRewriter uses the trees,I was just talking about making ita
rewriter though may not be necessary.
Point 1: you are right,may still be done with extra level. Another option is go
straight to special purpose. Most pain there is keeping track of replaced Sub
expressions and contexts. Could abstract that functionality from replace
implementation and reuse it.
Thanks
Original comment by rodrigob...@gmail.com
on 8 Feb 2014 at 1:10
Hi Rodrigo,
Attached is a hacked up version of TotalRewriter that follows, I believe, the
approach you outlined. All the aic-expresso unit tests pass but one of the
aic-praise unit tests fails with it, so it likely has a bug. I ran the demo
e.g. 3 through it and unfortunately it took approx 25% longer to run on my
machine. Anyway, as I said, I just hacked this together so likely a flawed
implementation at the moment.
Original comment by ctjoreilly@gmail.com
on 8 Feb 2014 at 2:09
Attachments:
Hi Ciaran,
Interesting hack. :) Looks right to me. Perhaps the bug on that one failing
test is on the test itself (doing things in a different order may have
uncovered it).
The odd thing is its having worse performance. I have not been able to imagine
an example in which this would be expected to do worse than the original
TotalRewriter. Can you think of one?
One (unlikely) possibility is that this code you wrote is actually recursing
unnecessarily more than once over portions of the expression. Doesn't seem like
it to me but worth keeping in mind. Please keep me posted.
Cheers, have a good weekend.
Original comment by rodrigob...@gmail.com
on 8 Feb 2014 at 3:14
Attaching a cleaned up version of the TotalRewriter.java code. This uses a
smarter mechanism for determining if are visiting non-child rewriters. In
addition, the commented out experimental equivalence code has been removed (to
make it easier to read). Same performance issues and test failing in aic-praise
still occurring. Going to now investigate why aic-praise test fails due to this
change.
Original comment by ctjoreilly@gmail.com
on 26 Feb 2014 at 12:27
Attachments:
Hi Ciaran,
What do you mean by determining if are visiting non-child rewriters? I took a
look but it is hard to pinpoint something like that.
Thanks.
Original comment by rodrigob...@gmail.com
on 26 Feb 2014 at 8:36
Hi Rodrigo,
The line:
if (pruningPredicate == childPruningPredicate) {
essentially uses the pruning predicate created by the pruning predicate maker
to determine if the expression being visited is a child or descendant
expression (basically this works due to having implicit knowledge of how the
AbstractExpression.replace logic works). Any clearer?
Original comment by ctjoreilly@gmail.com
on 26 Feb 2014 at 9:54
I see. Yes. Still puzzled how this could be slower than the original. Perhaps a
good thing to do would be to identify a small example in which this happens,
for analysis. Thanks.
Original comment by rodrigob...@gmail.com
on 26 Feb 2014 at 10:03
Hi Rodrigo,
Ok, I believe I have figured out why the experimental TotalRewriter performs
poorer than the existing rewriter logic.
The failing example LBPTest.testSetDifference():
new SetDifferenceTestData(Expressions.TRUE.toString(), new
TrivialLoopyMisconceptionExample(),
"{ ( on A, A', B ) ( ([ m(A) ]), ([ if gA(A') and gB(B) then if m(A') then if m(B) then 10 else 1 else (if m(B) then 5 else 30) else 1 ]) ) | A != X and A' != B and (A = A' or A = B) and (A' != A or B != X) }",
"{ ( on B, A, B' ) ( ([ m(B) ]), ([ if gA(A) and gB(B') then if m(A) then if m(B') then 10 else 1 else (if m(B') then 5 else 30) else 1 ]) ) | X != B and A != B' and (B = A or B = B') and (A != X or B' != B) }",
false,
"{ ( on A) ( ([ m(A) ]), ([ if gA(X) and gB(A) then if m(X) then if m(A) then 10 else 1 else (if m(A) then 5 else 30) else 1 ]) ) | X != A }"),
highlights the issue. When run with the current TotalRewriter (which pops back
up to the top expression every time there is an exhaustive change to a single
sub-expression), R_simplify is called with the following expression:
+lbp.R_simplify(for all B'' : (for all A'' : (for all B' : X != B'' and A'' !=
B' and (B'' = A'' or B'' = B') and (A'' != X or B' != B'') => ( ([ m(A) ]), ([
if gA(A') and gB(B) then if m(A') then if m(B) then 10 else 1 else (if m(B)
then 5 else 30) else 1 ]) ) != ( ([ m(B'') ]), ([ if gA(A'') and gB(B') then if
m(A'') then if m(B') then 10 else 1 else (if m(B') then 5 else 30) else 1 ])
)))) - under context variables = [A, A', B, X], constrained by A != X and A' !=
B and (A = A' or A = B) and (A' != A or B != X)
which ends up calling R_quantifier_elimination with the following expression
and creating a simplified result:
for all B'' : (for all A'' : (for all B' : (X = B'' or A'' = B' or B'' != A''
and B'' != B' or A'' = X and B' = B'') or (A != B'' or (A' != A'' or B != B'))))
-->
A' = X and B = A
However, with the new/experimental Total Rewriter, because it exhaustively
rewrites all sub-expressions before moving back up to the parent expression the
inner quantified expressions are passed to R_quantifier_elimination from inside
out instead of from the outside in, as is demonstrated in the following
snippets from the trace:
for all B' : X = B'' or A'' = B' or B'' != A'' and B'' != B' or A'' = X and B'
= B'' or (A != B'' or (A' != A'' or B != B'))
-->
X = B'' or (B'' != A'' or (A'' = X or A != B'' or A' != A'' or A'' = B) and (A
!= B'' or A' != A'' or A'' = B or B = B'')) and (A'' = B'' or A'' = X or A !=
B'' or A' != A'' or B != B'')
for all A'' : X = B'' or (B'' != A'' or (A'' = X or A != B'' or A' != A'' or
A'' = B) and (A != B'' or A' != A'' or A'' = B or B = B'')) and (A'' = B'' or
A'' = X or A != B'' or A' != A'' or B != B'')
-->
X = B'' or (A != B'' or B != B'' or A' = B'' or A' = X) and ((A != B'' or A' !=
B'' or B'' = B) and (A != B'' or A' != B'' or B = B'') or (A != B'' or A' !=
B'' or B'' = B) and (A != B'' or A' != B'' or B = B'') and (A != B'' or A' !=
B'' or B != B''))
for all B'' : X = B'' or (A != B'' or B != B'' or A' = B'' or A' = X) and ((A
!= B'' or A' != B'' or B'' = B) and (A != B'' or A' != B'' or B = B'') or (A !=
B'' or A' != B'' or B'' = B) and (A != B'' or A' != B'' or B = B'') and (A !=
B'' or A' != B'' or B != B''))
-->
(A' = X or B != A or A' = A) and (A' != A or A = B) and (A' != A or B = A)
ending up returning a sub-optimal answer and performing a lot more processing.
The two attached trace files (trace_current.txt and trace_new.txt, which were
generated with no rewriters filtered in the trace output), provides full
details of the differences in the processing that occurs. Of particular
interest is the fact that the new trace file's size is > 4 times that of the
current trace generated.
The main issue appears to be that the R_quantifier_elimination logic works best
when going from the outside in when solving nested quantifiers, i.e.:
| if F is "for all x: Y"
|.... return R_normalize(R_card(|R_top_simplify(Y)|X, "for all") = |type(x)| )
| if F is "there exists x: Y"
|.... return R_normalize(R_card(|R_top_simplify(Y)|X, "there exists") > 0)
I've attached a new updated version of TotalRewriter that takes this knowledge
into account in that it will apply its intended exhaustive rewriting before
returning to a parent expression in all cases except when the expression is
quantified whereby it then behaves in the same manner as the existing
TotalRewriter logic. This new version now passes all the aic-expresso and
aic-praise tests and runs the PRAiSEDemoApp example 3 in the same amount of
time as the current rewriter. Unfortunately, while the slowdown has been
removed, no speed up has been achieved either.
Let me know what you think.
Original comment by ctjoreilly@gmail.com
on 27 Feb 2014 at 5:55
Attachments:
Hi Ciaran,
Wow, thanks for the great analysis!
It's still unclear to me why feeding quantifier elimination from inside-out
would be worse than the original outside-in. The reason I say that is because
the original outside-in works by recursively solving inner formulas, so it ends
up being inside-out too, no? Do you have some insight on this?
Rodrigo
Original comment by rodrigob...@gmail.com
on 28 Feb 2014 at 1:46
Hi Rodrigo,
Good point, looking closer the variation in the outputs appears to occur when
performing R_quantifier_elimination in the most inner quantified expression, in
the current logic this is passed:
+card.R_card(( (| { ( on B' ) ( B' ) | (X = B'' or A'' = B' or B'' != A'' and
B'' != B' or A'' = X and B' = B'') or (A != B'' or (A' != A'' or B != B')) }
|), 'for all' )) - under context variables = [A'', B'', A, A', B, X],
constrained by A != X and A' != B and (A = A' or A = B) and (A' != A or B != X)
but in the new logic this is passed:
+card.R_card(( (| { ( on B' ) ( B' ) | X = B'' or A'' = B' or B'' != A'' and
B'' != B' or A'' = X and B' = B'' or (A != B'' or (A' != A'' or B != B')) }
|), 'for all' )) - under context variables = [A'', B'', A, A', B, X],
constrained by A != X and A' != B and (A = A' or A = B) and (A' != A or B != X)
Note the subtle difference in the disjunction because the new rewriter has
already exhaustively rewritten the inner disjunction to be 5 disjuncts in this
case as opposed to 2 in the current logic (though equivalent) the
R_card_disjunction logic branches differently, in the old logic F1 and F2 are
disjunctions but in the new logic F1 ends up being (referring to pseudo-code):
if F1 contains all the disjuncts independent of X (not empty)
return R_normalize(if F1 then ||X|| else R_card(|F2|_X, quantification))
which upon exit from the rewriter ends up with a much more complex expression:
if X = B'' then 10 else (if B'' = A'' and (A'' != X and A = B'' and A' = A''
and A'' != B or A = B'' and A' = A'' and A'' != B and B != B'') or A'' != B''
and A'' != X and A = B'' and A' = A'' and B = B'' then 0 else 10)
than what is returned by the current logic:
if A = B'' and A' = A'' and X != B'' and A'' != B and (B'' = A'' or B'' = B)
and (A'' != X or B != B'') then 0 else 10
Let me know what you think and how you'd like to proceed.
Original comment by ctjoreilly@gmail.com
on 3 Mar 2014 at 10:57
I see, Ciaran, thanks.
Well, I can't think of a reason why the new TotalRewriter would produce a more
complicated expression like that.
It seems we have two things we could do now:
A - try to find out why this expression is more complicated. This would go by
understanding how it was done before and which "good steps" (that is, whatever
led to the more efficient representation) was taken before that is not being
taken now anymore, and whether that is due to the change in evaluation order or
if it is just a coincidence.
B - try to figure if this is an ad hoc situation without a deeper relation to
the new evaluation order, or if we get lots of tests in which the new order is
slower, and try to identify a common pattern there; for example, are all slower
tests involving slower quantifier elimination?
I thing B is easier than A.
Are these something you are interested in doing? If not, we could just sit on
it and do something else. In that case, it would probably be good to merge the
branch back with a configuration option for selecting between the two modes so
we can revisit this later (leaving it sitting on an unmaintained branch would
probably make it out of date fairly quick).
Please let me know what you would like to do.
Original comment by rodrigob...@gmail.com
on 4 Mar 2014 at 5:40
The experimental logic has been added into the trunk version of TotalRewriter,
Subversion revision r788. Disabled by default, see comments in code for further
details.
Original comment by ctjoreilly@gmail.com
on 27 Mar 2014 at 10:54
Original issue reported on code.google.com by
rodrigob...@gmail.com
on 7 Feb 2014 at 1:30