Closed izgzhen closed 6 years ago
Currently buggy for cozy examples/maxbag.ds --java examples/MaxBag.java
:
Traceback (most recent call last):
File "/home/zgzhen/repos/cozy/cozy/jobs.py", line 29, in _run
self.run()
File "/home/zgzhen/repos/cozy/cozy/synthesis/high_level_interface.py", line 85, in run
ops=self.ops)):
File "/home/zgzhen/repos/cozy/cozy/synthesis/core.py", line 194, in improve
blacklist=blacklist):
File "/home/zgzhen/repos/cozy/cozy/synthesis/core.py", line 349, in search_for_improvements
for info in enum.enumerate_with_info(size=size, context=ctx, pool=pool):
File "/home/zgzhen/repos/cozy/cozy/synthesis/enumeration.py", line 580, in enumerate_with_info
yield from self._enumerate_with_info(context, size, pool)
File "/home/zgzhen/repos/cozy/cozy/synthesis/enumeration.py", line 603, in _enumerate_with_info
e = next(queue)
File "/home/zgzhen/repos/cozy/cozy/synthesis/enumeration.py", line 506, in _enumerate_core
for f in build_lambdas(e1_singleton, pool, sz2):
File "/home/zgzhen/repos/cozy/cozy/synthesis/enumeration.py", line 393, in build_lambdas
for lam_body in self.enumerate(inner_context, body_size, pool):
File "/home/zgzhen/repos/cozy/cozy/synthesis/enumeration.py", line 537, in enumerate
for info in self.enumerate_with_info(context, size, pool):
File "/home/zgzhen/repos/cozy/cozy/synthesis/enumeration.py", line 580, in enumerate_with_info
yield from self._enumerate_with_info(context, size, pool)
File "/home/zgzhen/repos/cozy/cozy/synthesis/enumeration.py", line 634, in _enumerate_with_info
to_keep = eviction_policy(e, context, prev_exp, context, pool, cost_model)
File "/home/zgzhen/repos/cozy/cozy/synthesis/enumeration.py", line 237, in eviction_policy
ordering = cost_model.compare(new_exp, old_exp, context, pool)
File "/home/zgzhen/repos/cozy/cozy/cost_model.py", line 196, in compare
lambda: order_objects(e1.size(), e2.size()))
File "/home/zgzhen/repos/cozy/cozy/cost_model.py", line 66, in prioritized_order
o = f()
File "/home/zgzhen/repos/cozy/cozy/cost_model.py", line 195, in <lambda>
lambda: self._compare(storage_size(e1, self.freebies), storage_size(e2, self.freebies), context),
File "/home/zgzhen/repos/cozy/cozy/cost_model.py", line 229, in storage_size
return h.storage_size(e, storage_size=storage_size)
File "/home/zgzhen/repos/cozy/cozy/structures/heaps.py", line 195, in storage_size
return storage_size(heap_elems(e).with_type(TBag(e.type.elem_type)))
File "/home/zgzhen/repos/cozy/cozy/structures/heaps.py", line 90, in heap_elems
raise NotImplementedError(repr(e))
NotImplementedError: EVar('_var77974').with_type(TMaxHeap(TInt(), TInt()))
some related comments: https://github.com/izgzhen/cozy/commit/4afa4c9bc86ab2d847c1d4e9c36bb1665dce6ed3
for the exception at hand, it is related to storage_size
. I think I am stuck now.
squash and updated unit tests
it seems like unit test failed this time because we can't generate code for TMinHeap (and probably for other similar constructs as well)
@Calvin-L I think it is not trivial to fix the unit test -- how did you generate the expressions in test?
The big expressions in the test code were copy-pasted from failing cases I encountered "in the wild". That habit led to a lot of useful regression tests, but makes the tests difficult to maintain. My advice is:
I added a ZERO-valued argument to EHeapElems in tests, and it works.
The unit test snippet is too large for me to comprehend, and I can't add anything else except for ZERO.
I think we need to confirm the following things before merging this:
heap_func
and my heap_elems
are partial -- there is NotImplementedError
inside it.check_encode
etc. implies.Also, I am not confident about the fact that filling the second parameter with ZERO
is the right thing to do
encode
doesn't use the second argument of new EHeapElems
I added a ZERO-valued argument to EHeapElems in tests, and it works. Also, I am not confident about the fact that filling the second parameter with ZERO is the right thing to do
I have not confirmed, but I think the test only verifies that it can generate code that compiles, not that the generated code actually works. So, using zero produces compile-able code, but it would not function.
heap_func and my heap_elems are partial -- there is NotImplementedError inside it.
This is fine for now.
Is the object returned from EMakeMaxHeap/EMakeMinHeap a "heap" or "Heap array"?
It is both. It is a heap, and at code generation time, heaps are represented with arrays.
For the unit tests, we might need docstring for how to maintain them and what does check_encode etc. implies.
I agree!
encode doesn't use the second argument of new EHeapElems
That is fine. encode
is for the semantics of heaps, which don't really care that they happen to be represented as arrays.
I have not confirmed, but I think the test only verifies that it can generate code that compiles, not that the generated code actually works. So, using zero produces compile-able code, but it would not function.
I suspect so as well. However, I figured that even if I crashed the generated code, I don't know what might be the correct value to fill in either (not apparent at all!).
let me try to construct another test with EHeapElems for codegen.py
@Calvin-L I've replaced the original regression01 with synthesized result from maxbag
@Calvin-L I believe this PR is ready for merging now (after build), though I have a few more follow-up work in mind.
we might need two versions of heap_elems
-- one for code-gen purpose, one for solver/eval purpose
We already have two versions. heap_elems
computes the set of elements for internal purposes. EHeapElems
produces an expression for code generation.
I reverted the heap_elems
hack and revised check_wf
it should be easy to just check out this PR and reproduce the storage_size
error by synthesizing maxbag
example
Cozy is nondeterministic. Whether it happens to discover a particular query or solution depends on how fast your computer runs and how many threads it can support efficiently. I was not able to discover the error after several tries. If you see it consistently, could you send me the log file from /tmp/
that throws the error? Those files include deterministic calls to improve
that can be reproduced on other machines.
@Calvin-L this is the log for the first failing query:
Traceback (most recent call last):
File "/home/zgzhen/projects/cozy/cozy/jobs.py", line 29, in _run
self.run()
File "/home/zgzhen/projects/cozy/cozy/synthesis/high_level_interface.py", line 85, in run
ops=self.ops)):
File "/home/zgzhen/projects/cozy/cozy/synthesis/core.py", line 196, in improve
blacklist=blacklist):
File "/home/zgzhen/projects/cozy/cozy/synthesis/core.py", line 351, in search_for_improvements
for info in enum.enumerate_with_info(size=size, context=ctx, pool=pool):
File "/home/zgzhen/projects/cozy/cozy/synthesis/enumeration.py", line 580, in enumerate_with_info
yield from self._enumerate_with_info(context, size, pool)
File "/home/zgzhen/projects/cozy/cozy/synthesis/enumeration.py", line 603, in _enumerate_with_info
e = next(queue)
File "/home/zgzhen/projects/cozy/cozy/synthesis/enumeration.py", line 506, in _enumerate_core
for f in build_lambdas(e1_singleton, pool, sz2):
File "/home/zgzhen/projects/cozy/cozy/synthesis/enumeration.py", line 393, in build_lambdas
for lam_body in self.enumerate(inner_context, body_size, pool):
File "/home/zgzhen/projects/cozy/cozy/synthesis/enumeration.py", line 537, in enumerate
for info in self.enumerate_with_info(context, size, pool):
File "/home/zgzhen/projects/cozy/cozy/synthesis/enumeration.py", line 580, in enumerate_with_info
yield from self._enumerate_with_info(context, size, pool)
File "/home/zgzhen/projects/cozy/cozy/synthesis/enumeration.py", line 634, in _enumerate_with_info
to_keep = eviction_policy(e, context, prev_exp, context, pool, cost_model)
File "/home/zgzhen/projects/cozy/cozy/synthesis/enumeration.py", line 237, in eviction_policy
ordering = cost_model.compare(new_exp, old_exp, context, pool)
File "/home/zgzhen/projects/cozy/cozy/cost_model.py", line 196, in compare
lambda: order_objects(e1.size(), e2.size()))
File "/home/zgzhen/projects/cozy/cozy/cost_model.py", line 66, in prioritized_order
o = f()
File "/home/zgzhen/projects/cozy/cozy/cost_model.py", line 195, in <lambda>
lambda: self._compare(storage_size(e1, self.freebies), storage_size(e2, self.freebies), context),
File "/home/zgzhen/projects/cozy/cozy/cost_model.py", line 229, in storage_size
return h.storage_size(e, storage_size=storage_size)
File "/home/zgzhen/projects/cozy/cozy/structures/heaps.py", line 195, in storage_size
return storage_size(heap_elems(e).with_type(TBag(e.type.elem_type)))
File "/home/zgzhen/projects/cozy/cozy/structures/heaps.py", line 90, in heap_elems
raise NotImplementedError(repr(e))
NotImplementedError: EVar('_var78661').with_type(TMaxHeap(TInt(), TInt()))
Great, I had no trouble reproducing the issue! The situation is a little complicated, but I'll try to explain.
During synthesis, Cozy needs to enumerate let-expressions. It does so by recursively enumerating all expressions in a world where there is an extra variable; for instance, if we are building a let-expression like let h = MakeHeap(...) in ___
, Cozy enumerates expressions with a free heap-type variable h
.
While doing the recursive enumeration, Cozy also needs to estimate costs for expressions. That is how we get down the path to calling storage_size
when there is a free heap-type variable with no information about it.
For let-expressions it is clear what we can do: pass information down so that storage_size
can see all let-bound variables. Unfortunately that won't work here, because Cozy also uses this same mechanism to build lambda-expressions for functional maps and filters. In that case there there is no one value for the free variable; it is any member of a given collection, as in Map(bag_of_heaps, \h -> ___)
.
It is clear to me that heap_elems
is probably the wrong design. Here are some possible alternatives I am considering:
EHeapElems
should be a "specification-only" primitive that does not take a length parameter. This is blocked on #67, which I just opened, and will require a lot of work.*Peek
operations as well.*Peek
operations as well.EHeapElems
tends to disappear if you run synthesis long enough.Enumerating elements of a heap will not be an operation we support, in which case EHeapElems should be a "specification-only" primitive that does not take a length parameter.
Heap is internal to synthesizer now -- do you mean EHeapElems should be solver/eval-only, and never leaked out to code-generation stage?
let's close this PR for now and move to the issue for discussing the new proposals :)
Yes, I do mean that EHeapElems would be solver/eval only, and should never appear in final code.
"Specification-only" includes user-written specifications and Cozy's own internal specifications for new subgoals it creates. It might also include concretization functions.
WIP