Closed jpziegler closed 6 years ago
Here is another variant of the example, using fewer language features:
foo = z^z
where
as = [0x0] # as
a = as@0
b = a^a
c = b^b
d = c^c
e = d^d
f = e^e
g = f^f
h = g^g
i = h^h
j = i^i
k = j^j
l = k^k
m = l^l
n = m^m
o = n^n
p = o^o
q = p^p
r = q^q
s = r^r
t = s^s
u = t^t
v = u^u
w = v^v
x = w^w
y = x^x
z = y^y
Tweaking the bounds on the i <- [0 .. 31]
clause pretty clearly indicate that this program is spending exponential time doing whatever it's doing. I'll try to track down why.
Good news, this turns out to be a fairly straightforward loss-of-sharing problem. Adding a thunk in the appropriate place turns this back into a very quick linear-time algorithm.
I'm now trying to determine if there are additional places like this where desirable sharing is being lost.
@jpziegler, please reopen if this doesn't fix your non-minimized issue.
I think we have another example. This runs fast in an older (2.4-series) Cryptol, but slows down exponentially in newer versions (2.5-series).
The mod_pow
function can be found in the following file from examples
:
cryptol/examples/MiniLock/prim/mod_arith.cry
cryptol> mod_pow (~0, ~0, ~0:[17])
0x00000
cryptol> mod_pow (~0, ~0, ~0:[18])
0x00000
cryptol> mod_pow (~0, ~0, ~0:[19])
0x00000
cryptol> mod_pow (~0, ~0, ~0:[20])
0x00000
The first takes about 1 second, then it doubles from there.
I just bisected this. The first bad revision is cbdbe35a89c15432fe9c0f717adb3bc30365e7cf.
Reverting the change to the ETuple
case of evalExpr
(replacing let xs = map eval es
with xs <- mapM (delay Nothing . eval) es
) fixes the exponential slowdown. (If we revert this, we should also revert the similar line on the ERec
case.)
Since I don't understand the original reason for the change, I'll let @robdockins comment on whether it's a good idea to revert this.
I believe I have encountered another issue regarding the exponential slowdown. Here is an example snippet of code:
permute : [4][8] -> [4][8]
permute vec = rets!0 where
rets = [vec] # [ [r3 ^ r0 ^ r2,
r0,
r1,
r2] | n <- [0..30] | [r0, r1, r2, r3] <- rets]
Running permute [1, 2, 3, 4]
takes around 5 seconds when n <- [0..30]
, and takes around 10 seconds when n <- [0..31]
.
I should note that this is in version 70a4728. I'm not positive as to whether this issue was patched with commit 134d8cc on March 7.
Both appear to be instantaneous with current Cryptol (984012a)
Here's a (weird) minimized example demonstrating the issue.
ys
takes a long time to calculate. Lots of simple "optimizations" make this execute very quickly, so this was as small as I could get while still demonstrating the behavior.Older versions of Cryptol don't have the slow-down problem.
Example slow version:
13fef57
Example fast version:054a3e2
Not sure if those version hashes are useful, but I thought I'd include them just in case.