Closed huonw closed 9 years ago
Ok, I think I have some idea about what is causing this. I managed to get an rr trace of a failure due to:
test pow ... thread 'safe' panicked at 'assertion failed: !overlap(wp, xs + ys, yp, ys)', src/ll/mul.rs:191
I think the problem here is that the `num_pow_limbs` calculation is too precise, and doesn't account for a zero limb written as the most significant limb, e.g. consider `1 * 1` (all literals `u64`).
- `num_pow_limbs` will deduce that 1 limb is needed and so pointers are allocated with this much space (call the result one `p`)
- the multiplication then writes `1` at `p` and `0` at `p.offset(1)`
- but `p` only has size 1 so the latter is out of bounds

The corruption happens rarely because it requires just the right set-up:
- […] `num_pow_limbs` will definitely give enough space)

The `overlap` assertion failure seems to happen even more rarely, because it basically requires `xs + ys` to be exactly (or +/-1, not sure... in any case, very close) the size of a jemalloc size class, and `yp` to be allocated immediately after `wp`. The assertion failure I'm looking at now has `xs + ys == 768` (0x300).
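To make the failing assertion concrete, here is a minimal sketch of the interval check it implies. The `overlap` function below is a hypothetical stand-in, not ramp's actual helper: addresses are modelled as plain integers, lengths are assumed to be counted in 8-byte limbs, and the addresses themselves are made up.

```rust
/// Size of one limb in bytes (assumes u64 limbs).
const LIMB_BYTES: usize = 8;

/// Hypothetical stand-in for the `overlap` helper named in the assertion.
/// Returns true if the limb ranges [a, a + a_limbs) and [b, b + b_limbs)
/// share at least one byte. Addresses are plain integers here.
fn overlap(a: usize, a_limbs: usize, b: usize, b_limbs: usize) -> bool {
    let a_end = a + a_limbs * LIMB_BYTES;
    let b_end = b + b_limbs * LIMB_BYTES;
    a < b_end && b < a_end
}

fn main() {
    let (xs, ys) = (1, 1);
    let wp = 0x1000; // made-up address of the output buffer

    // Only 1 limb was reserved for wp, so the allocator is free to
    // place yp directly behind it.
    let yp_adjacent = wp + LIMB_BYTES;
    println!("adjacent: {}", overlap(wp, xs + ys, yp_adjacent, ys));

    // With xs + ys limbs reserved, yp can start no earlier than here.
    let yp_clear = wp + (xs + ys) * LIMB_BYTES;
    println!("clear: {}", overlap(wp, xs + ys, yp_clear, ys));
}
```

In this model the check only trips when `yp` sits right behind an under-sized `wp`, which would fit the observation that the assertion fires even more rarely than the corruption itself.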
I'll investigate this theory in more detail tomorrow.
(As extra evidence in favour of this, all the memory corruption I've seen has been strings being modified to have their first 8 bytes being 0.)
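The theory above can be sketched in safe Rust. This is not ramp's code: `mul_limbs` and `significant_limbs` are hypothetical names, and the routine is a plain schoolbook multiplication assumed to behave like the low-level one in that it always writes `xs + ys` limbs. With slices the undersized buffer would panic on the bounds check; with raw pointers the same write would silently land in the neighbouring allocation.

```rust
/// Schoolbook multiplication of little-endian u64 limbs (illustrative only).
/// Always writes exactly x.len() + y.len() limbs into `out`, even when the
/// most significant one turns out to be zero.
fn mul_limbs(out: &mut [u64], x: &[u64], y: &[u64]) {
    let n = x.len() + y.len();
    assert!(out.len() >= n, "output buffer too small");
    for limb in out[..n].iter_mut() {
        *limb = 0;
    }
    for (i, &xi) in x.iter().enumerate() {
        let mut carry: u128 = 0;
        for (j, &yj) in y.iter().enumerate() {
            // Cannot overflow: (2^64 - 1)^2 + 2 * (2^64 - 1) == 2^128 - 1.
            let t = (xi as u128) * (yj as u128) + (out[i + j] as u128) + carry;
            out[i + j] = t as u64;
            carry = t >> 64;
        }
        // The top limb of this row is written unconditionally.
        out[i + y.len()] = carry as u64;
    }
}

/// Number of limbs left after stripping high zero limbs.
fn significant_limbs(x: &[u64]) -> usize {
    x.iter().rposition(|&l| l != 0).map_or(0, |i| i + 1)
}

fn main() {
    // Pre-fill with a sentinel so the write of the zero limb is visible.
    let mut out = [u64::MAX; 2];
    mul_limbs(&mut out, &[1], &[1]);
    println!("written: {:?}", out); // prints "written: [1, 0]"
    println!("significant limbs: {}", significant_limbs(&out));
}
```

The result needs only one significant limb, but two are written, and the extra one is a zero `u64`, i.e. 8 zero bytes, which is consistent with the string corruption described above.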
Quickcheck runs will very occasionally fail with an error like
which appears to be caused by the `&str` being parsed having been corrupted (there's no way the generator will generate an incorrect digit). I managed to catch this with rr with an opt+debuginfo build, which seems to point the finger at https://github.com/Aatch/ramp/blob/8c6676202ab416a636d5f9079e439fca442e6982/src/ll/mul.rs#L420-L422 but the optimisations mean a lot of stuff is optimised out.

The following is the exact `pow` call in the trace I've found. The corruption appears to occur when `exp` is 1 in https://github.com/Aatch/ramp/blob/8c6676202ab416a636d5f9079e439fca442e6982/src/ll/pow.rs#L70-L94.

However, this doesn't seem to reproduce any corruption by itself.
Backtrace:
I'm working on this, including trying to get a better capture, via: