arlolra / otr

Off-the-Record Messaging Protocol implemented in JavaScript
https://arlolra.github.io/otr/
Mozilla Public License 2.0
458 stars 61 forks source link

bigint.js: bpe is probably smaller than intended #41

Closed daira closed 11 years ago

daira commented 11 years ago

https://github.com/arlolra/otr/blob/master/vendor/bigint.js#L206 has the following code:

for (bpe = 0; (1<<(bpe+1)) > (1<<bpe); bpe++);  // bpe = number of bits in the mantissa on this platform
bpe>>=1; // bpe = number of bits in one element of the array representing the bigInt

From the comments, the author appears to expect bpe to end up as half of the number of significant bits in a floating point mantissa or fraction, rounded down. It does not; it ends up as 15.

Explanation:

The number of significant bits in the fraction (the correct term used in IEEE) of a JavaScript floating-point number is 52, independent of platform. (Whether you can actually use all the bits depends on some corner cases and I won't make any claims about that here.)

The code above calculates bpe = 30 in the first line. This is because the left-shift operator treats its left argument as a signed 32-bit integer, as per section 11.7.1 of the ECMAScript 5.1 spec. http://www.ecma-international.org/ecma-262/5.1/ . That is, 1<<(bpe+1) overflows to negative at bpe = 30. It then calculates bpe = 15 in the second line.

This does not affect correctness but it does make the representation unnecessarily inefficient, and in any case the comment is wrong.

daira commented 11 years ago

Note that other parts of the code may well be depending on bit operations that treat values as 32-bit (signed or unsigned) integers; I haven't checked for that.

arlolra commented 11 years ago

I agree that the comments are misleading but L187 suggests that the author is aware of JS' limitations and needs multiplication that won't overflow,

increase bpe from 15 to 30 (that would help if we had a 32*32->64 multiplier, but not with JavaScript's 32*32->32)
daira commented 11 years ago

It suggests the opposite to me; that the author is not aware of JS' capabilities. 26x26->52 bit multiplications work fine.

daira commented 11 years ago
js> x = Math.pow(2, 26)
67108864
js> x*x
4503599627370496
js> Math.pow(2, 52)
4503599627370496

(In practice you wouldn't use Math.pow, and << is problematic, but the latter can be worked around.)

daira commented 11 years ago

(Yes, I know that the above code doesn't prove that all multiplications of numbers with fewer bits work, but in fact they do.)

arlolra commented 11 years ago

Fair enough. rightShift_ needs to be rewritten because it overflows.

x[i]=mask & ((x[i+1]<<(bpe-n)) | (x[i]>>n));

https://github.com/arlolra/otr/blob/master/vendor/bigint.js#L1193

There may be others.

arlolra commented 11 years ago

https://github.com/arlolra/otr/commit/9846307942ad59b7c629134331abefdde7c475a7

daira commented 11 years ago

It doesn't overflow in a way that matters when bpe <= 31. (So bpe = 26 is fine.)

daira commented 11 years ago

BTW, that's doing a right-rotate, not a right-shift. scratch that, I missed the [i+1].

arlolra commented 11 years ago

It doesn't overflow in a way that matters when bpe <= 31. (So bpe = 26 is fine.)

Hmm, true. Alright, ignore that.

Unfortunately, hardcoding bpe = 26 sends the test suite into an infinite loop when generating DSA keys. millerRabin seems to always be returning composite.

daira commented 11 years ago

The call to rightShift_(mr_r,s) in millerRabin doesn't obviously satisfy the stated precondition 0 <= n < bpe (where n = s here).

arlolra commented 11 years ago

Yeah, but that comment and the actual implementation of rightShift_ are at odds.

daira commented 11 years ago

Oh, you're right (another comment bug). Note that rightShift_ doesn't correctly handle shift distances greater than the bit length of x; I don't know whether that matters.

arlolra commented 11 years ago

Since operands to bitwise operators are converted to 32-bit int, this'll be a problem, no? https://github.com/arlolra/otr/blob/master/vendor/bigint.js#L1412

If bpe = 26, c=s0[2*i]+x[i]*x[i] can get truncated when c>>=bpe.

arlolra commented 11 years ago

This does not affect correctness but it does make the representation unnecessarily inefficient, and in any case the comment is wrong.

By inefficient, do you mean performant? It'd be nice to benchmark this because I think the current constraint keeps the values as SMIs, which employ the tagging trick in the VM that avoids a lot of memory allocations and are generally faster than doubles that need to be boxed and unboxed.

daira commented 11 years ago

That expression needs one extra bit, because in the worst case we have (radix-1) + 2(radix-1)^2 + (radix-1) = 2(radix^2 - radix). However JS numbers can represent integers up to 2^53 exactly -- the fraction is 52 bits, but the significand has an implied 1 bit, with weight determined by the exponent. So that expression won't overflow or lose precision. I don't know where the Miller-Rabin problem is but I don't think it's there.

daira commented 11 years ago

Hmm, I think performance probably depends on the word size of the VM; if the word size is 64 bits, then all exactly representable integers can be unboxed. Some VMs probably use 32-bit words though, for example http://websrv0a.sdu.dk/ups/SCM/slides/lecture_03_mads_ager.pdf says that V8 does (it isn't clear whether this is platform-specific). So you may be right that using bpe >= 16 isn't a performance improvement in that case.

For now I suggest keeping bpe as 15 and pointing to this ticket in the comment next to it.

arlolra commented 11 years ago

Above I wasn't referring to the multiplication but the right shift of L1412.

For example, Math.pow(1<<25, 2) >> 26 === 0

daira commented 11 years ago

IonMonkey has either 32 or 64-bit words depending on platform, but does not appear to unbox integers of size more than 32 bits: https://bitbucket.org/mozilla/projects-ionmonkey/src/5cfb73435e06/js/src/jsval.h#cl-231

daira commented 11 years ago

Oh right, yes, the right shift will truncate.

arlolra commented 11 years ago

In that case, I can't imagine there'd be any performance gains in increasing bpe if we had to substitute with Math.floor(c/radix) or some other monstrosity.

The right thing to do might be to rewrite the library with TypedArrays or wait for BigInts to land in ES6 or 7.

As you said,

For now I suggest keeping bpe as 15 and pointing to this ticket in the comment next to it.

daira commented 11 years ago

Without TypedArrays or BigInts, I don't know of anything faster than:

lo = x & mask
hi = (x-lo) / radix
nadimkobeissi commented 11 years ago

I'm happily watching this from the sidelines and I'm very grateful towards both of you for these discussions. Hopefully this audit will not only improve our understanding of Cryptocat's code, but also will push improvements to related libraries. Thanks so much, you guys!

arlolra commented 11 years ago

Interestingly, that appears to be enough.

https://github.com/arlolra/otr/commit/b4ee37b17ec2e42ef167a4855655ad0eb8ca1789 feels subjectively faster.

I'll make some histograms, https://github.com/arlolra/otr/blob/master/test/plot.R

arlolra commented 11 years ago

This is master, with an average of about 3.74s/key

hist-master

And bpe, at 2.54s/key

hist-bpe

The test suite now runs in ~35s compared to ~50s before. v8 seems to like these changes, with about a 1.4x speedup compared to master. Might be interesting so see what other VMs think.

arlolra commented 11 years ago

@daira I updated the comments in ddd47892fc792dce7d7b533ac7267c74765aeb70.

Above you wrote,

Note that rightShift_ doesn't correctly handle shift distances greater than the bit length of x; I don't know whether that matters.

I added a test for that but it seems ok. Can you clarify?

daira commented 11 years ago

I was mistaken; rightShift_ does work for that case (but it isn't entirely obvious, so the test is a good idea).

arlolra commented 11 years ago

Merged. Thanks for all the help @daira.