adon-at-work opened this issue 9 years ago
Good catch!
This is not covered by the current tests. Changing `++delta > maxInt` into `++delta == 0` still passes all tests in the suite. I'm struggling to remember why I wrote it this way. It may have been based on one of the other C implementations, but I'm not sure.
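For anyone skimming the thread, here's a simplified sketch of where the check in question sits in the encoder's main loop. The helper name is made up for illustration; `maxInt`, `delta`, `n`, and `input` mirror punycode.js, but this is not the exact source:

```js
// Simplified sketch of the overflow check inside a Punycode encoder loop.
// Illustrative only, not the exact punycode.js source; surrounding state
// handling is omitted.
const maxInt = 2147483647; // the value punycode.js uses (2^31 - 1)

function bumpDeltaForPass(input, n, delta) {
	for (const currentValue of input) {
		if (currentValue < n) {
			++delta;
			// punycode.js compares against an explicit maximum:
			if (delta > maxInt) throw new RangeError('overflow');
			// The C sample in RFC 3492 instead relies on unsigned 32-bit
			// wraparound, roughly: if (++delta == 0) return punycode_overflow;
			// which only fires once delta wraps past 0xFFFFFFFF back to 0.
		}
	}
	return delta;
}
```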
I took a quick look at the 5 implementations mentioned in your README. Here's a quick summary, after a very superficial look, of whether `++delta` is checked for overflow.
The following skipped the check (they might have made some assumptions that I've not drilled into):
The following maintained the check (`++delta == 0`, then overflow):
The remaining JS implementations reached a "consensus" to check `++delta == maxInt` (it seems we lack a JS version that aligns with the spec??? I'm not sure...):
Would you mind sharing any insights on how this will be followed up? A deep dive into Punycode? Drawing in insights from my colleagues or more reviewers? Please let me know if I can be of any help, maybe... :)
@adon-at-work I really appreciate the work you’re doing here. Nice job!
It would really help to have a test case that fails with the one behavior but not the other.
Here's a sample string (this particular example may not make sense for DNS :) ) which will trigger your `++delta > maxInt` condition to throw an exception.
```js
// Build 1950 copies of U+0080, then append U+368D (13965) and U+10FFFF.
var arr = [], i;
for (i = 0; i < 1950; i++) { arr[i] = 0x80; }
// Throws an overflow error with the `++delta > maxInt` check in place.
punycode.toASCII(String.fromCodePoint.apply(null, arr.concat([13965, 0x10FFFF])));
```
Yet, this string would not trigger the C version to overflow. That's because the `++delta == 0` there in C is testing for unsigned integer overflow, i.e., checking whether the incremented delta has exceeded the maximum value an unsigned 4-byte integer can hold (i.e., 0xFFFFFFFF or 4294967295) and wrapped around to 0.
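A quick way to see the wraparound that the C check relies on, simulated in JavaScript with `>>> 0` (this is an illustration, not code from either implementation):

```js
// C's delta is an unsigned 32-bit integer (on typical platforms), so
// incrementing 0xFFFFFFFF wraps around to 0. JavaScript Numbers don't wrap,
// but `>>> 0` truncates to unsigned 32 bits and can simulate the behavior.
const wrapped = (0xFFFFFFFF + 1) >>> 0;
console.log(wrapped === 0); // true — this is the condition `++delta == 0` detects in C
```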
I looked deeper into your encoder implementation. The number representation in JS is a bit "different/weird" in the sense that it evaluates `4294967295 + 1 === 4294967296` as `true`, so `++delta === maxInt` can be considered equivalent to `++delta == 0` in C.
But then, maybe the question now becomes: why was maxInt set as `maxInt = 2147483647, // aka. 0x7FFFFFFF or 2^31-1` (https://github.com/bestiejs/punycode.js/blob/master/punycode.js#L26) and not `0xFFFFFFFF`?
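One way to see why a direct port of the C check would never fire in JavaScript (a small illustration, not code from either implementation):

```js
// JavaScript Numbers are IEEE 754 doubles: incrementing past 0xFFFFFFFF does
// not wrap around, so a ported `++delta == 0` check could never trigger.
let delta = 4294967295;   // 0xFFFFFFFF
console.log(++delta);     // 4294967296 — no wraparound
console.log(delta === 0); // false, so a `== 0` overflow test is useless here
// Hence the JS encoder compares against an explicit upper bound (maxInt);
// the open question in this thread is which bound that should be.
```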
There are no "Integers" in JavaScript, only (floating point) "Numbers".
Only for bit operations are these floating point numbers converted into signed(!) 32-bit integers, and converted back into floating point after the calculation. The one exception is the `>>>` operator, which returns an "unsigned 32-bit integer"; that's why `>>> 0` is a common idiom to "convert" a negative signed number into its unsigned counterpart.
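A small demonstration of that behavior (illustrative only):

```js
// Bitwise operators coerce to signed 32-bit integers...
console.log(0xFFFFFFFF | 0);   // -1 (interpreted as signed 32-bit)
// ...except `>>>`, which coerces to an unsigned 32-bit integer.
console.log(-1 >>> 0);         // 4294967295 (0xFFFFFFFF)
console.log(4294967296 >>> 0); // 0 — values are reduced modulo 2^32
```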
Hi,
I find that the JS implementation of the `toASCII()` encoder here closely resembles the C sample implementation in the RFC 3492 specification, except for the following:
In https://github.com/bestiejs/punycode.js/blob/702cc2daf10a5b97454666069e146d04745af728/punycode.js#L409,
In https://tools.ietf.org/html/rfc3492#page-27,
In particular, the JS version checks whether `++delta` is larger than `maxInt`, whereas the C version checks whether it's equal to `0`. Is this intended, or have I missed anything when reading your code? Any clues?
Thanks!