asm-js / validator

A reference validator for asm.js.
Apache License 2.0

Define "asm.js integer literal" once and use the name throughout the spec #57

Closed jruderman closed 10 years ago

jruderman commented 11 years ago
cscott commented 11 years ago

I assumed 1e3 would be a valid way to write the integer 1000. Some minimizers use hacks like this to save space. But you're correct, integer literals should not be allowed to have negative exponents!

curiousdannii commented 11 years ago

See also #53

cscott commented 11 years ago

I verified that the spidermonkey parser accepts 1e3 as an integer literal. FWIW.

cscott commented 11 years ago

In my asm.js validator I use the rule which spidermonkey seems to use, which is: "an asm.js integer numeric literal is a literal written without '.' in the source, whose value is an integer". This allows 1e3, 1e0, and 1e-0, while disallowing 1e-1.
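As a rough illustration of the rule quoted above, here is a minimal sketch in JavaScript. The function name `isAsmIntegerLiteral` is hypothetical and is not taken from any validator or from SpiderMonkey. It also assumes that "value" can be approximated by the double that `Number()` produces, so literals with extreme exponents, which overflow or underflow as doubles, are not classified exactly.

```javascript
// Hypothetical helper (not from any real validator): classifies the source
// text of a decimal numeric literal using the "no '.', integer value" rule.
function isAsmIntegerLiteral(source) {
  // Part 1 of the rule: the literal is written without '.' in the source.
  if (source.includes(".")) return false;
  // This sketch only handles decimal digits with an optional exponent part.
  if (!/^[0-9]+(?:[eE][+-]?[0-9]+)?$/.test(source)) return false;
  // Part 2 of the rule: the value is an integer. Assumption: the double
  // returned by Number() stands in for the literal's value.
  return Number.isInteger(Number(source));
}

// The literals mentioned in this thread:
console.log(["1e3", "1e0", "1e-0", "1e-1", "0e-3"].map((s) => isAsmIntegerLiteral(s)));
// → [ true, true, true, false, true ]
```

Range checks on the resulting integer are left out of this sketch.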

Note issue #67 bears on this as well, if you want to allow negative integer literals.

@jruderman's first point is a good one, though: asm.js requires the tokenizer to preserve the "dotness" of numeric literals. The only (partial) way around this that I can think of is to distinguish double 0 from int 0 by writing the former as +0. Carried to its logical end, though, negative double literals would be written as +(-5), which is a little odd, and there's no particular reason to believe that the leading unary operators won't get constant-folded away in a bytecode representation anyway. It might be worth a sentence in the spec to describe the tokenizer change.

cscott commented 11 years ago

Note that the spidermonkey rule also allows 0e-3 as a valid integer literal.

ghost commented 10 years ago

You're right; the use of . to distinguish doubles from ints does mean the tokenizer has to remember dotness, but (1) that isn't hard, and (2) it avoids a more complex definition that operates in terms of IEEE 754 doubles. It is important to be razor-sharp on this definition and I think the '.' rule achieves this. Also, we can't use normal minifiers on asm.js anyway (they break all sorts of structure). Emscripten has its own minifier which takes advantage of the structure of asm.js and is, consequently, much faster on large codebases. I don't think there is a bug here, so tentatively closing.

ghost commented 8 years ago

On the face of it, this may seem like a minor issue, but it is actually quite pernicious. Observe that if a numeric literal's source contains a negative exponent but no . character, representing it by an integer may introduce a loss in precision. ("May" because there is no such loss for, e.g., 1e-0.) It follows that AOT compilation may affect the observable behaviour of JavaScript code, which is contrary to the intentions of asm.js. Consider, for example, the following valid module:

function M() {
  "use asm"
  function f() {
    return 1e-3
  }
  return f
}

Without AOT compilation, M()() yields 0.001, but with AOT compilation, it yields 0. Hence, a more sophisticated procedure for deciding how to represent a numeric literal is required. In particular, a procedure that satisfies the following conditions would avoid the aforementioned bug:

n resolves to an integer representation     ⇒     MV(n) is an integer
n resolves to a floating point representation     ⇐     the source of n contains a . character

where MV( · ) is defined in section 11.8.3.1 of ECMA-262/6.
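One procedure that would satisfy both conditions can be sketched as follows. This is only an illustration, not the spec's or any engine's algorithm: the names `mvIsInteger` and `resolveRepresentation` are hypothetical, the input is assumed to be the source text of a valid decimal literal, and integer range checks are omitted. The integrality test works on the digits and exponent directly rather than going through a double, so a literal such as 1e-400 is not mistaken for 0.

```javascript
// Hypothetical sketch; these names do not come from the asm.js spec.
// Assumes `source` is the source text of a valid decimal literal.

// Decides whether the mathematical value of a dot-free decimal literal is an
// integer, using only its digits and exponent (no rounding through a double).
function mvIsInteger(source) {
  const m = /^([0-9]+)(?:[eE]([+-]?[0-9]+))?$/.exec(source);
  if (m === null) return false; // not a dot-free decimal literal
  const digits = m[1];
  const exponent = m[2] === undefined ? 0 : parseInt(m[2], 10);
  // Zero times any power of ten is an integer; so is any non-negative exponent.
  if (/^0+$/.test(digits) || exponent >= 0) return true;
  // With a negative exponent, enough trailing zeros must cancel it out.
  const trailingZeros = digits.length - digits.replace(/0+$/, "").length;
  return trailingZeros >= -exponent;
}

// Chooses a representation that satisfies both conditions stated above.
function resolveRepresentation(source) {
  // Second condition: a '.' in the source forces a floating point representation.
  if (source.includes(".")) return "double";
  // First condition: only resolve to an integer when the MV is an integer.
  return mvIsInteger(source) ? "int" : "double";
}

console.log(resolveRepresentation("1e-3")); // → double
console.log(resolveRepresentation("1e-0")); // → int
console.log(resolveRepresentation("1.0"));  // → double
```

Under this sketch the literal 1e-3 from the module above resolves to a floating point representation, so the compiled and interpreted results would agree.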

kripken commented 8 years ago

Nice find, that's a bug that went unnoticed I guess.