kyren / piccolo

An experimental stackless Lua VM implemented in pure Rust
Creative Commons Zero v1.0 Universal
1.67k stars 62 forks source link

fix: be correct so we don't burn from Zs #77

Closed Jengamon closed 3 months ago

Jengamon commented 3 months ago

I caught too many Zs...

luckily the manual had something to say about this:

Unless stated otherwise, any overflow when manipulating integer values wrap around, according to the usual rules of two-complement arithmetic. (In other words, the actual result is the unique representable integer that is equal modulo 2n to the mathematical result, where n is the number of bits of the integer type.)

so here we are

UPDATE

I would be fine with just whatever saturating_add does, even if it doesn't match I don't want to mimic weird overflow behavior

kyren has spoken.

UPDATE UPDATE

too many Zs spoiled the broth.

kyren commented 3 months ago

I would be fine with just whatever saturating_add does, even if it doesn't match I don't want to mimic weird overflow behavior

kyren has spoken.

No, never mind, I was confused about what behavior I was seeing. After trying the tonumber("ZZZZZZZZZZZZZZZZZZZZZ", 36) example with varying number of Zs, you hit a maximum where any number of further Zs always returns -1.

This made me think that "oh, okay, obviously PUC-Rio Lua is computing the sum as an unsigned integer, doing saturating math on it, then converting it back to a signed integer, so it ends up as -1". This would be a silly behavior to mimic, and it seemed reasonable to instead max out at the signed max, rather than the unsigned max.

HOWEVER, that is not what is happening. As it states in the manual, overflowing integers wrap around. But then, how is tonumber("ZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZ", 36) == -1, as well as any increasing number of Zs?

Well, it turns out this operation has a fixed point...

((2^64 - 1) * 36 + 35) % (2^64) == 2^64 - 1

This fixed point is obvious because, in modular arithmetic terms... this is (-1) 36 + 35 == -1, which is not so strange! The coincidence is that mod(sum(35 36^n, n, 0, 31), 2^64) just so happens to land on -1, and once it lands here it stays here. I'm sure there is a deep numerical reason for this, but I'm too tired to go any deeper right now.

So, with this knowledge, wrapping math matches PUC-Rio Lua better, and we should do that.

kyren commented 3 months ago

This fixed point is obvious because, in modular arithmetic terms... this is (-1) 36 + 35 == -1, which is not so strange! The coincidence is that mod(sum(35 36^n, n, 0, 31), 2^64) just so happens to land on -1, and once it lands here it stays here. I'm sure there is a deep numerical reason for this, but I'm too tired to go any deeper right now.

It's not that deep.

Okay, a repeating max digit (like 9999999... or ZZZZZZ....) is always (b^n - 1) for the base b and number of digits n.

We can rewrite this as...

(2 b / 2)^n - 1 2^n (b / 2)^n - 1 2^64 2^(n - 64) (b / 2)^n - 1

So if b has any factor of two (b is even), and n is >= 64, then this equation is 2^64 [an integer] - 1, which is congruent to -1 mod 2^64. 36 is factored as 3 3 2 2, so it has 2^2 as a factor, so the point at which tonumber("ZZZZZZ...", 36) becomes -1 is at 32 or more Zs.