JuliaLang / julia

The Julia Programming Language
https://julialang.org/
MIT License
45.73k stars 5.48k forks source link

[RFC] What is the integer overflow story in Julia? #50486

Open LilithHafner opened 1 year ago

LilithHafner commented 1 year ago

For simple operations (e.g. +, *, -) we do modular arithmetic and do not warn on overflow. For complex operations, we sometimes check for intermediate overflow (lcm, getindex, binomial) and sometimes do not, (isapprox, lcm)

I'd like a clear, documented, policy of when we let overflow happen, when we throw, and when we make sure to avoid it with widen or similar.

The first step is to come up with the policy. Current documentation is quite limited. Ideas?

stevengj commented 1 year ago

See also #21600.

JeffreySarnoff commented 1 year ago

Regarding a policy for the treatment of integer overflow.

It is easier to make the case for treating unsigned bitstypes as modular values (permitting operations to wraparound either way) than to make a compelling case for treating signed bitstypes as modular values. Many uses of unsigned bitstypes are designed with a reliance on the presence of unchecked wraparound.

It is rare to design an algorithm that requires unchecked wraparound for signed bitstypes. The only common rationale for allowing signed bitstypes to wrap (over or under) is that provides accelerated throughput and the associated simplification of instruction flow is easier to optimize.

Integer Wraparound is #14 on the 2023 CWE list of Top 25 Most Dangerous Software Weaknesses

CWE™ is a community-developed list of software and hardware weakness types. It serves as a common language, a measuring stick for security tools, and as a baseline for weakness identification, mitigation, and prevention efforts. (The Mitre Corporation underwrites CWE™)

nhz2 commented 1 year ago

I think of Int64 as the 64 least significant bits of a hypothetical BigInt sign extended.

Therefore, I believe the following equality should hold for all basic integer operations, or they should throw an error or have some warning in their docstring.

op(x::Int64, y::Int64) == op(big(x), big(y)) % Int64

For example, one case where the above equality doesn't work is with ^:

1 ^ -2 == 1.0 Int64(1) ^ Int64(-2) == 1 big(1) ^ big(-2) throws a DomainError.

But I guess this is okay because the docstring for ^ says it has special behavior when using Int literals.

Though I'm not sure why big(1) ^ big(-2) errors here.

PallHaraldsson commented 1 year ago

one case where the above equality doesn't work is with ^:

1 ^ -2 == 1.0 Int64(1) ^ Int64(-2) == 1

That's because arguably Integer ^ Integer should in general be Float64 (if not Rational), one of few flaws in Julia (only exception when b of a ^ b is Unsigned, then I can argue for the status quo), for the same reason Integer / Integer is Float64 (the fast approximation of Rational), which is a good decision.

Such a decision would have averted overflows or (costly software) checks (and is only done at parser time for constants), and I would like to see it changed for Julia 2.0 (see other issue on it, and what can be done for 1.x, let's not discuss this here). Insisting on Int result (for Signed b), is arguably faulty math, and I don't think many would rely on that very much. Then we wouldn't need to warn:

https://github.com/tonyhffong/Lint.jl/issues/271

This is one reason I want 2.0, to break (but in a good way), this integer result, most or all would be better off, I don't think ^ is very much used anyway for integers except for Unsigned in number theory and wouldn't be affected.

big(1) ^ big(-2) throws a DomainError.

Since negative power gives a fraction, in general not an integer, why you should do (and only the above for positive power):

julia> BigFloat(1) ^ BigFloat(-2) 1.0

It is easier to make the case for treating unsigned bitstypes as modular values

Right. And since you need to opt into them, work harder, I would supported them not checking for overflows. For (Signed) Integer you could actually have avoided any possibility of overflows, and thus checks. If just e.g. Int32 Int32 -> Int64 (fast; no checks) and Int64 Int64 -> Int128 (not as fast, nor for subsequent math, why I would prefer Int32 the default in Julia, also on 64-bit platforms...).

nhz2 commented 1 year ago

That makes sense for Julia 2.0, but in general, BigInt should be better supported. Almost every math function that works with Int inputs should work with BigInt inputs. This is definitely not the case now, so when I try and switch from Int to BigInt to avoid overflow, I typically end up getting a bunch of other errors.