Open LilithHafner opened 1 year ago
See also #21600.
Regarding a policy for the treatment of integer overflow.
It is easier to make the case for treating unsigned bitstypes as modular values (permitting operations to wraparound either way) than to make a compelling case for treating signed bitstypes as modular values. Many uses of unsigned bitstypes are designed with a reliance on the presence of unchecked wraparound.
It is rare to design an algorithm that requires unchecked wraparound for signed bitstypes. The only common rationale for allowing signed bitstypes to wrap (over or under) is that provides accelerated throughput and the associated simplification of instruction flow is easier to optimize.
Integer Wraparound is #14 on the 2023 CWE list of Top 25 Most Dangerous Software Weaknesses
CWE™ is a community-developed list of software and hardware weakness types. It serves as a common language, a measuring stick for security tools, and as a baseline for weakness identification, mitigation, and prevention efforts. (The Mitre Corporation underwrites CWE™)
I think of Int64
as the 64 least significant bits of a hypothetical BigInt
sign extended.
Therefore, I believe the following equality should hold for all basic integer operations, or they should throw an error or have some warning in their docstring.
op(x::Int64, y::Int64) == op(big(x), big(y)) % Int64
For example, one case where the above equality doesn't work is with ^
:
1 ^ -2 == 1.0
Int64(1) ^ Int64(-2) == 1
big(1) ^ big(-2)
throws a DomainError.
But I guess this is okay because the docstring for ^
says it has special behavior when using Int
literals.
Though I'm not sure why big(1) ^ big(-2)
errors here.
one case where the above equality doesn't work is with ^:
1 ^ -2 == 1.0 Int64(1) ^ Int64(-2) == 1
That's because arguably Integer ^ Integer should in general be Float64 (if not Rational), one of few flaws in Julia (only exception when b of a ^ b is Unsigned, then I can argue for the status quo), for the same reason Integer / Integer is Float64 (the fast approximation of Rational), which is a good decision.
Such a decision would have averted overflows or (costly software) checks (and is only done at parser time for constants), and I would like to see it changed for Julia 2.0 (see other issue on it, and what can be done for 1.x, let's not discuss this here). Insisting on Int result (for Signed b), is arguably faulty math, and I don't think many would rely on that very much. Then we wouldn't need to warn:
https://github.com/tonyhffong/Lint.jl/issues/271
This is one reason I want 2.0, to break (but in a good way), this integer result, most or all would be better off, I don't think ^ is very much used anyway for integers except for Unsigned in number theory and wouldn't be affected.
big(1) ^ big(-2) throws a DomainError.
Since negative power gives a fraction, in general not an integer, why you should do (and only the above for positive power):
julia> BigFloat(1) ^ BigFloat(-2) 1.0
It is easier to make the case for treating unsigned bitstypes as modular values
Right. And since you need to opt into them, work harder, I would supported them not checking for overflows. For (Signed) Integer you could actually have avoided any possibility of overflows, and thus checks. If just e.g. Int32 Int32 -> Int64 (fast; no checks) and Int64 Int64 -> Int128 (not as fast, nor for subsequent math, why I would prefer Int32 the default in Julia, also on 64-bit platforms...).
That makes sense for Julia 2.0, but in general, BigInt
should be better supported. Almost every math function that works with Int
inputs should work with BigInt
inputs. This is definitely not the case now, so when I try and switch from Int
to BigInt
to avoid overflow, I typically end up getting a bunch of other errors.
For simple operations (e.g.
+
,*
,-
) we do modular arithmetic and do not warn on overflow. For complex operations, we sometimes check for intermediate overflow (lcm
,getindex
,binomial
) and sometimes do not, (isapprox
,lcm
)I'd like a clear, documented, policy of when we let overflow happen, when we throw, and when we make sure to avoid it with widen or similar.
The first step is to come up with the policy. Current documentation is quite limited. Ideas?