Open cyanescent opened 3 months ago
The problem shows up even for a singleton array:
julia> lcm([1//2])
1//1
See the implementation:
https://github.com/JuliaLang/julia/blob/40ecf690197a7a6221cfcffd9c74799258fb4495/base/intfuncs.jl#L153
(The problem is with the init
argument; in this case, init
is 1//1
and lcm(1//1, 1//2)
is 1//1
.)
Neither lcm
nor gcd
is documented to accept an array argument by the way. We only have
gcd(x, y...)
Greatest common (positive) divisor (or zero if all arguments are zero). The arguments may be integer and rational numbers.
and
lcm(x, y...)
Least common (positive) multiple (or zero if any argument is zero). The arguments may be integer and rational numbers.
What should the methods return when the input array is empty? Currently,
julia> gcd(Int[])
0
julia> lcm(Int[])
1
What should the methods return when the input array is empty? Currently,
julia> gcd(Int[]) 0 julia> lcm(Int[]) 1
For consistency, we should have gcd(Int[])=0
and lcm(Int[])=Inf
, but Inf is not available for type Integer. Another possibility is gcd(Int[])=0//1
and lcm(Int[])=1//0
, but the type is now Rational. Anyway, for such a degenerate case, I think NaN
is the most sensible definition in both cases.
If #55318 lands, I think we could just remove the init
argument of reduce
for both gcd
and lcm
, in which case gcd([])
and lcm([])
would result in an error. I am not sure if that's (too) breaking.
Here is a correction draft, that would make gcd(Int[])
and lcm(Int[])
error (perhaps not in the most elegant way though better now): #56113.
This is not breaking since the Array argument is still undocumented, and gcd()
and lcm()
also result into an error (so this is coherent). The current behaviour is also quite wrong.
You could also define gcd(T[])
as zero(T)
and lcm(T[])
could throw an error.
I get your point, the inversion symmetry between gcd
and lcm
is broken anyway since there is no Inf
in type Int
.
But then shouldn't gcd()
also return 0
instead of an error?
(also, I don't know any reason for choosing this value rather than NaN
, my reasoning above was only deductive)
(also, I don't know any reason for choosing this value rather than
NaN
, my reasoning above was only deductive)
Answering this question (what are $\text{gcd}$ and $\text{lcm}$ for empty collections of numbers?): The extension to collections of numbers is introduced recursively through associativity: $\text{gcd}(n_1,n_2,\dots,n_r)=\text{gcd}(n_1,\text{gcd}(n_2,\dots,n_r))=\text{gcd}(\text{gcd}(n_1,n2,\dots,n{r-1}),n_r)$ and $\text{lcm}(n_1,n_2,\dots,n_r)=\text{lcm}(n_1,\text{lcm}(n_2,\dots,n_r))=\text{lcm}(\text{lcm}(n_1,n2,\dots,n{r-1}),n_r)$. Thus, we should define the case $r=1$ so that this still holds.
Therefore, gcd([])=0
and perhaps gcd()=0
would result from the common (but somewhat peculiar) convention $\text{gcd}(n_1,0)=n_1$ and the associativity rule.
The consequential rule $\text{lcm}(n_1,\infty)=n_1$ is also defined in julia for type Rational
:
julia> lcm(3,1//0)
3//1
leaning towards lcm([])=1//0
and perhaps lcm()=1//0
, or at least lcm(Rational[])=1//0
(if the output type Rational
is an issue).
You could also define
gcd(T[])
aszero(T)
andlcm(T[])
could throw an error.
In the end, lcm(Rational[])=1//0
should also be defined in this case, leading to the question:
Is lcm(T[])=1//zero(T)
acceptable from a programmatic viewpoint?
lcm(Int[])
should be 1. In this case the range of the function is Int
or the integers and in either case, all elements of the range are multiples of every element of the array (this is true vacuously) so we are looking for the least positive element of the range. That's 1.
In the case of lcm(Rational{Int}[])
, the range is Rational{Int}
or the rationals. Again, all elements of the range are multiples, but in this case there is no least positive element so we should throw.
I don't think it has to throw necessarily, although I'd be fine with that.
My preference would be to use the FTA definition
aka where n = prod(p_i ^ e_i)
, allowing negative exponents over rationals, gcd(n1, n2) = prod(p_i ^ min(e_i1, e_i2))
and lcm(n1, n2) = prod(p_i ^ max(e_i1, e_i2))
which preserves the identity gcd(a, b) * lcm(a, b) = a * b
for positive a, b
by convention gcd, lcm
are defined only up to a unit, but obviously in the rationals all values are unitary so I think it would be a common enough convention to simply make the functions even. together with the common convention that gcd(a, 0) = a
and lcm(a, 0) = 0
and commutativity this fully determines all outputs. I'll just put a list of what this would give over all the examples I've seen scattered over the PRs and mark if any deviate from status quo
lcm(1//2, 1//2) = 1//2
lcm([1//2, 1//2]) = 1//2
lcm(Int[]) = 1
, lcm(Rational[]) = 1//1
gcd(Int[]) = 1
, gcd(Rational[]) = 1//1
prod(Int[]) = 1
gcd(3//7, 6//7) = 3//7
Some more thoughts on this charmingly elusive topic:
all elements of the range are multiples of every element of the array (this is true vacuously)
I have troubles following this point, but I get the idea that shortening the Array argument should monotonously decrease the value of the lcm, hence supporting the current behaviour lcm(Int[])=1
, which is intuitive.
Applying this argument to rationals would yield lcm(Rational{Int}[])=0//1
and gcd(Rational{Int}[])=1//0
.
If we then apply the argument in my previous message about the init case of the recursive extension of gcd/lcm to collections of numbers, we are lead to give up, in the implementation for rationals, the convention $\gcd (n_1, 0 ) = n_1$ (designed to end Euclid's algorithm, in a way, 0 replaces the nonexistent Inf in Int). We would instead embrace the opposite convention gcd(a//b,1//0) = abs(a//b)
(can be intuitive for rationals, I think) and lcm(b//a,0//1) = abs(b//a)
(perhaps less intuitive but satisfies init of the recursive extension and the symmetry $\text{gcd}(\frac{a}{c},\frac{b}{d})\text{lcm}(\frac{c}{a},\frac{d}{b})=1$).
Perhaps we should not try to return the same init value for Int[]
and Rational[]
(if anything is to be returned).
The extension of lcm() and gcd() to both Array and Rational was incorrect. The case of the gcd has been corrected in #34417, but not the lcm: