Closed JustinSGray closed 1 year ago
I don't believe this is surprising. `rank` is not well defined numerically, so different BLAS libraries can give different results.
`rank` appears to have a precise mathematical definition in its docstring. I imagine that implementing that definition exactly might be computationally difficult or impractical, though. As `rank` is currently documented, I would call this a bug.
@JustinSGray Could you please post the results of

```julia
(minimum(svdvals(A)), 3*eps(eltype(A)))
```

for `A = arc_points`? On my Mac, I get that the first divided by the second yields 4.5.
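For readers following along, the comparison being requested can be reproduced like this. The original `arc_points` data is not included in the thread, so the matrix below is a hypothetical stand-in: three nearly collinear points in homogeneous coordinates, chosen only to produce a similarly tiny smallest singular value.

```julia
using LinearAlgebra

# Hypothetical stand-in for the reporter's `arc_points`: a nearly
# rank-deficient 3×3 matrix (three almost-collinear points).
A = [0.0 0.0      1.0;
     0.5 0.5      1.0;
     1.0 1.0+1e-15 1.0]

σ = svdvals(A)                # singular values, sorted in decreasing order
tol = 3 * eps(eltype(A))      # the tolerance discussed above

@show minimum(σ), tol
@show minimum(σ) / tol        # this ratio is what differs across OS/BLAS
```

The smallest singular value here sits near the noise floor of Float64, which is exactly the regime where different BLAS implementations can disagree about its magnitude.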
On Windows, I get:

```julia
(2.5448910415241994e-14, 6.661338147750939e-16)

julia> first/second
38.20390115435632
```

On Linux, I get `4.500860108262188`.
This is just a case of floating-point differences across CPUs/OSes. Both values are very close to 0 there, and something like `rref` from RowEchelon says the matrix is rank 3, even on Linux for me.
Functions returning integers don't have to have floating-point rounding errors, because the mathematically correct answer is always either representable as an integer or too large, in which case an error is thrown. `rank` is one of these functions, and its semantics could, in theory, be implemented in a floating-point-error-free way (e.g. using extended precision). Every input to `rank` has a single mathematically correct and exactly representable answer (though it might be hard to find due to #48098), and we do not always produce that answer.

It would be great if someone could figure out how to efficiently produce the correct answer, but failing that, this limitation should be documented.
@LilithHafner this isn't practically true. `rank` just isn't an especially meaningful function in a numerical linear algebra context. For example, in exact arithmetic, `rank([2 3; .2 .3])` is 2 (since `.2*3/2` isn't exactly `.3`), but that doesn't make 2 a useful answer here. Computing the "exact" rank requires exponentially many bits (so it's impossible for matrices larger than roughly 30×30) and isn't a useful concept anyway.
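The exact-arithmetic point above can be checked directly. This is a small illustrative snippet, not part of the original thread: in Float64 the product `.2*3/2` rounds away from `.3`, while with exact rationals the rows really are proportional.

```julia
# In Float64, rounding makes the rows of [2 3; .2 .3] linearly independent:
@show 0.2 * 3 / 2          # 0.30000000000000004
@show 0.2 * 3 / 2 == 0.3   # false → the exact rank of the Float64 matrix is 2

# With exact rational arithmetic the second row is exactly 1/10 of the first,
# so the mathematical rank of [2 3; 2//10 3//10] is 1:
@show 2//10 * 3 // 2 == 3//10   # true
```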
This is the useful and mathematically unambiguous definition according to the docstring:

> Compute the rank of a matrix by counting how many singular values of A have magnitude greater than max(atol, rtol*σ₁) where σ₁ is A's largest singular value.

It is mathematically unambiguous because we can think of `rtol*σ₁`, `σ₁`, and the other singular values as real numbers rather than as floating-point representations of real numbers. It is useful because, according to these semantics, `rank([2 3; .2 .3]) = 1`.
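The documented definition can be transcribed almost literally into code. The sketch below is illustrative, not the actual implementation; it mirrors what `LinearAlgebra.rank` does up to the accuracy of `svdvals` and the exact default-tolerance rules, and the default `rtol` shown here is an assumption based on the docstring.

```julia
using LinearAlgebra

# A literal transcription of the documented definition of `rank`:
# count singular values with magnitude greater than max(atol, rtol*σ₁).
function docstring_rank(A; atol = 0.0,
                        rtol = min(size(A)...) * eps(real(float(one(eltype(A))))))
    σ = svdvals(A)                    # sorted in decreasing order
    tol = max(atol, rtol * first(σ))  # first(σ) is σ₁, the largest value
    count(>(tol), σ)
end

docstring_rank([2 3; .2 .3])   # 1: the tiny second singular value is below tol
```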
Ah, sorry, I messed it up. You should look at (i) the maximum singular value, (ii) `3eps()`, and (iii) the minimal singular value. Can you post those three numbers for each OS? On my Mac, (i)*(ii) is 10 times as big as (iii), so the (numerical) rank is pretty clearly 2 for those results. So, something must be different by an order of magnitude on Windows?
In any case, you have a very ill-conditioned matrix (in Float64) to begin with:

```julia
julia> cond(arc_points)
1.2126687153030292e16  # in particular > inv(eps())
```

so it may be generally very hard to trust (almost) any computation (in Float64) with this matrix.
It's an order-of-magnitude difference, but it's not that large a difference: the smallest singular value they're getting is 2.5448910415241994e-14 while I get 2.9981751136857335e-15. It's almost 10× smaller, but both are very close to 0, so it's not an unexpected result.
Well, I think it depends on the context. An error of an order of magnitude may be very bad. But the specific context here is the maximal singular value, and relative to it, the difference is below machine precision.
I'm closing this because I don't think there is anything to fix/actionable here (including documentation).
Oh, I agree an error of that magnitude is very bad. But that's the life of ill-conditioned matrices and finite precision :). Closing seems correct.
@dkarrasch, we currently document mathematically precise semantics and do not implement them. Would you clarify why you think that is not an issue?
Unless I'm mistaking what you mean, we follow it. It's just that SVD isn't an exact process.
One actionable takeaway is to add a note/warning that the result of `rank` can differ across architectures, OSes, and BLAS libraries due to differences in the association of floating-point operations.
> It's just that SVD isn't an exact process.

SVD is an exact process. For example, the singular values of

```
1 0 0 0 2
0 0 3 0 0
0 0 0 0 0
0 2 0 0 0
```

are exactly 3, √5, and 2.
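For what it's worth, that example can be verified numerically; this check is not from the thread, but `svdvals` recovers those exact values to machine precision:

```julia
using LinearAlgebra

# The example matrix from above; its exact singular values are 3, √5, 2, and 0.
M = [1.0 0.0 0.0 0.0 2.0;
     0.0 0.0 3.0 0.0 0.0;
     0.0 0.0 0.0 0.0 0.0;
     0.0 2.0 0.0 0.0 0.0]

σ = svdvals(M)   # sorted in decreasing order; ≈ [3.0, √5, 2.0, 0.0]
```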
While the `Matrix{Float64}` implementation of `svdvals` cannot be exact because it returns a `Vector{Float64}`, someone who is not familiar with the interaction between extended precision and singular value decomposition might assume that `rank` is implemented exactly according to the thresholds indicated in its docstring. Given that

a) the docstring specifies a sensible, single, unambiguous answer for all inputs
b) it is possible to compute that correct answer
c) we do not always compute that correct answer (or even something ≈ to it)
d) it requires a deep understanding of computational linear algebra to know that always computing the correct answer requires exponentially many bits

I think it is worth documenting c.
I'm not sure I follow the discussion. Computing the difference of two numbers is also an exact process, yet what you compute may be (almost) complete garbage or may have very few correct digits. That fact is completely independent of the programming language: it is due to finite precision, and what exactly the garbage part of the result looks like may depend on all sorts of things (though it's perhaps standardized in this case). The same applies to solving ill-conditioned/ill-posed linear systems. Documenting genuine issues with finite-precision arithmetic may look like we as the Julia community take responsibility for that. Also, where would we start?
The concrete problem at hand is that `rank` uses information from the maximal and the minimal singular value. The latter, relative to the former, is below the representable (relative) precision, and should perhaps be considered complete garbage. Apparently, we cannot even tell its order of magnitude, not to mention the correctness of any digit! How reliable can the result of a function be whose "input" is complete garbage?
If we document accuracy issues, then it should be on `svdvals`, because `rank` follows the docstring exactly AFAICT.
I wouldn't worry about documentation making it look like we're at fault for numerical discrepancies; I think it's much more likely that people will reach that conclusion if we don't document it. We can document it in a way that makes clear that this is due to different hardware and cannot be fixed without sacrificing major performance. We could link to this SO answer for more details: https://stackoverflow.com/a/74535127/659248.
I also don't think there's any harm in repeating such warnings to some extent, so we can put a warning on both `svdvals` and `rank`. But I think that getting different results for `rank` on different systems is much more surprising, because it feels like a discrete computation but is actually laundering its compute through a bunch of very complex floating-point machinery that is highly system-dependent. That's why I think having the warning on `rank` specifically would be warranted.
> `rank` follows the docstring exactly AFAICT.

This would be true if the `rank` docstring referred to the output of `svdvals(A)`, but it refers to the singular values of A, which are exact theoretical values.
FWIW, I have a modest familiarity with linear algebra and floating-point issues generally. From my perspective, a note about the practical implementation using `svdvals` and its associated potential for numerical instability on nearly singular matrices would have meant I was not surprised that we got different ranks on different architectures for the data I gave it here.

So I agree that a note about potential hardware/OS discrepancies on highly rank-deficient matrices would be helpful, and I think a reasonable resolution.
I think there is some consensus that it would be nice to document this limitation of `rank` and possibly also of `svdvals`.
> rank appears to have a precise mathematical definition in its docstring.

It's not true, right? I'm reading the output of `?rank` and it says:

> Compute the rank of a matrix by counting how many singular values of A have magnitude greater than max(atol, rtol*σ₁)

Nowhere in the documentation does it say that it returns the mathematical definition of the rank. It simply counts how many singular values are greater than some other value. It also documents that the `rank` function has a notion of "precision"; consider the following example:
```julia
julia> matrix = [1.0 0.0; 0.0 1e-16]
2×2 Matrix{Float64}:
 1.0  0.0
 0.0  1.0e-16

julia> rank(matrix)
1
```
This is clearly false from a mathematical point of view, because `iszero(1e-16) === false`, but it is still a correct result from a limited-precision point of view (and a similar example is documented as well).
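The tolerance dependence in that example can be made explicit via `rank`'s keyword arguments (`atol` and `rtol`, which `rank` accepts in current Julia); a small check, not from the original thread:

```julia
using LinearAlgebra

matrix = [1.0 0.0; 0.0 1e-16]

# With the default tolerances, 1e-16 falls below rtol*σ₁ and is not counted:
@show rank(matrix)            # 1

# Disabling the relative tolerance counts every strictly positive value:
@show rank(matrix, rtol = 0)  # 2
```

So the integer that `rank` returns is a statement about a threshold, not about exact mathematical rank, and the threshold itself is user-adjustable.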
> but it refers to the singular values of A which are exact theoretical values.

Singular values of A are exact theoretical values, but the documented version of the `rank` function does explicitly say that it simply counts "non-zero" singular values with a given precision.
I agree that a more explicit warning on `rank` may be helpful. I wouldn't put one on `svdvals`.
People are used to the idea that any floating-point computation involving theoretically continuous outputs, from complicated things like `svdvals` and `qr` to simple things like `sum` and `sqrt`, generally produces inexact results, and that the precise outputs may depend slightly on architecture/compiler/implementation details. So it makes no sense to me to put a "warning" only on a particular function like `svdvals`.
However, people tend to fall into a false sense of security that any function with integer outputs is necessarily exact. So I think it's useful to remind people that, although `rank` produces an integer output, the result is a count of inexact floating-point singular values above some threshold and hence can vary between architectures, or even between Julia releases, if you have an ill-conditioned matrix with singular values close to the threshold.
In numerical linear algebra, what `rank(A)` is computing is commonly referred to as the numerical rank rather than the "rank" per se; e.g., Golub & Van Loan write:
Indeed, it might be useful to have an expanded API (e.g. `rankerr` or `ranktol`) that returns both a numerical rank and an "error bar" based on the tolerance δ and the number of singular values close to δ. On the other hand, anyone who knows enough to use such a function might as well just look at `svdvals` directly.
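One way such an API could look is sketched below. This is purely hypothetical: the name `ranktol`, its signature, and the `band` parameter are all invented here to illustrate the "rank plus error bar" idea, and nothing like it exists in LinearAlgebra.

```julia
using LinearAlgebra

# Hypothetical sketch of the `ranktol` idea: return the numerical rank
# together with a count of singular values within a factor `band` of the
# threshold, i.e. values whose above/below-threshold status is not robust.
function ranktol(A; atol = 0.0,
                 rtol = min(size(A)...) * eps(real(float(one(eltype(A))))),
                 band = 10.0)
    σ = svdvals(A)
    tol = max(atol, rtol * first(σ))
    r = count(>(tol), σ)
    borderline = count(s -> tol / band < s < tol * band, σ)
    (rank = r, borderline = borderline)
end

ranktol([1.0 0.0; 0.0 1e-16])   # rank 1; the 1e-16 value is borderline
```

A nonzero `borderline` count would signal exactly the situation in this issue: a rank that may flip between platforms.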
@bvdmitri, I believe that three definitions are at play and confusing this discussion:

1) The dimension of the vector space spanned by the column vectors
2) The number of exact singular values that exceed a certain threshold
3) The number of outputs of `svdvals` that exceed a certain threshold

`rank` is documented to follow the second definition and actually follows the third. 1 and 2 are both precise mathematical definitions. When I refer to the "precise mathematical definition in its docstring", I am referring to definition 2.
Regarding documenting this, I agree with @stevengj that the core of the issue is that returning an integer implicitly (and in this case falsely) guarantees an exact answer. IIUC, all other operations in Base or LinearAlgebra that produce integral results are supposed to produce exact results, with the exception of the clearly documented `rationalize`.
test script:
Windows Result:
Linux Result: