Open christoph-conrads opened 3 years ago
For simultaneous bidiagonalization of the input matrices, xORCSD2BY1 calls xORBDB2 and xORBDB3 and the out-of-bounds array accesses occur within these bidiagonalization subroutines. According the xORBDB{2,3} documentation, the input causing the misbehavior is allowed. Consequently, a fix must modify xORBDB{2,3}.
I see two possibilities for fixing this issue:
I am in favor of the latter approach because to the best of my knowledge, all LAPACK routines can handle input of dimension zero (whether this is along one axis or several axes). Hence making xORBDB{2,3} reject certain inputs of dimension zero would violate the principle of least astonishment.
I see two possibilities for fixing this issue:
* xORCSD2BY1 checks for the special case where one submatrix has no rows and the other submatrix is square, xORBDB{2,3} are updated (documentation and code) to disallow such input; * xORBDB{2,3} are fixed to handle any valid input.
I am in favor of the latter approach because to the best of my knowledge, all LAPACK routines can handle input of dimension zero (whether this is along one axis or several axes). Hence making xORBDB{2,3} reject certain inputs of dimension zero would violate the principle of least astonishment.
Consider the constraints on the input dimensions, see sorbdb2.f
starting at line 64:
m
, n
, p
integersm ≥ 0
is the number of rows of [X1; X2]
q
is the number of columns of [X1; X2]
subject to 0 ≤ q ≤ m
p
is the number of rows of X1
subject to 0 ≤ p ≤ min(m - p, q, m - q)
.([X1; X2]
is Matlab-like notation and means the rows of X1
on top of the rows of X2
and X1
, X2
having the same number of columns.)
Have a look at the loop with the out-of-bounds access: https://github.com/Reference-LAPACK/lapack/blob/2dafa3d2756a7825c23a8c8456781561e36668ae/SRC/sorbdb2.f#L317-L322
If M = Q
and I = Q
in the last iteration, then X21(I+1,I) = X21(M + 1, 1)
in the call to SLARFGP:
CALL SLARFGP( 1, X21(M, M), X21(M+1, M), 1, TAUP2(M) )
SLARFGP
then compute in line 144 the Euclidean norm of the vector of dimension zero starting at X21(M + 1, M)
. snrm2
correctly infers the norm to be zero (line 125) without referencing X21(M + 1, 1)
. Based on the norm, slarfgp
will then continue to compute the elementary reflector without referencing X21(M + 1, 1)
in lines 144 to 164:
@weslleyspereira Like I wrote #406, this is another -fcheck=all
false positive.
Hi @christoph-conrads. Thanks for the complete description! This is very helpful!
I understood this is a false positive, and that SLARFGP
doesn't use X21(I+1,I)
when N=1
. Thanks! So, in practice, nothing unexpected happens when we turn off the flag. (all the tests pass when we turn off -fcheck=all
). Equally, SLARF
will not use X21(I,I+1)
, since tau=0
. However, I am in favor of continue using some flags, like -fcheck=all
, so as to avoid true out-of-bounds (and other) errors.
I propose the following change. Replace:
by:
DO I = P + 1, MIN( Q, M - P - 1 )
CALL SLARFGP( M-P-I+1, X21(I,I), X21(I+1,I), 1, TAUP2(I) )
X21(I,I) = ONE
CALL SLARF( 'L', M-P-I+1, Q-I, X21(I,I), 1, TAUP2(I),
$ X21(I,I+1), LDX21, WORK(ILARF) )
END DO
*
DO I = M - P, Q
TAUP2(I) = ZERO
X21(I,I) = ONE
END DO
With this change, all tests pass with -fcheck=all
.
I can submit a PR with the fix if everything is ok.
I propose the following change. [snip]
DO I = P + 1, MIN( Q, M - P - 1 ) CALL SLARFGP( M-P-I+1, X21(I,I), X21(I+1,I), 1, TAUP2(I) ) X21(I,I) = ONE CALL SLARF( 'L', M-P-I+1, Q-I, X21(I,I), 1, TAUP2(I), $ X21(I,I+1), LDX21, WORK(ILARF) ) END DO * DO I = M - P, Q TAUP2(I) = ZERO X21(I,I) = ONE END DO
X21(I, I)
is nonnegative for otherwise must TAUP2(I, I) = 2
, cf. SRC/slarfgp.f
, line 146+. I do not see why this should be the case.X11
is square and X21
has no rows.SRC/sorbdb2.f
, line 279, will produce out-of-bounds accesses because of the references X11(I+1, I)
, X11(I, I+1)
(if P = Q
), and X11(I+1, I+1)
.M - P = Q
.Thanks, @christoph-conrads!
I agree with (1). I will prepare a PR including your correction for negative X21(I, I)
. I will also look at SORBDB2 and SORBDB3, thanks.
Description
The 2x1 CS decomposition decomposes a matrix
Q = [Q1; Q2]
, whereQ^* Q = I
. There are two possibilities for the CS decompositionUΣV = Q1
in the special case whereQ2
has no rows:U = Q1
,Σ = I
,V = I
,U = I
,Σ = I
,V = Q1
.That is, the only thing to do by the algorithm is copying the input matrix into U or V and set the other matrix to the identity matrix.
If
Q1
orQ2
has no rows andthen SORCSD2BY1 performs out-of-bounds array accesses. I assume the same behavior occurs also for DORCSD2BY1, CUNCSD2BY1, and ZUNCSD2BY1. My expectation is that xORCSD2BY1 does not perform out-of-bounds accesses and returns one of the two possible decompositions shown above.
Output:
Checklist