Open loiseaujc opened 1 month ago
Great idea @loiseaujc. Rn I'm trying to tackle the decompositions (pseudo-inverse, Cholesky, QR) and I was planning to address norms and condition number next. So, your contribution would be very welcome!
As a way to separate between the two approaches to norm, would there be a way to use the same norm
interface for all, but with different arguments maybe? I was loosely thinking that something like norm(A,2)
vs. norm(A,'Frobenius')
could work.
We probably could. Julia is more exhaustive than SciPy for this. Having norm
and opnorm
allows them to have both the element wise p-norm (for abitrary p) and the induced one (for p=1, 2 and infinity). SciPy is restricted to the standard operator induced norms + the Frobenius one. I think it mostly depends on how exhaustive we want to be (and how often the non-Frobenius element wise norms are actually being used which I think is not much).
The more I think about it, the more it feels like the Julia way may actually cover more use-cases than are actually used in practice. I still like the difference between norm
and opnorm
though just to recall the user that these are fundamentally different mathematical objects. Having the two interfaces might also make it more transparent when a 2D array represent an actual matrix or a collection of vectors instead of having to play with the axis
keyword.
I took some time to think about all the different norms (both for vectors and matrices) I've ever needed in my scientific computing adventure over the past 15 years or so. Here is a pretty standard list (+ whatever scipy
/Julia
are offering).
norm(x, 1)
: 1-norm of a vector (i.e. sum of absolute values),norm(x, 2)
: 2-norm of a vector (i.e. standard Euclidean norm),norm(x, inf)
: $\infty$-norm of a vector (i.e. maximum absolute value),norm(x, p)
: p-norm of a vector.Among these four, the Euclidean 2-norm is by far the most popular and, more often than not, is what people mean when discussing the norm of a vector $\mathbf{x}$.
I've used the other three quite regularly as well but more from a convex optimization perspective. I hardly ever had to actually compute them but they're simple enough to implement and I believe are something everybody expects from any standard linear algebra module.
opnorm(A, 1)
: 1-norm of a matrix (i.e. maximum absolute column sum)opnorm(A, 2)
: 2-norm of a matrix (i.e. largest singular value)opnorm(A, inf)
: $\infty$-norm of a matrix (i.e. maximum absolute row sum)opnorm(A, "nuclear")
: nuclear norm (i.e. sum of singular values)opnorm(A, "fro")
: Frobenius norm.Like everyone, I use the Frobenius norm all the time. Similarly, the 2-norm and the nuclear norm are quite extensively used in model order reduction. As before, the 1-norm and the $\infty$-norm, although I don't quite know where they are being used, are some things everybody would expect from a standard linalg module.
There are a bunch of other matrix-related norms that I have used or seen being used over the years but they are more of a niche thing. These include:
schatten_norm(A, p)
: Schatten p-norm (i.e. the vector p-norm applied to the singular values of A
).lognorm(A, 2)
: Log-norm of a matrix, also known as its numerical abcissa and defined as $\lambda_{\max} ( \frac{\mathbf{A} + \mathbf{A}^H}{2})$.Among these two, the lognorm
is probably the one I've used most often. It is related to the non-normality of the matrix and proves important when studying linear time invariant systems of the form $\dot{x} = Ax + Bu$.
Here are my thoughts (in no particular order) on the matter:
scipy
notation, I kind of prefer the norm
/opnorm
dichotomy as it makes norm(x, 2)
and opnorm(x, 2)
very clear from just reading the code and not having to check the declarations to see if x
is rank-1 or rank-2 array.norm
and opnorm
should return the Euclidean and Frobenius norms by default, respectively.schatten_norm
and lognorm
could be added although they wouldn't be very high in my priority list.is_symmetric(A)
, is_hessenberg(A)
etc dispatching to the specialized solvers. If you know of any better/more fortranic way, I'm all ears.norm
to take only a rank-1 array as argument. This would make less code bloat to handle all the corner cases scipy
is handling (i.e. whether the norm is applied on the whole array, only along the rows, or only the columns). On the other hand, this would force the user to write a for
loop to compute the norm of a collection of vectors represented as a rank-2 array and would be a major departure from the numpy
/scipy
standard.I tend to think that being strict about the rank-1 array argument for norm
is a better scientific/programming practice than what scipy
currently offers. It doesn't leave any room for interpretation. Forcing the user to write a for
loop for a collection of vectors might also make the code more readable and less error-prone (although it may possibly cause some performance loss (?), I don't know).
@perazz : On a side note, do you have a publicly available roadmap for the development of the stdlib_linalg
module? I have seen some discussions on the fortran-lang discourse but nothing centralized. Translating/adapting everything scipy.linalg
is offering is too much for a single person, even a one-man team. I suppose however that we all have bits and pieces scattered around our different code bases (e.g. I have some sketch for schur
, expm
, hankel
, toeplitz
, etc).
@loiseaujc I basically agree with everything you said! I just have one question regarding the naming opnorm
, is it for "operator norm" as to say the matrix is an operator on a vector space? could there be a more "explicit" name to make the distinction? in any case, totally agree that it is good to have 2 well distinguished interfaces. If not, opnorm
is just fine.
Regarding your query about the type of matrix, I think stdlib has in the linalg module https://stdlib.fortran-lang.org/page/specs/stdlib_linalg.html there are several checks such https://stdlib.fortran-lang.org/page/specs/stdlib_linalg.html#is_hermitian-checks-if-a-matrix-is-hermitian, https://stdlib.fortran-lang.org/page/specs/stdlib_linalg.html#is_hessenberg-checks-if-a-matrix-is-hessenberg among others.
I tend to think that being strict about the rank-1 array argument for norm is a better scientific/programming practice than what scipy currently offers. It doesn't leave any room for interpretation. Forcing the user to write a for loop for a collection of vectors might also make the code more readable and less error-prone (although it may possibly cause some performance loss (?), I don't know).
Agree, and I don't think that in Fortran you would have a performance loss (that would be true in python). If the implementations and interfaces are well designed the compilers might even be able to properly vectorize nicely. One thing to consider though, would be to have at least dimensions 2, 3 and 4 with explicit unrolled implementations and then a generic one for d>4.
I've used norm
and opnorm
in the discussion because that is what is being used in Julia. opnorm
indeed stands for operator norm. While using norm
for vectors is pretty obvious, I don't have a strong opinion regarding opnorm
(although I don't have a better alternative to propose).
Thanks @loiseaujc for the detailed comments. I very mostly agree with your ideas and I would summarize the consensus so far:
norm
) and matrix (opnorm
) normsHere is some further thoughts I'm having about the design.
dim
specification. We should find a way to be sure our notation is not easily confused with this. character
input for all norms, instead of a numeric one? I'm saying so because I'd personally like more having to deal with something like norm(x, '2')
, norm(x, 'L2')
, norm(x, 'inf')
rather than having to do norm(x, ieee_value(0.0, ieee_positive_inf)))
. So I think the code would be clear for both the user and the developer: character(*), optional, intent(in) :: norm_type
select case (norm_type)
case ('2','Euclidean')
...
case ('inf','Inf','Infinity')
...
end select
optional
additional flag like is_hessenberg=.true.
or matrix_form='Hessenberg'
is probably the most performant design, because we can't really check if the matrix complies to any special properties at all times, it would be too expensive.
- the Fortran standard has norm2 that works for both arrays and matrices with the optional
dim
specification. We should find a way to be sure our notation is not easily confused with this.
I shall be able to start working on this by the end of next week. I'll start with the baseline implementation and will see from there how we can improve and make sure it is not confounded with norm2
.
- We should find a unique way to express the norm type. For example, could it be that we require a
character
input for all norms, instead of a numeric one? I'm saying so because I'd personally like more having to deal with something likenorm(x, '2')
,norm(x, 'L2')
,norm(x, 'inf')
rather than having to donorm(x, ieee_value(0.0, ieee_positive_inf)))
. So I think the code would be clear for both the user and the developer:character(*), optional, intent(in) :: norm_type select case (norm_type) case ('2','Euclidean') ... case ('inf','Inf','Infinity') ... end select
I like this.
- For the (later) specialized matrix norms, I would think that an
optional
additional flag likeis_hessenberg=.true.
ormatrix_form='Hessenberg'
is probably the most performant design, because we can't really check if the matrix complies to any special properties at all times, it would be too expensive.
Agreed. I prefer the matrix_form = "Hessenberg"
rather than is_hessenberg = .true.
. I may be naïve but the second would incur a lot of optional args (is_hessenberg
, is_symmetric
, is_triangular
, etc) of which only one can be set to .true.
. The matrix_form = string
seems easier to handle and if not specified, opnorm
defaults to the computation using the general matrix approach.
I prefer the
matrix_form = "Hessenberg"
Absolutely agree, and it would be more in line with the norm_type
argument
Here is a draft/hacky implementation of norm
I'd like to discuss a bit before drafting the PR.
#:for rk, rt, ri in RC_KINDS_TYPES
#:if rk !="xdp"
module function stdlib_linalg_norm_${ri}$(x, which) result(out)
!> Input vector x[n]
${rt}$, intent(in) :: x(:)
!> [optional] Which vector norm is being computed.
character(len=*), optional, intent(in) :: which
!> Norm of the vector.
real(${rk}$) :: out
!> Internal variables.
integer(ilp) :: n, idx
character(len=:), allocatable :: which_
!> Dimension of the vector.
n = size(x, kind=ilp)
!> Default: Euclidean 2-norm.
which_ = optval(which, "2") ; which_ = to_lower(which_)
!> Dispatch computation to the appropriate driver.
select case (which_)
case ("2", "euclidean", "l2")
out = nrm2(n, x, 1_ilp)
case ("1", "l1")
#:if rt[0] == "r"
out = stdlib_${rk[0]}$asum(n, x, 1_ilp)
#:elif rt[0] == "c" and rk[0] == "s"
out = stdlib_scsum1(n, x, 1_ilp)
#:elif rt[0] == "c" and rk[0] == "d"
out = stdlib_dzsum1(n, x, 1_ilp)
#:elif rt[0] == "c" and rk[0] == "q"
out = stdlib_qzsum1(n, x, 1_ilp)
#:endif
case ("inf")
#:if rt[0] == "r"
idx = stdlib_i${rk[0]}$amax(n, x, 1_ilp)
#:else
idx = stdlib_i${ri[0]}$max1(n, x, 1_ilp)
#:endif
out = abs(x(idx))
case default
stop
end select
end function stdlib_linalg_norm_${ri}$
#:endif
#:endfor
After having extended the interface for nrm2
for complex-valued vectors, the code compiles correctly. I haven't yet implemented the tests though, but I don't see any obvious errors. I do have a few questions though to polish it and make it more fypp-/stdlib-compliant. There are a couple of things I'm not quite sure how to handle properly:
which
is of type character(len=*)
, we would need to check that it does not correspond to a case already handled by the dispatch, that which
is one character-long (assuming we only consider integer $p$), convert the character to a float
(either single or double) and then compute $\sqrt[p]{\sum \vert x_i \vert^p}$. This seems unnecessarily complicated for a relatively edge case. A simple option however would be to have a dedicated stdlib_linalg_pnorm_${ri}$
function for which which
is actually a non-negative float
or integer
and put it under the same interface as `stdlib_linalgnorm${ri}$. What would be your take on this?fypp
-esque way to handle the different types? The problem I see is that while asum
is defined for both real
and complex
, sum1
only is defined for complex
. I hesitate on whether I should keep the code as it is or if we should create an interface for asum
, as well as one for sum1
which under the hood also calls asum
if x
is real. Basically something like this (don't pay attention to the syntax)interface asum
module function rsasum
module function csasum
...
end interface
and
interface sum1
module function rsasum
module function csum1
...
end interface
asum
for real-valued vectors would basically end up being referenced by two different interfaces. I'm not sure whether this is the cleanest thing to do although it would somehow make some sense from a mathematical point of view.
which
is none of the cases covered by the dispatch (e.g. user made a typo).I'll wait for your feedback at the moment and will start implementing the tests and documentation in the mean time.
A simple option however would be to have a dedicated stdlib_linalgpnorm${ri}$ function for which which is actually a non-negative float or integer and put it under the same interface as `stdlib_linalgnorm${ri}$. What would be your take on this?
This sounds good! another option, if you prefere to keep a single interface:
use stdlib_str2num, only: to_num
use stdlib_error, only: error_stop
...
case default
p = to_num( which_ , p )
if( p > 2 .and. p < p_limit ) then !> define a limit p value
!> compute the p-norm
else
call error_stop( "Invalid p-norm",code )
end if
If you prefere to have an interface that takes an integer p
value also exposed, I would then suggest to make this the base implementation that is also exposed to the user, and that the interface taking the character
"just" handles the conversion and then call the one taking an integer value.
Here is a draft/hacky implementation of
norm
I'd like to discuss a bit before drafting the PR.
looks good so far!
- At the moment, we have the ℓ2 norm (default), the ℓ1 norm as well as the infinity norm. In my original post, I discussed the possibility also to have a generic p-norm. Since
which
is of typecharacter(len=*)
, we would need to check that it does not correspond to a case already handled by the dispatch, thatwhich
is one character-long (assuming we only consider integer p), convert the character to afloat
(either single or double) and then compute ∑|xi|pp. This seems unnecessarily complicated for a relatively edge case. A simple option however would be to have a dedicatedstdlib_linalg_pnorm_${ri}$
function for whichwhich
is actually a non-negativefloat
orinteger
and put it under the same interface as `stdlib_linalgnorm${ri}$. What would be your take on this?
What I like the most is to have a separate function to do the option parsing, so the input could be either integer
or character
, as @jalvesz suggested. Something like:
pure subroutine parse_norm(which, norm_type, err)
character(*), intent(in) :: which
integer(ilp), intent(out) :: norm_type
type(linalg_state_type), intent(out) :: err
select case (to_lower(which))
case ("euclidean","2","l2")
norm_type = NORM_TYPE_2
case ("inf","infinity")
norm_type = NORM_TYPE_INF
case default
p = to_num(which, p)
if (p>2 .and. p<=NORM_TYPE_MAX) then
norm_type = p
else
norm_type = NORM_TYPE_ERROR
err = linalg_state_type("norm",LINALG_VALUE_ERROR,"invalid norm type: ",which)
endif
end select
end subroutine
The rule could be to define parameters that are >1 for p-norms and <=0 for ther norms, i.e.:
integer(ilp), parameter :: NORM_TYPE_ERROR = -huge(0_ilp)
integer(ilp), parameter :: NORM_TYPE_FROB = -1
integer(ilp), parameter :: NORM_TYPE_INF = -2
integer(ilp), parameter :: NORM_TYPE_L1 = 1
integer(ilp), parameter :: NORM_TYPE_L2 = 2
integer(ilp), parameter :: NORM_TYPE_MAX = 100
- Second point has to do with the 1-norm computation. Is there a more
fypp
-esque way to handle the different types? The problem I see is that whileasum
is defined for bothreal
andcomplex
,sum1
only is defined forcomplex
. I hesitate on whether I should keep the code as it is or if we should create an interface forasum
, as well as one forsum1
which under the hood also callsasum
ifx
is real. Basically something like this (don't pay attention to the syntax)
For the BLAS/LAPACK functions that have different names for real/complex types, I've been using inline fypp declarations, i.e.:
- Last point finally is that I'm not sure how to properly handle the case where
which
is none of the cases covered by the dispatch (e.g. user made a typo).
You can look at the linalg_state_type
model, that is the common way that all linear algebra routines handle exceptions, this is typically an optional, intent(out)
argument that is internally handlead by a final call to:
This forces the code to either return an error code (hopefully LINALG_SUCCESS
) to the user, or stop the program on an exception.
Motivation
I'm currently updating my package
LightKrylov
to make use as much as possible of the new linalg features offered in thestdlib_linalg
module. Among the things I'll need eventually are functions to compute matrix norms. Is there any on-going work at the moment from @perazz, @jvdp1, or @jalvesz on this subject or could I make this my first task withstdlib
?Prior Art
scipy.linalg.norm(A, ord=None, axis=None, keepdims=False, check_finite=True)
. It handles both standard vector norms as well as a variety of vector-induced and non-induced matrix norms.LinearAlgebra.norm(x, p)
wherex
is ann
-vector returns the $p$-norm of this vector.LinearAlgebra.norm(A, p)
whereA
is anm x n
matrix returns the "entry-wise" $p$-norm ofA
.LinearAlgebra.opnorm(A, p)
whereA
is anm x n
matrix returns the vector-induced $p$-norm ofA
.Additional Information
While most people might be accustomed to the SciPy standard, my personal preference would still go to the Julia principle of separating the true vector-induced norms (i.e. 1-norm, 2-norm, $\infty$-norm) from the entrywise ones (e.g. Frobenius) although I'm obviously open to discussion.