fortran-lang / stdlib

Fortran Standard Library
https://stdlib.fortran-lang.org
MIT License
1.02k stars 161 forks source link

Adding matrix norms to the `stdlib_linalg` module. #820

Open loiseaujc opened 1 month ago

loiseaujc commented 1 month ago

Motivation

I'm currently updating my package LightKrylov to make use as much as possible of the new linalg features offered in the stdlib_linalg module. Among the things I'll need eventually are functions to compute matrix norms. Is there any on-going work at the moment from @perazz, @jvdp1, or @jalvesz on this subject or could I make this my first task with stdlib ?

Prior Art

Additional Information

While most people might be accustomed to the SciPy standard, my personal preference would still go to the Julia principle of separating the true vector-induced norms (i.e. 1-norm, 2-norm, $\infty$-norm) from the entrywise ones (e.g. Frobenius) although I'm obviously open to discussion.

perazz commented 1 month ago

Great idea @loiseaujc. Rn I'm trying to tackle the decompositions (pseudo-inverse, Cholesky, QR) and I was planning to address norms and condition number next. So, your contribution would be very welcome!

As a way to separate between the two approaches to norm, would there be a way to use the same norm interface for all, but with different arguments maybe? I was loosely thinking that something like norm(A,2) vs. norm(A,'Frobenius') could work.

loiseaujc commented 1 month ago

We probably could. Julia is more exhaustive than SciPy for this. Having norm and opnorm allows them to have both the element wise p-norm (for abitrary p) and the induced one (for p=1, 2 and infinity). SciPy is restricted to the standard operator induced norms + the Frobenius one. I think it mostly depends on how exhaustive we want to be (and how often the non-Frobenius element wise norms are actually being used which I think is not much).

The more I think about it, the more it feels like the Julia way may actually cover more use-cases than are actually used in practice. I still like the difference between norm and opnorm though just to recall the user that these are fundamentally different mathematical objects. Having the two interfaces might also make it more transparent when a 2D array represent an actual matrix or a collection of vectors instead of having to play with the axis keyword.

loiseaujc commented 1 month ago

I took some time to think about all the different norms (both for vectors and matrices) I've ever needed in my scientific computing adventure over the past 15 years or so. Here is a pretty standard list (+ whatever scipy/Julia are offering).

Vector norms

Among these four, the Euclidean 2-norm is by far the most popular and, more often than not, is what people mean when discussing the norm of a vector $\mathbf{x}$.

I've used the other three quite regularly as well but more from a convex optimization perspective. I hardly ever had to actually compute them but they're simple enough to implement and I believe are something everybody expects from any standard linear algebra module.

Matrix norms

Like everyone, I use the Frobenius norm all the time. Similarly, the 2-norm and the nuclear norm are quite extensively used in model order reduction. As before, the 1-norm and the $\infty$-norm, although I don't quite know where they are being used, are some things everybody would expect from a standard linalg module.

There are a bunch of other matrix-related norms that I have used or seen being used over the years but they are more of a niche thing. These include:

Among these two, the lognorm is probably the one I've used most often. It is related to the non-normality of the matrix and proves important when studying linear time invariant systems of the form $\dot{x} = Ax + Bu$.


Here are my thoughts (in no particular order) on the matter:

I tend to think that being strict about the rank-1 array argument for norm is a better scientific/programming practice than what scipy currently offers. It doesn't leave any room for interpretation. Forcing the user to write a for loop for a collection of vectors might also make the code more readable and less error-prone (although it may possibly cause some performance loss (?), I don't know).

@perazz : On a side note, do you have a publicly available roadmap for the development of the stdlib_linalg module? I have seen some discussions on the fortran-lang discourse but nothing centralized. Translating/adapting everything scipy.linalg is offering is too much for a single person, even a one-man team. I suppose however that we all have bits and pieces scattered around our different code bases (e.g. I have some sketch for schur, expm, hankel, toeplitz, etc).

jalvesz commented 1 month ago

@loiseaujc I basically agree with everything you said! I just have one question regarding the naming opnorm, is it for "operator norm" as to say the matrix is an operator on a vector space? could there be a more "explicit" name to make the distinction? in any case, totally agree that it is good to have 2 well distinguished interfaces. If not, opnorm is just fine.

Regarding your query about the type of matrix, I think stdlib has in the linalg module https://stdlib.fortran-lang.org/page/specs/stdlib_linalg.html there are several checks such https://stdlib.fortran-lang.org/page/specs/stdlib_linalg.html#is_hermitian-checks-if-a-matrix-is-hermitian, https://stdlib.fortran-lang.org/page/specs/stdlib_linalg.html#is_hessenberg-checks-if-a-matrix-is-hessenberg among others.

I tend to think that being strict about the rank-1 array argument for norm is a better scientific/programming practice than what scipy currently offers. It doesn't leave any room for interpretation. Forcing the user to write a for loop for a collection of vectors might also make the code more readable and less error-prone (although it may possibly cause some performance loss (?), I don't know).

Agree, and I don't think that in Fortran you would have a performance loss (that would be true in python). If the implementations and interfaces are well designed the compilers might even be able to properly vectorize nicely. One thing to consider though, would be to have at least dimensions 2, 3 and 4 with explicit unrolled implementations and then a generic one for d>4.

loiseaujc commented 1 month ago

I've used norm and opnorm in the discussion because that is what is being used in Julia. opnorm indeed stands for operator norm. While using norm for vectors is pretty obvious, I don't have a strong opinion regarding opnorm (although I don't have a better alternative to propose).

perazz commented 1 month ago

Thanks @loiseaujc for the detailed comments. I very mostly agree with your ideas and I would summarize the consensus so far:

Here is some further thoughts I'm having about the design.

character(*), optional, intent(in) :: norm_type
select case (norm_type)
   case ('2','Euclidean')
      ...
   case ('inf','Inf','Infinity')
     ...
end select   
loiseaujc commented 1 month ago
  • the Fortran standard has norm2 that works for both arrays and matrices with the optional dim specification. We should find a way to be sure our notation is not easily confused with this.

I shall be able to start working on this by the end of next week. I'll start with the baseline implementation and will see from there how we can improve and make sure it is not confounded with norm2.

  • We should find a unique way to express the norm type. For example, could it be that we require a character input for all norms, instead of a numeric one? I'm saying so because I'd personally like more having to deal with something like norm(x, '2'), norm(x, 'L2'), norm(x, 'inf') rather than having to do norm(x, ieee_value(0.0, ieee_positive_inf))). So I think the code would be clear for both the user and the developer:
character(*), optional, intent(in) :: norm_type
select case (norm_type)
   case ('2','Euclidean')
      ...
   case ('inf','Inf','Infinity')
     ...
end select   

I like this.

  • For the (later) specialized matrix norms, I would think that an optional additional flag like is_hessenberg=.true. or matrix_form='Hessenberg' is probably the most performant design, because we can't really check if the matrix complies to any special properties at all times, it would be too expensive.

Agreed. I prefer the matrix_form = "Hessenberg" rather than is_hessenberg = .true.. I may be naïve but the second would incur a lot of optional args (is_hessenberg, is_symmetric, is_triangular, etc) of which only one can be set to .true.. The matrix_form = string seems easier to handle and if not specified, opnorm defaults to the computation using the general matrix approach.

perazz commented 1 month ago

I prefer the matrix_form = "Hessenberg"

Absolutely agree, and it would be more in line with the norm_type argument

loiseaujc commented 1 week ago

Here is a draft/hacky implementation of norm I'd like to discuss a bit before drafting the PR.

    #:for rk, rt, ri in RC_KINDS_TYPES
    #:if rk !="xdp"
    module function stdlib_linalg_norm_${ri}$(x, which) result(out)
        !> Input vector x[n]
        ${rt}$, intent(in) :: x(:)
        !> [optional] Which vector norm is being computed.
        character(len=*), optional, intent(in) :: which
        !> Norm of the vector.
        real(${rk}$) :: out

        !> Internal variables.
        integer(ilp) :: n, idx
        character(len=:), allocatable :: which_

        !> Dimension of the vector.
        n = size(x, kind=ilp)

        !> Default: Euclidean 2-norm.
        which_ = optval(which, "2") ; which_ = to_lower(which_)

        !> Dispatch computation to the appropriate driver.
        select case (which_)
            case ("2", "euclidean", "l2")
                out = nrm2(n, x, 1_ilp)
            case ("1", "l1")
                #:if rt[0] == "r"
                out = stdlib_${rk[0]}$asum(n, x, 1_ilp)
                #:elif rt[0] == "c" and rk[0] == "s"
                out = stdlib_scsum1(n, x, 1_ilp)
                #:elif rt[0] == "c" and rk[0] == "d"
                out = stdlib_dzsum1(n, x, 1_ilp)
                #:elif rt[0] == "c" and rk[0] == "q"
                out = stdlib_qzsum1(n, x, 1_ilp)
                #:endif
            case ("inf")
                #:if rt[0] == "r"
                idx = stdlib_i${rk[0]}$amax(n, x, 1_ilp)
                #:else
                idx = stdlib_i${ri[0]}$max1(n, x, 1_ilp)
                #:endif
                out = abs(x(idx))
            case default
                stop
        end select
    end function stdlib_linalg_norm_${ri}$

    #:endif
    #:endfor

After having extended the interface for nrm2 for complex-valued vectors, the code compiles correctly. I haven't yet implemented the tests though, but I don't see any obvious errors. I do have a few questions though to polish it and make it more fypp-/stdlib-compliant. There are a couple of things I'm not quite sure how to handle properly:

interface asum
    module function rsasum
    module function csasum
    ...
end interface

and

interface sum1
    module function rsasum
    module function csum1
    ...
end interface

asum for real-valued vectors would basically end up being referenced by two different interfaces. I'm not sure whether this is the cleanest thing to do although it would somehow make some sense from a mathematical point of view.

I'll wait for your feedback at the moment and will start implementing the tests and documentation in the mean time.

jalvesz commented 1 week ago

A simple option however would be to have a dedicated stdlib_linalgpnorm${ri}$ function for which which is actually a non-negative float or integer and put it under the same interface as `stdlib_linalgnorm${ri}$. What would be your take on this?

This sounds good! another option, if you prefere to keep a single interface:

use stdlib_str2num, only: to_num
use stdlib_error, only: error_stop
...
case default
    p = to_num( which_ , p ) 
    if( p > 2 .and. p < p_limit ) then !> define a limit p value
       !> compute the p-norm
    else
       call error_stop( "Invalid p-norm",code )
    end if

If you prefere to have an interface that takes an integer p value also exposed, I would then suggest to make this the base implementation that is also exposed to the user, and that the interface taking the character "just" handles the conversion and then call the one taking an integer value.

perazz commented 6 days ago

Here is a draft/hacky implementation of norm I'd like to discuss a bit before drafting the PR.

looks good so far!

  • At the moment, we have the ℓ2 norm (default), the ℓ1 norm as well as the infinity norm. In my original post, I discussed the possibility also to have a generic p-norm. Since which is of type character(len=*), we would need to check that it does not correspond to a case already handled by the dispatch, that which is one character-long (assuming we only consider integer p), convert the character to a float (either single or double) and then compute ∑|xi|pp. This seems unnecessarily complicated for a relatively edge case. A simple option however would be to have a dedicated stdlib_linalg_pnorm_${ri}$ function for which which is actually a non-negative float or integer and put it under the same interface as `stdlib_linalgnorm${ri}$. What would be your take on this?

What I like the most is to have a separate function to do the option parsing, so the input could be either integer or character, as @jalvesz suggested. Something like:

pure subroutine parse_norm(which, norm_type, err)
    character(*), intent(in) :: which
    integer(ilp), intent(out) :: norm_type
    type(linalg_state_type), intent(out) :: err

   select case (to_lower(which))
        case ("euclidean","2","l2")
              norm_type = NORM_TYPE_2
        case ("inf","infinity")
              norm_type = NORM_TYPE_INF
        case default
              p = to_num(which, p)
              if (p>2 .and. p<=NORM_TYPE_MAX) then 
                  norm_type = p
              else
                  norm_type = NORM_TYPE_ERROR
                  err = linalg_state_type("norm",LINALG_VALUE_ERROR,"invalid norm type: ",which)
              endif
    end select
end subroutine 

The rule could be to define parameters that are >1 for p-norms and <=0 for ther norms, i.e.:

integer(ilp), parameter :: NORM_TYPE_ERROR = -huge(0_ilp)
integer(ilp), parameter :: NORM_TYPE_FROB = -1
integer(ilp), parameter :: NORM_TYPE_INF = -2
integer(ilp), parameter :: NORM_TYPE_L1 = 1
integer(ilp), parameter :: NORM_TYPE_L2 = 2
integer(ilp), parameter :: NORM_TYPE_MAX = 100
  • Second point has to do with the 1-norm computation. Is there a more fypp-esque way to handle the different types? The problem I see is that while asum is defined for both real and complex, sum1 only is defined for complex. I hesitate on whether I should keep the code as it is or if we should create an interface for asum, as well as one for sum1 which under the hood also calls asum if x is real. Basically something like this (don't pay attention to the syntax)

For the BLAS/LAPACK functions that have different names for real/complex types, I've been using inline fypp declarations, i.e.:

https://github.com/fortran-lang/stdlib/blob/d996e436d38d8aa8b47749cc3d93d02793d9ae2f/src/stdlib_linalg_svd.fypp#L266

  • Last point finally is that I'm not sure how to properly handle the case where which is none of the cases covered by the dispatch (e.g. user made a typo).

You can look at the linalg_state_type model, that is the common way that all linear algebra routines handle exceptions, this is typically an optional, intent(out) argument that is internally handlead by a final call to:

https://github.com/fortran-lang/stdlib/blob/d996e436d38d8aa8b47749cc3d93d02793d9ae2f/src/stdlib_linalg_svd.fypp#L287

This forces the code to either return an error code (hopefully LINALG_SUCCESS) to the user, or stop the program on an exception.