perazz / fortran-lapack

Modularized Fortran LAPACK implementation
BSD 3-Clause "New" or "Revised" License
34 stars 3 forks source link

API style definition #44

Open perazz opened 8 months ago

perazz commented 8 months ago

See https://fortran-lang.discourse.group/t/linear-algebra-api-call-for-feedback/7455/3

Ideally, we would like for each API operation:

The function and operator versions should be made pure where possible

PierUgit commented 8 months ago

Excellent approach!

I'm wondering if the function versions should have an error handler at all, or just abort in case of any error? The problem with the error handler is that the functions can't be pure (or event better once F2023 will be widely implemented, simple), thus preventing some compiler optimizations.

everythingfunctional commented 8 months ago

Here's how I would recommend to structure the API, if you are set on having a function version with an error argument.

The subroutine version should use the present tense verb form for the name (i.e. solve), while the function version should use the past tense verb or noun form (i.e. solved). Thus the interfaces would look something like the following.

interface invert
  module procedure invert_sp
  ...
end interface

interface inverse
  module procedure inverse_sp_with_err
  module procedure inverse_sp_pure
  ...
end interface

interface operator(.invert.)
  module procedure inverse_sp_pure
  ...
end interface

interface
  pure module subroutine invert_sp(A, err)
    real, intent(inout) :: A(:,:)
    type(error), intent(out), optional :: err
  end subroutine
end interface
contains
function inverse_sp_with_err(A, err) result(inverted)
  real, intent(in) :: A(:,:)
  type(error), intent(out) :: err
  real, allocatable :: inverted(:,:)
  inverted = A
  call invert(inverted, err)
end function
pure function inverse_sp_pure(A) result(inverted)
  real, intent(in) :: A(:,:)
  real, allocatable :: inverted(:,:)
  inverted = A
  call invert(inverted)
end function

That way you only have to implement the algorithm once, in the subroutine, and using error stop it can be pure. For the people that are trying to be memory efficient or handle error conditions, they can call the subroutine. For the people that like to live dangerously, they can use the functional version with the error argument. For the people that want purity they can use the functional version without the error argument, or the operator.

jalvesz commented 8 months ago

Here just a few ideas around what has been said: @perazz

a subroutine version, that performs no internal allocations, and intent(inout) arguments where necessary

@everythingfunctional

a subroutine version that (optionally) performs no internal allocations

So, I think that more that "no internal allocations", that can be needed for internal working arrays, for me, the point is that a procedure meant to crunch values but not reshape them (not expected to modify the data layout) should not reclaim property of the data to maximize performance in terms of allocations... Unless the user wants to use a dynamical version that can be built on top of the reference one.

Just to show the idea, take this toy example for inverting a 2x2 matrix:

module linalg
    use iso_fortran_env, only: dp=>real64
    implicit none
    private

    public :: invert, inverse

    interface invert
        module procedure invert_base_dp
        module procedure invert_dp
    end interface

    interface inverse
        module procedure inverse_dp
    end interface

contains

    subroutine invert_base_dp(A,Ainv) !> base implementation with no allocations
        integer, parameter :: wp = dp
        real(wp), intent(in) :: A(:,:)
        real(wp), intent(inout) :: Ainv(:,:)

        real(wp) :: det
        !-----------------------------------
        if(size(A,dim=1).ne.2)return
        det = 1._wp/(A(1,1)*A(2,2)-A(1,2)*A(2,1))
        Ainv(1,1) =  det*A(2,2) ; Ainv(1,2) = -det*A(1,2)
        Ainv(2,1) = -det*A(2,1) ; Ainv(2,2) =  det*A(1,1)
    end subroutine

    subroutine invert_dp(A) !> in-place
        integer, parameter :: wp = dp
        real(wp), intent(inout) :: A(:,:)

        real(wp), allocatable :: Atemp(:,:)
        !----------------------------------
        allocate(Atemp(size(A,dim=1),size(A,dim=2)))
        call invert_base_dp(A,Atemp)
        A = Atemp
    end subroutine

    function inverse_dp(A) result(Ainv)
        integer, parameter :: wp = dp
        real(wp), intent(inout) :: A(:,:)
        real(wp), allocatable :: Ainv(:,:)

        allocate(Ainv(size(A,dim=1),size(A,dim=2)))
        call invert_base_dp(A,Ainv)
    end function

end module

program main
use linalg
real(8) :: A(2,2), B(2,2)
A(1,1) = 1; A(1,2) = 2
A(2,1) = 3; A(2,2) = 4

base : block ! The user shall give the required buffers because he knows he wants to recycle it
    real(8) :: B(2,2)
    print *, 'calling base implementation using static data'
    call invert(A,B)
    print *, B
end block base

inplace : block ! The user just wants to handle his matrix data and doesnt care of keeping the original matrix
                        ! internal allocation will happen to provide in-place inversion
    real(8) :: B(2,2)
    B = A
    print *, 'calling in-place inversion'
    call invert(B)
    print *, B
end block inplace

functional : block
    real(8), allocatable :: B(:,:)
    real(8) :: C(2,2)
    print *, 'calling functional inversion'
    B = inverse(A)
    C = inverse(B)
    print *, B
    print *, C
end block functional

end program

Then, I think that the function interface is nice in style and spirit, but I'm not sure if it would be so much of a priority. Or, maybe functions can be used simply for returning the error handler ?

perazz commented 8 months ago

What constrains us is the LAPACK backend, that destroys input data in most (not all) of our use cases. Made sense, in the old days where memory was very expensive.

So for example in your inversion example, we will still need to create a copy of the original data anyways:

invert_base_dp(A,Ainv) !> base implementation with no allocations
![...]
Ainv = A
call getrf(...,Ainv,...)

which makes the subroutine version exactly equivalent to the function one, except for storage that has to be allocated outside of the subroutine