fortran-lang / stdlib

Fortran Standard Library
https://stdlib.fortran-lang.org
MIT License
1.05k stars 164 forks source link

Implement cubic splines interpolation #100

Open certik opened 4 years ago

certik commented 4 years ago

When some 1D data is given on a grid, and nothing more is known about it, then cubic splines are one of the best methods to interpolate it. It's relatively high order (compared to linear interpolation), so the result is smooth, but is not too high, so the result is not wiggly. In some sense, it's the optimum.

An example implementation: https://github.com/certik/fortran-utils/blob/b43bd24cd421509a5bc6d3b9c3eeae8ce856ed88/src/splines.f90

rweed commented 4 years ago

@certik, I have a version of the Fritsch/Carlson/Butland monotone cubic interpolation routines (PCHIP) that I refactored to modern Fortran (ie removed sphagetti code) from the original SLATEC code I can upload. Don't remember what the SLATEC license is though. Its set up to use REAL64 by default but I made buidling a REAL32 version a command line define option. I'll upload here if you want it.

jacobwilliams commented 4 years ago

I did the same thing! https://github.com/jacobwilliams/PCHIP

certik commented 4 years ago

Well, let's join the forces!

On Wed, Jan 8, 2020, at 9:00 PM, Jacob Williams wrote:

I did the same thing! https://github.com/jacobwilliams/PCHIP

— You are receiving this because you were mentioned. Reply to this email directly, view it on GitHub https://github.com/fortran-lang/stdlib/issues/100?email_source=notifications&email_token=AAAFAWHZVREA36NYIBSUSMTQ42ONFA5CNFSM4KENV3W2YY3PNVWWK3TUL52HS4DFVREXG43VMVBW63LNMVXHJKTDN5WW2ZLOORPWSZGOEIO3VHQ#issuecomment-572373662, or unsubscribe https://github.com/notifications/unsubscribe-auth/AAAFAWBQ5BAEOFL4UXKZ6WTQ42ONFANCNFSM4KENV3WQ.

ivan-pi commented 4 years ago

Intel MKL library has a Fortran 90 interface to a set of functions for data fitting. The API is however task-based and not that intuitive compared to Python's or MATLAB's syntax. (I suppose this is for performance purposes if you have many repeated calls using the same set of knots.)

If we define the (high) stdlib level API would it be possible to use submodules to have both a custom implementation/refactored SLATEC and a second backend calling the Intel MKL routines?

I know the SciPy library supports using Intel MKL in some of the routines, so it should be possible to do it here as well.

certik commented 4 years ago

Yes, we should try to optionally use MKL also.

On Tue, Jan 14, 2020, at 1:39 AM, Ivan wrote:

Intel MKL library has a Fortran 90 interface to a set of functions for data fitting https://software.intel.com/en-us/mkl-developer-reference-fortran-data-fitting-computational-routines#FC83C653-2C9E-481A-BAF9-588A6AA544BF. The API is however task-based and not that intuitive compared to Python's or MATLAB's syntax. (I suppose this is for performance purposes if you have many repeated calls using the same set of knots.)

If we define the (high) stdlib level API would it be possible to use submodules to have both a custom implementation/refactored SLATEC and a second backend calling the Intel MKL routines?

I know the SciPy library supports using Intel MKL in some of the routines, so it should be possible to do it here as well.

— You are receiving this because you were mentioned. Reply to this email directly, view it on GitHub https://github.com/fortran-lang/stdlib/issues/100?email_source=notifications&email_token=AAAFAWDSC2A7CNKYCLLCCIDQ5V22XA5CNFSM4KENV3W2YY3PNVWWK3TUL52HS4DFVREXG43VMVBW63LNMVXHJKTDN5WW2ZLOORPWSZGOEI3YPWQ#issuecomment-574064602, or unsubscribe https://github.com/notifications/unsubscribe-auth/AAAFAWH2W7WZFDFEI7GGWF3Q5V22XANCNFSM4KENV3WQ.

igirault commented 4 months ago

I see this problem has been left aside, so I would interested to handle it.

Here is a general proposal to be refined:

Splines are piecewise polynomials. So the first step would be to define a derived type spline_t representing a spline, that would be analogous to the scipy.interpolate class PPoly.

Here is what the most simplistic definition of spline_t would look like:

type spline_t
    private
    !> coefficients of the polynomial on each intervals
    real(dp), allocatable :: c(:, :)
    !> breakpoints defining the intervals
    real(dp), allocatable :: x(:)
    !> type of extrapolation when the spline evaluated outside its range of definition
    integer :: extrapolation_type
contains
    private
    procedure, public :: evaluate
end type

where the type-bound procedure evaluate allows to evaluate the spline at several points xnew:

pure function evaluate(self, xnew, extrapolation_type) result(ynew)
    class(spline_t), intent(in) :: self
    real(dp), intent(in) :: xnew(:)
    integer, optional :: extrapolation_type
    real(dp), allocatable :: ynew(:)
end function

The optional argument extrapolation_type to the procedure would enable to override the value of self%interpolation_type if necessary.

Then some other type-procedures could be implemented to compute the derivatives, integrate the spline, etc ... as in scipy's PPoly.

After that, any spline interpolation method (linear, cubic, pchip, etc) could be implemented by defining a function that takes the data to interpolate and returns the appropriate instance of spline_t. For any method, this function would have the following base interface

function get_XXX_spline(x, y, extrapolation_type) result(spline)
    real(dp), intent(in) :: x(:)
    real(dp), intent(in) :: y(:)
    integer, intent(in), optional :: extrapolation_type
    type(spline_t) :: spline
end function

with eventually aditionnal parameters depending on the interpolation method (bc_type for cubic spline for instance).

I omitted for now an equivalent of the scipy parameter axis, to evaluate a spline along one dimension of an array xnew whose rank is superior to 1. But this can be added if felt necessary.

What do you think ?

jvdp1 commented 4 months ago

Thank you @igirault for this proposal.

So the first step would be to define a derived type spline_t representing a spline, that would be analogous to the scipy.interpolate class PPoly.

Is a DT really needed? Would it be also possible to provide API that return the results in (multiple) arrays (e.g., as proposed in fortran-utils? A DT API could thereafter provided on top of them. Such a strategy would allow the user to use the approach that suits best his/her aim.

igirault commented 4 months ago

Is a DT really needed? Would it be also possible to provide API that return the results in (multiple) arrays (e.g., as proposed in fortran-utils? A DT API could thereafter provided on top of them. Such a strategy would allow the user to use the approach that suits best his/her aim.

Of course the DT API is not mandatory, but here are the advantages of exposing such API rather than a procedural one in my opinion:

Now of course if people feel that a procedural API is preferable, we can work in that direction first. Basically, the procedural API I would propose would follow the DT API described above, with subroutines either computing the members of spline_t or taking them as input.

jvdp1 commented 4 months ago

Thank you @igirault for your comment and answer. I agree with the advantages of DT API. Procedural API have been favored in stdlib whenever it was possible, with DT API built on top of them if possible. See e.g., #798 for such a similar strategy (although DT API are not provided). I think this is a nice case for such a strategy: to propose to users both procedural and DT API. Do you think that such an approach would be possible?

@fortran-lang/stdlib @perazz @certik How do you think about this?

perazz commented 4 months ago

I see the linked discussion at #103, I believe priority should be given to how these ideas were originally outlined.

That said, I think that derived type + functional (non-polymorphic) interface would be fastest, and most maintainable long term. Because the type-bound version comes basically for free, I don't see why it should not be an option as well. You may also want to derive some ideas from my fitpack port, that scipy uses for 1D and 2D interpolation.

igirault commented 4 months ago

Thank you @jvdp1 and @perazz for your answers

I think this is a nice case for such a strategy: to propose to users both procedural and DT API. Do you think that such an approach would be possible?

This is absolutely possible, I was just not aware that we were aiming at providing both types of API.

So here is a proposal for a procedural API, that will be easily wrapped in the above-described OO API. Suggestions for better procedure/arguments names are welcome.

General spline procedures

Those procedures are indepedant from the kind of the spline. The most important one is:

subroutine spline_evaluate(c, xi, x, y, nu, extrapolate)
  !> the parameters of the spline
  real(dp), intent(in) :: c(:, :)
  !> the breakpoints of the spline
  real(dp), intent(in) :: xi(:)
  !> the interpolation points
  real(dp), intent(in) :: x(:)
  !> the values to compute by interpolation
  real(dp), intent(out) :: y(:)
  !> order of the derivative to evaluate, by default nu = 0
  integer, intent(in), optional :: nu
  !> extrapolate : 0=no (default), 1=yes, 2=periodic
  integer, intent(in), optional :: extrapolate
end subroutine

Compared to fortran-utils' implementation, the extrapolate argument has been added to control the behavior of the procedure for points outside of the definition range, following scipy' API.

Another procedure could be defined to integrate the spline.

Procedures to compute the parameters of the spline

We shall start by something close to fortran-utils that implements cubic spline interpolation.

Here is a proposition:

subroutine spline3_get(x, y, c, bc_type, bc_val)
  !> the data points
  real, intent(in) :: x(:)
  !> the data values
  real, intent(in) :: y(:)
  !> the paramaters characterizing the spline
  real, intent(out) :: c(:, :)
  !> type of boundary condition at each end: bc_type(1) = type at left end, bc_type(2) = type at right end.
  !! 1 = specified 2nd derivative, 2 = 2nd derivative consistent with interpolating cubic (default).
  integer, intent(in), optional :: bc_type(2)
  ! bc_val(1) = value at left end, bc_val(2) = value at right end. bc_val=0.0 by default
  real, intent(in), optional :: bc_val(2)
end subroutine

In this proposal, the optional arguments match the options available in fortran-utils' spline3pars to control boundary conditions. But more options could be added following scipy's CubicSpline.

If other spline methods are implemented later on, similar subroutines XXX_get could be defined.