Open ivan-pi opened 2 years ago
Given how many times this has been re-implemented both in Fortran and other languages, I suspect that finding an interface which meets everyone's needs won't be easy. Hopefully we can reach an 80/20 compromise.
Here are some numbered items to help guide the discussion:
x and y arguments
y = p(0)
Regarding the interface, a function with intent(out) arguments should be a subroutine IMO.
Often the user will want to fit successively higher polynomial orders and choose the best order using an information criterion such as AIC. Ideally such fits would be done efficiently (there are special algorithms for successively adding predictors).
Least-squares regression by successively adding general basis functions (here they are powers) is sometimes done.
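To make the order-selection idea concrete, here is a minimal sketch of choosing the polynomial order by AIC. It assumes Gaussian errors, so that AIC reduces (up to an additive constant) to n*log(RSS/n) + 2k, and it counts only the k = order + 1 polynomial coefficients as parameters. The names aic_sketch, aic_gaussian and best_order are made up for illustration and are not part of any proposed stdlib interface. The residual sums of squares are taken as given; the sketch does not do the efficient incremental refitting mentioned above.

```fortran
module aic_sketch
  implicit none
  private
  public :: wp, aic_gaussian, best_order
  integer, parameter :: wp = kind(1.0d0)
contains

  ! AIC (up to an additive constant) of a least-squares fit with
  ! Gaussian errors. rss must be positive, otherwise log() fails.
  pure function aic_gaussian(rss, nobs, nparam) result(aic)
    real(wp), intent(in) :: rss      ! residual sum of squares
    integer,  intent(in) :: nobs     ! number of data points
    integer,  intent(in) :: nparam   ! number of fitted coefficients
    real(wp) :: aic
    aic = nobs*log(rss/nobs) + 2.0_wp*nparam
  end function aic_gaussian

  ! rss(k) holds the residual sum of squares of the fit of order k-1.
  ! Returns the order with the smallest AIC.
  pure function best_order(rss, nobs) result(order)
    real(wp), intent(in) :: rss(:)
    integer,  intent(in) :: nobs
    integer :: order
    real(wp) :: aic(size(rss))
    integer :: k
    do k = 1, size(rss)
      aic(k) = aic_gaussian(rss(k), nobs, k)
    end do
    order = minloc(aic, dim=1) - 1
  end function best_order

end module aic_sketch
```

A caller would fit orders 0, 1, 2, ... in turn, store each residual sum of squares in rss, and pass the array to best_order together with the number of data points.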
How about this as a simple interface?
program polyfit_example
  implicit none
  interface
    subroutine polyfit(x,y,n,p) bind(c,name="c_arma_polyfit")
      use iso_c_binding, only: c_double, c_int
      real(c_double), intent(in), contiguous :: x(:)
      real(c_double), intent(in), contiguous :: y(:)
      integer(c_int), intent(in), value :: n
      real(c_double), intent(out) :: p(:)
    end subroutine
  end interface
  integer, parameter :: dp = kind(1.0d0)
  real(dp) :: x(10), y(10)
  real(dp) :: p(3)
  integer :: i
  x = [((i-1)*0.2_dp,i=1,10)]
  y = x**2 - 2.0_dp*x + 3.0_dp
  call polyfit(x,y,2,p)
  print *, "coefficients = ", p
  print *, all(abs(p - [1.0_dp,-2.0_dp,3.0_dp]) < 100*epsilon(1.0_dp))
end program
For the implementation I just wrote a little wrapper of the polyfit function in the Armadillo C++ library:
Here's an alternative implementation using Eigen:
Often the user will want to fit successively higher polynomial orders and choose the best order using an information criterion such as AIC. Ideally such fits would be done efficiently (there are special algorithms for successively adding predictors).
Least-squares regression by successively adding general basis functions (here they are powers) is sometimes done.
This sounds like something the qrupdate library could be used for. I believe this library is used in the Octave qrupdate routine. It was written by Jaroslav Hajek, a computing expert from the Aeronautical Research and Test Institute (VZLU) in Prague. A GitLab mirror for qrupdate can be found here.
A prototype Fortran implementation I wrote 3 years ago can be found here: https://gist.github.com/ivan-pi/0fd517048c415ceca441eac626bf24c9
cc @arjenmarkus, @nshaffer, @jvdp1
We need a few more voices to move this issue forward. Given how many polynomial fitting codes are out there, it looks like polynomial regression is quite popular among practitioners.
Do you prefer the simple interface or the complex one?
! simple
subroutine polyfit(x,y,order,p)
  real(wp), intent(in) :: x(:), y(:)
  integer(ip), intent(in) :: order
  real(wp), intent(out) :: p(:)
end subroutine
! complex
subroutine polyfit(x, y, order, p, rcond, rank, singular_values)
  real(wp), intent(in) :: x(:), y(:)
  integer(ip), intent(in) :: order
  real(wp), intent(out) :: p(:)
  real(wp), intent(in), optional :: rcond
  integer(ip), intent(out), optional :: rank
  real(wp), intent(out), allocatable, optional :: singular_values(:)
end subroutine
Let me read the mail thread (again) and I will get back to your question tomorrow.
I would opt for the extended interface: it provides just that bit more output that might otherwise be forgotten about. For instance, being able to get back the condition number makes people aware that the condition number matters in fitting problems, even if they do not use it. By the way, I did not quite understand the argument "singular_values" - does this have to do with the SVD?
Motivation
Functions for low-order polynomial fitting using (weighted) least squares regression are common in many scientific libraries.
Here's an interface for the non-weighted case, inspired by the numpy version cited below:
Prior Art
Prior art in popular scripting languages:
- numpy.polyfit
- polyfit (MATLAB)
- Polynomials.polyfit (Julia)
- lm (R) (in R one would use a combination of lm for linear regression and poly as demonstrated here)
Prior art in Fortran (in no particular order):
- POLFIT - A 704 program for polynomial least squares fitting (Fortran II)
- LSQFT - A nonlinear least squares data fitting subroutine suitable for minicomputers
- dpolft - Fit discrete data in a least squares sense by polynomials (related routines include dpcoef and dp1vlu)
- RCURV (also see poly_regression in the C interface)
- e02adf and e02aef
- spfit (part of MATH77/mathc90 library, available on Netlib)
- LS_POLY routine by Jean-Pierre Moreau (part of the LSQPLY.F90 file)
Prior art in other programming languages:
- multifit (which may be used for any linear model)
Additional Information
This will probably require LAPACK to factorize the Vandermonde matrix, using either a rank-revealing QR or an SVD factorization.
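For illustration only, here is a minimal sketch of the Vandermonde least-squares approach that does not depend on LAPACK. It swaps in a plain modified Gram-Schmidt QR with no pivoting, so it is not rank-revealing and is not a substitute for a LAPACK-based implementation; it only shows the structure of the computation. The names polyfit_sketch and polyfit_mgs are hypothetical, and returning the coefficients highest power first is an assumption chosen to match the example program earlier in the thread.

```fortran
module polyfit_sketch
  implicit none
  private
  public :: wp, polyfit_mgs
  integer, parameter :: wp = kind(1.0d0)
contains

  ! Least-squares fit of a polynomial of degree `order` to (x, y).
  ! p(1) multiplies x**order and p(order+1) is the constant term.
  subroutine polyfit_mgs(x, y, order, p)
    real(wp), intent(in)  :: x(:), y(:)
    integer,  intent(in)  :: order
    real(wp), intent(out) :: p(:)
    real(wp) :: q(size(x), order+1), r(order+1, order+1), c(order+1)
    integer  :: j, k, n

    n = order + 1
    if (size(y) /= size(x) .or. size(p) /= n .or. size(x) < n) &
      error stop "polyfit_mgs: inconsistent array sizes"

    ! Vandermonde matrix: column j holds x**(order - j + 1)
    do j = 1, n
      q(:, j) = x**(n - j)
    end do

    ! Modified Gram-Schmidt: V = Q R, with Q overwriting V
    r = 0.0_wp
    do j = 1, n
      r(j, j) = norm2(q(:, j))
      if (r(j, j) <= tiny(1.0_wp)) &
        error stop "polyfit_mgs: rank-deficient Vandermonde matrix"
      q(:, j) = q(:, j) / r(j, j)
      do k = j + 1, n
        r(j, k) = dot_product(q(:, j), q(:, k))
        q(:, k) = q(:, k) - r(j, k) * q(:, j)
      end do
    end do

    ! Solve the triangular system R p = Q^T y by back substitution
    c = matmul(transpose(q), y)
    do j = n, 1, -1
      p(j) = (c(j) - dot_product(r(j, j+1:n), p(j+1:n))) / r(j, j)
    end do
  end subroutine polyfit_mgs

end module polyfit_sketch
```

Called with the data from the example program (x on a uniform grid, y = x**2 - 2x + 3, order 2), this should recover the coefficients 1, -2, 3 to within rounding. For high orders or badly scaled x the Vandermonde matrix becomes ill-conditioned, which is where a pivoted QR or SVD from LAPACK, together with outputs such as rank and singular values, would matter.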
A function to evaluate the polynomial (and its derivatives) using Horner's scheme (or alternatively, in some orthogonal form) should be added in parallel.
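As a rough sketch of what such an evaluation routine could look like, the following uses Horner's scheme to compute the value and the first derivative in a single pass. The names polyval_sketch and polyval_horner are placeholders, and the coefficient order (highest power first) is an assumption.

```fortran
module polyval_sketch
  implicit none
  private
  public :: wp, polyval_horner
  integer, parameter :: wp = kind(1.0d0)
contains

  ! Evaluate p(1)*x**n + p(2)*x**(n-1) + ... + p(n+1) and its first
  ! derivative at x using Horner's scheme.
  pure subroutine polyval_horner(p, x, val, dval)
    real(wp), intent(in)  :: p(:)
    real(wp), intent(in)  :: x
    real(wp), intent(out) :: val, dval
    integer :: i
    val  = 0.0_wp
    dval = 0.0_wp
    do i = 1, size(p)
      ! Update the derivative first: it needs the previous partial value
      dval = dval*x + val
      val  = val*x + p(i)
    end do
  end subroutine polyval_horner

end module polyval_sketch
```

For p = [1, -2, 3] and x = 2 this gives val = 3 and dval = 2, matching x**2 - 2x + 3 and 2x - 2.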
An advanced interface could consider using derived types for polynomials, similar to the numpy Polynomial sub-package.
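A derived-type interface might look roughly like the sketch below. Everything here is hypothetical: the type name polynomial, its component coef, and the bound procedures degree, eval and deriv are illustrative only. Coefficients are stored lowest power first, which is the convention of numpy's Polynomial class; the component is assumed to be allocated before use.

```fortran
module polynomial_type_sketch
  implicit none
  private
  public :: wp, polynomial
  integer, parameter :: wp = kind(1.0d0)

  type :: polynomial
    real(wp), allocatable :: coef(:)   ! coef(i) multiplies x**(i-1)
  contains
    procedure :: degree
    procedure :: eval
    procedure :: deriv
  end type polynomial

contains

  pure integer function degree(self)
    class(polynomial), intent(in) :: self
    degree = size(self%coef) - 1
  end function degree

  ! Horner evaluation; elemental so it also accepts arrays of x
  elemental function eval(self, x) result(y)
    class(polynomial), intent(in) :: self
    real(wp), intent(in) :: x
    real(wp) :: y
    integer :: i
    y = 0.0_wp
    do i = size(self%coef), 1, -1
      y = y*x + self%coef(i)
    end do
  end function eval

  ! Derivative polynomial; empty coefficient array for a constant
  pure function deriv(self) result(d)
    class(polynomial), intent(in) :: self
    type(polynomial) :: d
    integer :: i
    d%coef = [(i*self%coef(i+1), i = 1, size(self%coef) - 1)]
  end function deriv

end module polynomial_type_sketch
```

With this, p = polynomial([3.0_wp, -2.0_wp, 1.0_wp]) represents x**2 - 2x + 3, p%eval(2.0_wp) evaluates it, and p%deriv() returns a new polynomial object. A fitting routine could then return such an object instead of a bare coefficient array.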