Closed alejandro-ariza closed 4 years ago
Well, compute_penalty_matrix
its more like an auxiliary function, that can combine penalty matrices for several covariates in one penalty matrix, useful for regression. I am not even sure if it should be private.
To obtain the penalty matrix for one basis, first you need to know which kind of penalization you want. For now, we have only Tikhonov Regularization and L2 Regularization (which is an special case of the former). The penalty_matrix
method of the regularization object is the method that you have to call.
In Tikhonov regularization you have to choose a linear operator which will be penalized. The fda
library only have linear differential operators, so I assume that is what you want.
As an example, suppose that you want to compute the penalty matrix corresponding to the second derivative operator of a Fourier basis. You can then do:
from skfda.misc.regularization import TikhonovRegularization
from skfda.misc.operators import LinearDifferentialOperator
from skfda.representation.basis import Fourier
basis = Fourier(n_basis=3)
regularization = TikhonovRegularization(LinearDifferentialOperator(2))
regularization.penalty_matrix(basis)
Sorry if all of this is a bit more complicated than in fda
, but it was done to generalize to arbitrary linear operators, and to allow penalization for other objects, like numpy arrays and discretized functions.
Thanks @vnmabus!
I was definitely looking at the wrong method, I guess I'm struggling with some terminology differences between fda
in R and scikit-fda
, so I got lost. I appreciate your intuition about what I wanted to achieve by looking at fda::eval.penalty
. Sorry, that's was my best chance to explain it, since I'm not familiar with the maths behind functional data analysis.
So, I have compared my routine in R and the one I'm translating into Python, and I can confirm that these 2 operations return exactly the same penalty matrix:
# R
penalty = fda::eval.penalty(basis)
# Python
regularization = TikhonovRegularization(LinearDifferentialOperator(0))
penalty = regularization.penalty_matrix(basis)
The default linear differential operator in fda::eval.penalty
is cero, this is why is not explicitly added.
Thanks again, this allows me to continue translating my routine from R to Python, I was stuck.
One last point, LinearDifferentialOperator(0)
is the same as the identity operator. And the Tikhonov regularization with the identity operator is the same as the L2 regularization. Although probably you still want to use the linear differential operator for now, because it is faster right now (I have yet to optimize the identity operator).
Good to know, thanks!
Hey @vnmabus , Im trying to code up something similar, but instead of fda::eval.penalty
i'll be using fda::bsplinepen
.
library(fda)
knots = seq(from = 0, to = 1, length = k - degree + 1);
nknots = length(knots);
basisobj = fda::create.bspline.basis(
c(0, knots[nknots-1]), norder = degree + 1,
nbasis = k - 1, breaks = knots[-nknots]);
P = fda::bsplinepen(basisobj, Lfdobj = diffMatrixOrder)
TBH im not that familiar with either the R fda package, or this Python fda package.
Ill try the same suggestion you provided above, but was wondering if you have any suggestions before I proceed trying to code up this snippet in python using your TikhonovRegularization.penalty_matrix
!
Thanks :)
Apparently fda::eval.penalty
is just a wrapper that calls a more specific function (such as fda::bsplinepen
) depending on the basis type. Thus, the suggested Python snippet should still work for you.
Note that in recent versions of the package the names Fourier
and TikhonovRegularization
are deprecated (but still work) and the names FourierBasis
(or, in your case, BSplineBasis
) and L2Regularization
should be used instead in new code.
Thanks for the pointers! I think I'm close, but maybe I have misunderstood how to set up the L2Regularization
object...
The penalty matrix I get using the R
vs Python
FDA packages for a set of equally spaced knots:
I think the crux of the difference may lie in the LinearDifferentialOperator(2)
vs Lfdobj = 2
:
operator = skfda.misc.operators.LinearDifferentialOperator(2)
regularization = skfda.misc.regularization.L2Regularization(operator)
p = regularization.penalty_matrix(basis)
VS
P = fda::bsplinepen(basisobj, Lfdobj = 2)
NOTE: this does require `rpy2` to be installed ```python import rpy2.robjects as robjects from rpy2.robjects.packages import importr import numpy as np import matplotlib.pyplot as plt import skfda from matplotlib.colors import TwoSlopeNorm def get_r_penalty_matrix(k, degree, diffMatrixOrder): """ library(fda) knots = seq(from = 0, to = 1, length = k - degree + 1); nknots = length(knots); basisobj = fda::create.bspline.basis( c(0, knots[nknots-1]), norder = degree + 1, nbasis = k - 1, breaks = knots[-nknots]); P = fda::bsplinepen(basisobj, Lfdobj = diffMatrixOrder) """ importr('fda') robjects.r(f'k <- {k}') robjects.r(f'degree <- {degree}') robjects.r(f'diffMatrixOrder <- {diffMatrixOrder}') robjects.r('knots <- seq(from = 0, to = 1, length = k - degree + 1)') robjects.r('nknots <- length(knots)') robjects.r( 'basisobj <- fda::create.bspline.basis(c(0, knots[nknots-1]), norder = degree + 1, nbasis = k - 1, breaks = knots[-nknots])') robjects.r("plot(basisobj, xlab='t', ylab='Bspline basis functions B(t)', las=1, lwd=2)") robjects.r("savePlot('r_basisobj.png', type='png')") robjects.r('P <- fda::bsplinepen(basisobj, Lfdobj = diffMatrixOrder)') p = robjects.r['P'] return p def get_py_penalty_matrix(k, degree, diffMatrixOrder): knots = np.linspace(0, 1, k - degree + 1) basis = skfda.representation.basis.BSplineBasis(knots=knots, order=degree) basis.plot() plt.savefig('python_basisobj.png') operator = skfda.misc.operators.LinearDifferentialOperator(diffMatrixOrder) regularization = skfda.misc.regularization.L2Regularization(operator) p = regularization.penalty_matrix(basis) return p if __name__ == '__main__': k = 30 degree = 3 diffMatrixOrder = 2 r_p = get_r_penalty_matrix(k, degree, diffMatrixOrder) py_p = get_py_penalty_matrix(k, degree, diffMatrixOrder) fig, ax = plt.subplots(1, 2) for i, (label, matrix) in enumerate(zip(('r', 'python'), (r_p, py_p))): matrix = matrix / np.max(matrix) ax[i].set_title(label) norm = TwoSlopeNorm(vmin=matrix.min(), vcenter=0, vmax=matrix.max()) im = ax[i].pcolor(matrix, ec='tab:gray', lw=0.005, cmap='bwr', norm=norm, antialiased=False) fig.colorbar(im, ax=ax[i], orientation='horizontal') plt.tight_layout() plt.savefig('penalty_matrix.png') ```
Do you have any suggestions on what I can change in my py
code to make it match the r
penalty matrix?
Thanks again :D
I think the problem is in order=degree
. It should be order=degree+1
, just as in the R case. Changing that both plots are similar in my computer.
Thanks! For future reference, I had to make one additional change to the py code to get the same number of basis elements (num knots from k - degree+1
--> k - degree
). Otherwise, the size of the penalty matrix won't match up.
knots = np.linspace(0, 1, k - degree )
basis = skfda.representation.basis.BSplineBasis(knots=knots, order=degree+1)
```python import rpy2.robjects as robjects from rpy2.robjects.packages import importr import numpy as np import matplotlib.pyplot as plt import skfda from matplotlib.colors import TwoSlopeNorm def get_r_penalty_matrix(k, degree, diffMatrixOrder): """ library(fda) knots = seq(from = 0, to = 1, length = k - degree + 1); nknots = length(knots); basisobj = fda::create.bspline.basis( c(0, knots[nknots-1]), norder = degree + 1, nbasis = k - 1, breaks = knots[-nknots]); P = fda::bsplinepen(basisobj, Lfdobj = diffMatrixOrder) """ importr('fda') robjects.r(f'k <- {k}') robjects.r(f'degree <- {degree}') robjects.r(f'diffMatrixOrder <- {diffMatrixOrder}') robjects.r('knots <- seq(from = 0, to = 1, length = k - degree + 1)') robjects.r('nknots <- length(knots)') robjects.r( 'basisobj <- fda::create.bspline.basis(c(0, knots[nknots-1]), norder = degree + 1, nbasis = k - 1, breaks = knots[-nknots])') robjects.r("plot(basisobj, knots=FALSE, las=1, lwd=2, main=sprintf('Num basis = %d' ,basisobj$nbasis))") robjects.r("savePlot('r_basisobj.png', type='png')") robjects.r('P <- fda::bsplinepen(basisobj, Lfdobj = 1)') p = robjects.r['P'] return p def get_py_penalty_matrix(k, degree, diffMatrixOrder): knots = np.linspace(0, 1, k - degree ) basis = skfda.representation.basis.BSplineBasis(knots=knots, order=degree+1) basis.plot() plt.gcf().suptitle(f'Num basis = {len(basis)}') plt.savefig('python_basisobj.png') operator = skfda.misc.operators.LinearDifferentialOperator(diffMatrixOrder) regularization = skfda.misc.regularization.L2Regularization(operator) p = regularization.penalty_matrix(basis) return p if __name__ == '__main__': k = 10 degree = 3 diffMatrixOrder = 1 r_p = get_r_penalty_matrix(k, degree, diffMatrixOrder) py_p = get_py_penalty_matrix(k, degree, diffMatrixOrder) fig, ax = plt.subplots(1, 2) for i, (label, matrix) in enumerate(zip(('r', 'python'), (r_p, py_p))): matrix = matrix / np.max(matrix) ax[i].set_title(label) vals = sorted([matrix.min(), 0, matrix.max()]) # if vals are unique, then vcenter is 0 if len(set(vals)) == 3: norm = TwoSlopeNorm(vmin=vals[0], vcenter=vals[1], vmax=vals[2]) else: norm = TwoSlopeNorm(vmin=vals[0]-1, vcenter=0, vmax=vals[2]) im = ax[i].pcolor(matrix, ec='tab:gray', lw=0.005, cmap='bwr', norm=norm, antialiased=False) fig.colorbar(im, ax=ax[i], orientation='horizontal') plt.tight_layout() plt.savefig('penalty_matrix.png') ```
Hi there,
It seems compute_penalty_matrix is the equivalent of fda::eval.penalty in R?
Similarly to the R package, I'm passing a basis type object (bspline) in compute_penalty_matrix, but there are 2 new arguments unknown for me:
regularization_parameter
andregularization
, so I'm struggling to use it. I would really appreciate if someone could provide more details in the docstring to use this function or provide an usage example, please.Apologies in advance if there is something obvious I'm missing!
Thanks.