wenjie2wang / splines2

Regression Spline Functions and Classes
https://wwenjie.org/splines2
GNU General Public License v3.0
42 stars 3 forks source link

Shouldn't integrals of M-Splines of order k return I-Splines of order k + 1? #6

Closed mihaiconstantin closed 3 years ago

mihaiconstantin commented 3 years ago

I sincerely hope you don't hate me for opening another issue, but I really like your package!

I want to give you some context as to why I am posting this.

In issues #1 and #4 I was very confused because I didn't understand how exactly to use these splines in the context of regression. My goal was to use I-Splines to fit monotone non-increasing or non-decreasing splines.

I found your package splines2 and the code of De Leeuw (2017) online and that helped to contrast and compare the approaches. I learned that you are constructing I-Splines based on Ramsay (1988) who uses M-Splines as a starting point. On the other side, De Leeuw (2017) constructs I-Splines using N-Splines as a starting point. Furthermore, both approaches are not approximate, but equivalent. So, I was confused when using the same configuration of knots I got different I-Splines.

It turns out that both Ramsay (1988) and De Leeuw (2017) use in fact B-Splines, but normalized differently. Ramsay (1988) uses B-Splines that are normalized such that they integrate to one (i.e., these are called M-Splines). Whereas for De Leeuw (2017), the B-Splines are normalized such that they sum to one (i.e., which he calls normalized B-Splines, or N-Splines). Also, Ramsay (1988) refers to N-Splines simply as B-Splines. This seems to also be the case with splines2::bSpline() which actually returns normalized B-Splines that sum to one, i.e., N-Splines.

Indeed, normalized B-Splines and M-Splines look the same, and we can see that they only differ with respect to the normalization.

# The data.
x = 1:10

# The inner knots.
innerknots = c(5)

# The degree.
degree = 2

# Creating normalized `B-Splines` and `M-Splines`.
b <- splines2::bSpline(x, knots = innerknots, degree = degree, intercept = TRUE)
m <- splines2::mSpline(x, knots = innerknots, degree = degree, intercept = TRUE)

# Plotting the spline bases.
matplot(b, type = "l", lwd = 2, main = paste0("Normalized B-Splines | degree = ", degree))
matplot(m, type = "l", lwd = 2, main = paste0("M-Splines | degree = ", degree))

image

Now, onto creating I-Splines, Ramsay (1988) says that they can be created by taking the integrals of M-Splines. Which is what the argument integral in splines2::mSpline() does, or what the function splines2::iSpline() provides.

# Taking integrals of `M-Splines` and creating `I-Splines`.
mi <- splines2::mSpline(x, knots = innerknots, degree = degree, intercept = TRUE, integral = TRUE)
i <- splines2::iSpline(x, knots = innerknots, degree = degree, intercept = TRUE)

# Plotting the spline basis.
matplot(mi, type = "l", lwd = 2, main = paste0("Integrals of M-Splines | degree = ", degree))
matplot(i, type = "l", lwd = 2, main = paste0("I-Splines | degree = ", degree))

image

De Leeuw (1988; p. 17) refers to a formula (see below) from De Boor (1976) and says "This shows that I-splines can be computed by using cumulative sums of B-spline values.". To be more precise, I believe this should read as "sums of normalized B-spline values".

image

There are two things to note. First, De Leeuw (2017) uses m to refer to the order of a spline while Ramsay (1988) refers to the degree. If I am not mistaken the relationship should be degree = order - 1. Second, and most importantly, the equation above shows that integrals of M-Splines of order m are equivalent to cumulative sums of normalized B-Splines of order m + 1.

Indeed, it seems that by the equation above we can obtain the same I-Splines. So, above I used the splines2::iSpline() to create I-Splines of degree = 2. Thus, if I want to use cumulative sums to get the same I-Splines, I need normalized B-splines that are one degree higher, i.e., degree = 2 + 1.

# Create normalized `B-Splines` of one degree higher.
b2 <- splines2::bSpline(x, knots = innerknots, degree = degree + 1, intercept = TRUE)

# Now take the cumulative sums as De Leeuw (2017) shows in the paper. 
# And remove the redundant column of zeros at the end of the basis matrix. 
ci <- 1 - t(apply(b2, 1, cumsum))
ci <- ci[, -ncol(ci)]

# And, indeed, these are the same splines as those provided by `splines2::iSpline()`.
all(round(i, 10) == round(ci, 10))

# Plot the spline bases.
matplot(i, type = "l", lwd = 2, main = paste0("I-Splines | degree = ", degree))
matplot(ci, type = "l", lwd = 2, main = paste0("Cumulative sums of normalized B-Splines | degree = ", degree))

image

I was happy when I saw this equivalence. My confusion came from the fact that I stupidly overlooked the m + 1 part in the equation, and this sent me down the rabbit hole of chasing innocent intercepts...

Yet, I have one final (i.e., I hope...) question for you. Ramsay (1988; p. 428) says "We shall use the term order k to refer to an M-spline of degree k - 1 or the associated I-spline of degree k.". I try to replicate this with splines2, but I can't seem to get observe the same. For example, I expect to be able to obtain the I-Splines above (i.e., stored in i), constructed with degree = 2, by taking integrals of M-Splines constructed with degree = 2 - 1. But that doesn't seem to be the case.

# Create `M-Splines` of one degree lower.
mspline <- splines2::mSpline(x, knots = innerknots, degree = degree - 1, intercept = TRUE, integral = TRUE)

# Plotting the `M-Splines`.
matplot(mspline, type = "l", lwd = 2, main = paste0("Integrals of M-Splines | degree = ", degree))

# But the `I-Splines` with `degree = 2` look like:
matplot(i, type = "l", lwd = 2, main = paste0("I-Splines | degree = ", degree))

image

My best guess is that splines2::mSpline() takes this into account. Is this the case, or is my expectation unjustified to begin with?

Thank you for this incredibly useful package. How can I cite your work in my paper?

wenjie2wang commented 3 years ago

I sincerely hope you don't hate me for opening another issue, but I really like your package!

I do not mind if you open more issues. It has been my pleasure to discuss with you.

Thanks for providing the detailed context of the question. You did a great job of following De Leeuw (2017) and Ramsay (1988).

Indeed, Ramsay (1988; p. 428) suggested

We shall use the term order k to refer to an M-spline of degree k - 1 or the associated I-spline of degree k.

which ​enables us to compute the number of complete basis functions by (order + number of internal knots) consistently. This convention also means that the M-splines of order k corresponds to the I-splines of order k. For the M-splines of degree k, the corresponding I-spline basis functions should have a polynomial degree (k + 1). The relationship order = degree + 1 does not hold for I-splines by following this convention.

However, it is tricky to follow the exact convention in the function interface because we have the argument degree instead of order. Therefore, we define the degree in iSpline() to be the degree of the associated M-splines instead of the actual polynomial degree. (See the documentation of the argument degree for iSpline().) This revised convention allows us to compute the number of complete basis functions by (degree + 1 + number of internal knots) consistently.

How can I cite your work in my paper?

Glad to hear that you find {splines2} useful. You may cite it by citation("splines2"). In fact, we have been working on a manuscript to introduce this package in detail. We are targeting a journal submission by end of this month. I will update the inst/CITATION file once the paper is published.

mihaiconstantin commented 3 years ago

I do not mind if you open more issues. It has been my pleasure to discuss with you.

Thank you, I appreciate and the same goes for me.

Therefore, we define the degree in iSpline() to be the degree of the associated M-splines instead of the actual polynomial degree. (See the documentation of the argument degree for iSpline().)

Oh, okay, it makes perfect sense, and, indeed, it is quite clear from the documentation for iSpline():

The degree of I-spline defined to be the degree of the associated M-spline instead of actual polynomial degree. For example, I-spline basis of degree 2 is defined as the integral of associated M-spline basis of degree 2.

Thanks for suggesting how to cite the package. I look forward to reading the paper once is published. Good luck! I will close this now.