Wavelets: periodicity / symmetry constraints cause boundary artefacts

fabian-s commented 4 years ago

cf. #88 :

can this be avoided / remedied somehow, e.g. by padding the data at the ends of the domain?

SvenLorenz commented 4 years ago

## Computes straight line between first and last point and then subtracts that line from the original data
remove_slope <- function(x, y) {
  last_element <- length(x)
  slope <- (y[last_element] - y[1]) / (x[last_element] - x[1])
  intercept <- x[1] - slope * x[1]
  f <- intercept + slope * x
  y_desloped <- y - f
  y_desloped <- structure(y_desloped,
                          slope = f,
                          original_series = y)
  y_desloped
}

## Test with removing the slope
f <- function(t) 2*t + dbeta(t, 5, 7) * ifelse(t < .5, -1, 1) - .2 * dnorm(t, 0.2, 0.05)

t <- seq(0, 1, l = 2^10)
y <- f(t) + rnorm(length(t), sd = .5)

y_t <- remove_slope(1:1024, y)

Z <- cbind(1, ZDaub(t))
coef_qr <- qr.coef(qr(Z), y_t)
## Compute directly with the estimation
yhat <- cbind(Z, attr(y_t, "slope")) %*% c(coef_qr, 1)

## Add afterwards
yhat_1 <- Z %*% coef_qr
y_corrected <- yhat_1 + attr(y_t, "slope")

all.equal(yhat, y_corrected) ##TRUE

plot(y, pch = 20, cex = .1)
lines(yhat, col = "red", lwd = 2)

removed slope

This should work. I'm going to implement the first solution since it works best with tf_evaluate.

SvenLorenz commented 4 years ago

set.seed(1234)
f <- function(t) 2*t + dbeta(t, 5, 7) * ifelse(t < .5, -1, 1) - .2 * dnorm(t, 0.2, 0.05)

t <- seq(0, 1, l = 2^10)
y <- f(t) + rnorm(length(t), sd = .5)

y_t <- remove_slope(1:1024, y)

Z <- cbind(1, ZDaub(1:2^10, numLevels = 6), attr(y_t, "slope"))

coef_qr <- qr.coef(qr(Z), y_t)
yhat <- Z %*% coef_qr

plot(y, pch = 20, cex = .1)
lines(yhat, col = "orange", lwd = 2)

Estimating coefficients for the slope doesn't seem to work as well as just subtracting and adding later.

fabian-s commented 4 years ago

@SvenLorenz naah, it works fine, but you need
coef_qr <- qr.coef(qr(Z), y),
not
coef_qr <- qr.coef(qr(Z), y_t):

devtools::load_all(".")

set.seed(1234)
f <- function(t) 2*t + dbeta(t, 5, 7) * ifelse(t < .5, -1, 1) - .2 * dnorm(t, 0.2, 0.05)

t <- seq(0, 1, l = 2^10)
y <- f(t) + rnorm(length(t), sd = .5)

y_t <- remove_slope(1:1024, y)

Z <- cbind(1, ZDaub(1:2^10, numLevels = 6), attr(y_t, "slope"))

coef_qr_wrong <- qr.coef(qr(Z), y_t)
yhat_wrong <- Z %*% coef_qr_wrong

coef_qr <- qr.coef(qr(Z), y)
yhat <- Z %*% coef_qr

plot(y, pch = 20, cex = .1)
lines(yhat_wrong, col = "orange", lwd = 2)
lines(yhat, col = "red", lwd = 2)

fabian-s commented 4 years ago

PS: cv.glmnet seems to work nicely for thresholding (here at least):

# LASSO needs standardized design matrix
# (or use glmnet's penalty.factor arg....)
Z_lasso <- scale(cbind(ZDaub(1:2^10, numLevels = 6), t), center = FALSE)
coef_lasso <- coef(glmnet::cv.glmnet(Z_lasso, y), nlambda = 20)  #default is 100
yhat_lasso <- cbind(1, Z_lasso) %*% coef_lasso

plot(y, pch = 20, cex = .1)
lines(f(t), col = "darkgray", lwd = 4)
lines(yhat_wrong, col = "orange", lwd = 2)
lines(yhat, col = "red", lwd = 2)
lines(yhat_lasso, col = "cyan", lwd = 2)
lines(tfb(y, arg = 1:1024))

tidyfun / tidyfun

Wavelets: periodicity / symmetry constraints cause boundary artefacts #92