khliland / pls

The pls R package

A question about PLS Scores #26

Closed cjgs1993 closed 4 years ago

cjgs1993 commented 4 years ago

Hello,

I understand that in PLS the scores are computed from the deflated X matrix. However, when I calculate the scores manually, they do not match the output of scores(model) from the pls package. I have been calculating them as t1 = X1 w1, t2 = (X1 - X1 w1 p1') w2, etc. I have been able to confirm that t2 = (X1 - X1 w1 p1') w2 = X1 w*2, but I still cannot reproduce the scores returned by scores(model) in R. What am I missing or doing wrong? Thanks!

khliland commented 4 years ago

Hi cjgs1993. If you replace your loading weights (W) and loadings (P) with the projection vectors (R), you can calculate the scores directly as T = X%*%R (given a centred X). For example:

library(pls)
data(gasoline)
mod <- plsr(octane~NIR, ncomp=5, data=gasoline)
W <- mod$loading.weights
P <- mod$loadings
WPW <- W %*% solve(crossprod(P, W)) # Projections, post-calc.
R <- mod$projection                 # Also projections
head(WPW)
head(R)

X <- gasoline$NIR
X <- X - rep(colMeans(X), each=nrow(X)) # Centre the columns of X
T <- X%*%R                              # Scores from centred X
head(scores(mod))
head(T)                                 # Should match scores(mod)

Best regards, Kristian
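
As a cross-check on the reply above, the deflation recipe from the question and the projection shortcut can be compared side by side. This is only a minimal sketch: it assumes the default plsr() settings (centring, no scaling) and that the stored loading weights and loadings satisfy t_a = X_a w_a with X_(a+1) = X_a - t_a p_a'. The names Xa, Tdefl and Tproj are illustrative and not part of the package.

```r
library(pls)
data(gasoline)

mod <- plsr(octane ~ NIR, ncomp = 5, data = gasoline)
W <- unclass(mod$loading.weights)
P <- unclass(mod$loadings)

# Centre X only (no scaling), as the default plsr() call does
X <- scale(unclass(gasoline$NIR), center = TRUE, scale = FALSE)

# Scores by explicit deflation: t_a = X_a w_a, then X_{a+1} = X_a - t_a p_a'
Xa <- X
Tdefl <- matrix(0, nrow(X), ncol(W))
for (a in seq_len(ncol(W))) {
  Tdefl[, a] <- Xa %*% W[, a]
  Xa <- Xa - tcrossprod(Tdefl[, a], P[, a])
}

# Scores by projecting the undeflated, centred X
Tproj <- X %*% unclass(mod$projection)

# Both routes are compared with the scores stored in the model
all.equal(Tdefl, unclass(scores(mod)), check.attributes = FALSE)
all.equal(Tproj, unclass(scores(mod)), check.attributes = FALSE)
```

If both comparisons return TRUE, the manual deflation and X%*%R agree with scores(mod), and any remaining mismatch is likely to come from how X was preprocessed rather than from the score formula.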

cjgs1993 commented 4 years ago

Oh, I see where I went wrong: I was mean centering AND stdev scaling my X before computing T = X%*%R. Is there a reason why the scaling is not performed, i.e. why the scores returned by scores(mod) are not scaled, even though center = TRUE and scaling = TRUE were specified in plsr()?

Thanks!

Christian

khliland commented 4 years ago

Hi Christian,

Scaling is only active if you use scale = TRUE. The argument is named scale, not scaling, so please double-check the spelling. I extended the example from above to show the effect:

library(pls)
data(gasoline)
mod <- plsr(octane~NIR, ncomp=5, data=gasoline, scale=TRUE)
W <- mod$loading.weights
P <- mod$loadings
WPW <- W %*% solve(crossprod(P, W)) # Projections, post-calc.
R <- mod$projection                 # Also projections
head(WPW)
head(R)

X <- gasoline$NIR
Xscaled <- scale(X)                     # Centre and scale to unit sd
Tscaled <- Xscaled%*%R                  # Scores from scaled X
X <- X - rep(colMeans(X), each=nrow(X)) # Centre only
T <- X%*%R                              # Scores from unscaled X
head(scores(mod))
head(Tscaled)
head(T) # Wrong with unscaled X, when model used scale=TRUE

Regards, Kristian
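
To avoid redoing the preprocessing by hand, the scaled scores can also be rebuilt from what the fitted model stores. The sketch below assumes that a model fitted with scale = TRUE keeps the column standard deviations in mod$scale and that predict() accepts type = "scores". The names Xs, Tmanual and Tpred are illustrative only.

```r
library(pls)
data(gasoline)

mod <- plsr(octane ~ NIR, ncomp = 5, data = gasoline, scale = TRUE)
X <- unclass(gasoline$NIR)

# Divide by the standard deviations stored in the model, then centre
Xs <- sweep(X, 2, mod$scale, "/")
Xs <- sweep(Xs, 2, colMeans(Xs), "-")
Tmanual <- Xs %*% unclass(mod$projection)

# Let the package apply its stored preprocessing itself
Tpred <- predict(mod, newdata = gasoline, type = "scores")

# Both routes are compared with the scores stored in the model
all.equal(Tmanual, unclass(scores(mod)), check.attributes = FALSE)
all.equal(unclass(Tpred), unclass(scores(mod)), check.attributes = FALSE)
```

Going through predict() is probably the safer choice for new samples, since it reuses the training-set scaling and centring instead of statistics recomputed from the new data.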

cjgs1993 commented 4 years ago

Thank you for the clarification!

Cheers, Christian