khliland / pls

The pls R package
Multiple Y variables #19

Closed arnabchakrabarty closed 5 years ago

arnabchakrabarty commented 5 years ago

More of a question than an issue. I am unable to figure out how to use multiple Y variables to develop a PLS model for later predictions with new sets of X. The data file (header) has the following format: X1 X2 X3 ... Y1 Y2 Y3 and so on. I have been trying to use mvr and cppls, but neither has worked so far.

bhmevik commented 5 years ago

The best way to do that is to combine the responses (Y1, Y2, Y3) into a matrix in the data frame, and use that as the response variable. (It is also a good idea to combine the predictors (X1, X2, ...) into a matrix and use that instead of using the individual variables.) There are several ways to do this, and they are described in the vignette that comes with the package; do vignette('pls-manual') and see the section about formulas and data frames. In your case (assuming your data is in a data frame "mydata", and there are n X variables), something like this should work:

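library(pls)
X <- as.matrix(mydata[, 1:n])     # the first n columns are the predictors
Y <- as.matrix(mydata[, -(1:n)])  # the remaining columns are the responses
# I() keeps each matrix as a single column; without it, data.frame() splits them up
mydata2 <- data.frame(X = I(X), Y = I(Y))
plsr(Y ~ X, data = mydata2) # plus other arguments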
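
Since the question also asks about later predictions with new sets of X, here is a minimal sketch of the whole round trip on simulated data. The data, the names train, Xnew and newdata, and the choice ncomp = 2 are assumptions made up for illustration; they are not from the original post. The new data frame needs a matrix column with the same name (X) and the same number of columns as the one used for fitting.

```r
library(pls)

set.seed(1)
n <- 4                                    # number of X variables (assumed)
X <- matrix(rnorm(30 * n), nrow = 30)     # simulated predictors
B <- matrix(rnorm(n * 3), nrow = n)       # simulated true coefficients
Y <- X %*% B + matrix(rnorm(30 * 3, sd = 0.1), nrow = 30)  # three responses
colnames(X) <- paste0("X", 1:n)
colnames(Y) <- paste0("Y", 1:3)

# Store each matrix as a single column of the data frame
train <- data.frame(X = I(X), Y = I(Y))
fit <- plsr(Y ~ X, ncomp = 2, data = train)

# New observations: same column layout as the training X, no Y needed
Xnew <- matrix(rnorm(5 * n), nrow = 5, dimnames = list(NULL, colnames(X)))
newdata <- data.frame(X = I(Xnew))

# Predicted responses using 2 components
pred <- predict(fit, ncomp = 2, newdata = newdata)
dim(pred)   # observations x responses x number of ncomp values requested
```

In practice the number of components would normally be chosen by cross-validation (for example with the validation argument of plsr) rather than fixed in advance.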