khliland / pls

The pls R package

CPPLS - questions about Y.add #10

Closed jnguofa closed 5 years ago

jnguofa commented 5 years ago

Hello,

I am very interested in using CPPLS because I'd like to predict the Y response matrix more "aggressively".

However, I am not sure whether, or how best, to use Y.add.

In particular, I know that subject sex and data batch both strongly influence my observations of X. However, I want to avoid extracting any scores related to these features. Subject sex and data batch are NOT related to my Y response in any way.

My questions are:

  1. Can I use Y.add to "avoid" extracting scores relating to Y.add variables?
  2. In the mayonnaise data set, is the Y.add variable coded as a dummy variable? Is that necessary for Y.add?

Many thanks again for your wonderful help,

khliland commented 5 years ago

Hi jnguofa,

  1. Y.add widens the search space from which the loading weights are produced: candidate score vectors are created from both the predictive Y and Y.add before canonical correlation is applied, and the canonical correlation weights are then used to produce the final loading weights. As such, I do not see any way to use Y.add to avoid relations, only to expand or improve them.
  2. Y.add can take (basically) any shape or form, as long as it is a single, named object with the correct number of samples, contained in the same data.frame or list as X and Y. In practice, Y.add can contain continuous columns, dummy-coded categorical columns, or combinations thereof that are related to X, to Y, or to Y through X in some way.
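
To make point 2 concrete, here is a minimal sketch of how such a Y.add object might be assembled and passed to `cppls()`. The data are simulated and the names `age`, `batch`, `extra` and `dat` are hypothetical, chosen only for illustration; the sketch assumes a version of pls whose `cppls()` accepts the `Y.add` argument.

```r
library(pls)

set.seed(1)
n <- 60   # number of samples
p <- 20   # number of predictors

# Simulated data (hypothetical, for illustration only)
X     <- matrix(rnorm(n * p), n, p)
y     <- drop(X[, 1:3] %*% c(1, -1, 0.5)) + rnorm(n, sd = 0.3)
age   <- rnorm(n, mean = 40, sd = 10)                          # continuous extra variable
batch <- factor(sample(c("a", "b", "c"), n, replace = TRUE))   # categorical extra variable

# Dummy-code the factor (treatment coding, intercept column dropped)
batch_dummy <- model.matrix(~ batch)[, -1, drop = FALSE]

# Y.add as ONE named matrix with n rows: a continuous column plus dummy columns
extra <- cbind(age = age, batch_dummy)

# X, y and Y.add live in the same data frame; I() keeps each matrix as a single column
dat <- data.frame(y = y, X = I(X), extra = I(extra))

# Fit with and without the additional responses
fit_plain <- cppls(y ~ X, ncomp = 3, data = dat)
fit_yadd  <- cppls(y ~ X, ncomp = 3, data = dat, Y.add = extra)

# Scores matrix: one row per sample, one column per component
dim(scores(fit_yadd))
```

Comparing `fit_plain` and `fit_yadd` (for example with cross-validation) is one way to check whether the extra columns actually help prediction of Y. For the mayonnaise data, `str(mayonnaise$design)` shows how the object used as Y.add in the package example is stored.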