Closed by wordsmith189 5 years ago
Hi, wordsmith189,
I believe there are only two variables (octane and NIR) in the dataset, whether you use gasoline or gasTrain.
> str(gasoline)
'data.frame':	60 obs. of  2 variables:
 $ octane: num  85.3 85.2 88.5 83.4 87.9 ...
 $ NIR   : 'AsIs' num [1:60, 1:401] -0.0502 -0.0442 -0.0469 -0.0467 -0.0509 ...
  ..- attr(*, "dimnames")=List of 2
  .. ..$ : chr  "1" "2" "3" "4" ...
  .. ..$ : chr  "900 nm" "902 nm" "904 nm" "906 nm" ...
There may be no difference between
(octane ~ NIR, data=gasTrain, ...)
and
(octane ~ ., data=gasTrain, ...)
I may be wrong; please correct me if my guess is incorrect.
Thanks in advance.
You are absolutely right. gasTrain is simply the first 50 rows of the gasoline data set, and that only has two variables, octane and NIR. In R, a formula like something ~ . is a shortcut for something ~ all + the + other + variables, so in this case, octane ~ . is equivalent to octane ~ NIR.
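To see the dot expansion without installing anything, here is a minimal base-R sketch. It does not use the real gasoline data: toy is a hypothetical stand-in with the same shape (one numeric response plus one 'AsIs' matrix column), and temp is an invented extra column that only shows what the dot picks up when more variables are present.

```r
# Hypothetical stand-in for gasTrain: a response plus one matrix-valued column.
set.seed(1)
toy <- data.frame(octane = rnorm(5))
toy$NIR <- I(matrix(rnorm(15), nrow = 5))  # 'AsIs' matrix column, like gasoline$NIR

# terms() expands the dot against the columns of `data`.
dot_labels      <- attr(terms(octane ~ ., data = toy), "term.labels")
explicit_labels <- attr(terms(octane ~ NIR, data = toy), "term.labels")
print(dot_labels)
print(explicit_labels)

# Both formulas therefore build the same design matrix.
mm_dot      <- model.matrix(octane ~ ., data = toy)
mm_explicit <- model.matrix(octane ~ NIR, data = toy)
print(all.equal(mm_dot, mm_explicit, check.attributes = FALSE))

# With an extra (invented) column, the dot now pulls it in as well,
# so the two spellings would no longer describe the same model.
toy_wide <- toy
toy_wide$temp <- rnorm(5)
wider_labels <- attr(terms(octane ~ ., data = toy_wide), "term.labels")
print(wider_labels)
```

The practical upshot is that naming NIR explicitly keeps the model stable if columns are later added to the data frame, while the dot silently includes them.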
Got it. Thank you both!
In your JStatSoft article about pls, you show model calls with a formula like octane ~ NIR, where the dependent variable octane seems to be explained by one of the independent variables, NIR. However, some of the online tutorials for pls don't name an independent variable; they simply write octane ~ . instead. Can you explain what the difference is, or rather, what effect making one of the independent variables explicit in the model call has?