Closed (DobraVila closed 5 months ago)
Basically, X[ind.train, ] is used for training, and X[ind.test, ] for testing.
But because X is data stored on disk that we don't want to subset, I'm telling big_spLinReg() to use only ind.train = ind.train (the subset of rows that are accessed from X), and predict() to use only ind.test. That is why the same X is passed to both calls: the row indices, not the matrix, decide which data each step sees.
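To make the explanation above concrete, here is a minimal, self-contained sketch of the same call pattern. The sizes N and M, the simulated outcome y, and the 400/100 split are assumptions chosen purely for illustration; only FBM(), big_spLinReg() and predict() with the arguments shown come from the thread.

```r
library(bigstatsr)

set.seed(1)
N <- 500  # number of rows (assumed size, for illustration only)
M <- 50   # number of columns (assumed size, for illustration only)

# File-backed matrix: the data live on disk, not in memory
X <- FBM(N, M, init = rnorm(N * M, sd = 5))

# Simulated outcome driven by the first 5 columns (an assumption for this demo)
y <- rowSums(X[, 1:5]) + rnorm(N)

# Row indices defining the training and test sets
ind.train <- sort(sample(N, 400))
ind.test <- setdiff(seq_len(N), ind.train)

# Fit: the whole X is passed, but only the rows in ind.train are used
mod <- big_spLinReg(X, y[ind.train], ind.train = ind.train, K = 4)

# Predict: the same X is passed, but only the rows in ind.test are used
pred <- predict(mod, X, ind.test)

# One prediction per test row; compare against the held-out outcomes
length(pred) == length(ind.test)
cor(pred, y[ind.test])
```

Note that y is subset explicitly (y[ind.train]) because it is an ordinary in-memory vector, whereas X is never copied or subset: the indices are handed to the functions instead.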
A follow-up from the email.
The original question was:
X <- FBM(N, M, init = rnorm(N * M, sd = 5))
mod <- big_spLinReg(X, y[ind.train], ind.train = ind.train, K = 4)
pred <- predict(mod, X, ind.test)
I understand that the original data is split into train and test, but I don't understand why X is required for both model building and model testing. Could you please clarify this a bit more?