l-ramirez-lopez / resemble

resemble is an R package which implements functions dedicated to non-linear modelling of complex spectroscopy data

Spiking issues #18

Closed wartini-ng closed 4 years ago

wartini-ng commented 4 years ago

Hi,

We tried spiking with the updated MBL algorithm and got an error whose cause we could not work out. Could you help or comment?

Thank you.

Snippets of code below:

library(resemble)
library(prospectr)
library(magrittr)

data(NIRsoil)
# training dataset
training  <- NIRsoil[NIRsoil$train == 1, ]
# testing dataset
testing  <- NIRsoil[NIRsoil$train == 0, ]

dth_pc <- seq(0.05, 1, by = 0.1)
k_min_max <- c(21, 150)
my_waplsr <- local_fit_wapls(min_pls_c = 1, max_pls_c = 20)
nnv_val_control <- mbl_control(validation_type = "NNv")

# Create the MBL model
mbl.model <- mbl(Xr = training$spc[!is.na(training$CEC),], Yr = training$CEC[!is.na(training$CEC)],
                 Xu = training$spc[is.na(training$CEC),], 
                 k_diss = dth_pc,
                 k_range = k_min_max,
                 spike = 1:50,
                 method = my_waplsr,
                 diss_method = "pca",
                 diss_usage = "predictors",
                 control = nnv_val_control,
                 scale = TRUE)

Error in { : task 3 failed - "subscript out of bounds"

l-ramirez-lopez commented 4 years ago

The error occurs because you are forcing 50 observations into your neighborhoods (spike = 1:50) while asking the function to retrieve a minimum of only 21 neighbors (k_range = c(21, 150)). The minimum number of neighbors must be larger than the number of spiking observations. I will add a sanity check to warn users about such invalid input arguments. The following code, which raises the minimum to 51, should work:

library(resemble)
library(prospectr)
library(magrittr)

data(NIRsoil)
# training dataset
training  <- NIRsoil[NIRsoil$train == 1, ]
# testing dataset
testing  <- NIRsoil[NIRsoil$train == 0, ]

dth_pc <- seq(0.05, 1, by = 0.1)
k_min_max <- c(51, 150)
my_waplsr <- local_fit_wapls(min_pls_c = 1, max_pls_c = 20)
nnv_val_control <- mbl_control(validation_type = "NNv")

# Create the MBL model
mbl.model <- mbl(Xr = training$spc[!is.na(training$CEC),], Yr = training$CEC[!is.na(training$CEC)],
                 Xu = training$spc[is.na(training$CEC),], 
                 k_diss = dth_pc,
                 k_range = k_min_max,
                 spike = 1:50,
                 method = my_waplsr,
                 diss_method = "pca",
                 diss_usage = "predictors",
                 control = nnv_val_control,
                 scale = TRUE)
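
Until such a sanity check is available in the package, the rule above can be tested before calling mbl(). The snippet below is only a minimal sketch in base R: the helper check_spike_k_range() is a hypothetical name and is not part of resemble, and it assumes that spike contains only indices of observations to be forced into the neighborhoods.

```r
# Hypothetical helper (NOT part of resemble): verifies that the smallest
# neighborhood size is larger than the number of spiked observations.
# Assumption: `spike` holds only indices to be forced into the neighborhoods.
check_spike_k_range <- function(spike, k_range) {
  n_spike <- length(unique(spike))
  k_min <- min(k_range)
  if (k_min <= n_spike) {
    stop(
      "min(k_range) is ", k_min, " but ", n_spike,
      " observations are spiked; min(k_range) must be larger than that.",
      call. = FALSE
    )
  }
  invisible(TRUE)
}

# Passes: 51 neighbors at minimum, 50 spiked observations
ok <- check_spike_k_range(spike = 1:50, k_range = c(51, 150))

# Fails: 21 neighbors at minimum, 50 spiked observations
bad <- tryCatch(
  check_spike_k_range(spike = 1:50, k_range = c(21, 150)),
  error = function(e) conditionMessage(e)
)
```

Calling the helper with the same spike and k_range values that will be passed to mbl() gives a readable message up front instead of the "subscript out of bounds" error raised inside the parallel loop.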