mmahmoudian / sivs

An iterative feature selection method that internally utilizes various Machine Learning methods that have embedded feature reduction in order to shrink down the feature space into a small and yet robust set.
GNU General Public License v3.0

Warning is produced when running in parallel #7

Closed mmahmoudian closed 1 year ago

mmahmoudian commented 1 year ago

The following warning is thrown when the code is run in parallel.

1: In `[<-.data.frame`(`*tmp*`, , tmp.columns.to.zscore, value = list( :
  provided 114 variables to replace 25 variables

The issue is that in some unlucky cases only one column is left while we are binning the columns to send to each CPU core. In those cases, selecting columns from a matrix object does not return a matrix but rather a numeric vector, because `[` drops dimensions by default (drop = TRUE). This means that on line 677 of the following chunk, we are sending a vector to our func.zscore function rather than a matrix.

https://github.com/mmahmoudian/sivs/blob/7e629e2212f96106b6b5fdec2eca07e255a03840/R/sivs.R#L665-L678

When the bin holds a single column, the x[, tmp.bins[[i]]] in line 677 of the chunk above is equivalent to:

base::getElement(object = x, name = tmp.bins[[i]])
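The dimension drop can be reproduced on a toy matrix. This is a minimal sketch: the matrix `x` and the list `tmp.bins` below are hypothetical stand-ins, not the actual objects from sivs.R.

```r
# Toy stand-ins (hypothetical, not the real sivs data)
x <- matrix(1:12, nrow = 3, dimnames = list(NULL, c("a", "b", "c", "d")))
tmp.bins <- list(c("a", "b", "c"), "d")  # the last bin holds a single column

multi  <- x[, tmp.bins[[1]]]  # several columns selected: result is a matrix
single <- x[, tmp.bins[[2]]]  # one column selected: dimensions are dropped

is.matrix(multi)   # the multi-column bin keeps its matrix structure
is.matrix(single)  # the single-column bin does not
dim(single)        # a plain vector has no dim attribute
```

Any function that expects a matrix, as func.zscore does here, would receive a bare vector for the last bin.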

The solution is straightforward: use base::subset(), which keeps the matrix structure even when a single column is selected. I think it will add some overhead to the runtime of the code, but because it is run only once per SIVS run, that is acceptable. The fix will come in PR #8.
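A rough illustration of the idea, again on hypothetical toy data rather than the actual code in the PR. It also shows `drop = FALSE`, which is not mentioned in this issue but is a common alternative that preserves dimensions in the same way:

```r
# Toy stand-ins (hypothetical, not the real sivs data)
x <- matrix(1:12, nrow = 3, dimnames = list(NULL, c("a", "b", "c", "d")))
tmp.bins <- list(c("a", "b", "c"), "d")  # the last bin holds a single column

# base::subset() on a matrix defaults to drop = FALSE, so one column stays a matrix
kept.subset <- base::subset(x, select = tmp.bins[[2]])

# Alternative (an assumption, not the approach named in the issue): explicit drop = FALSE
kept.drop <- x[, tmp.bins[[2]], drop = FALSE]

dim(kept.subset)  # 3 rows, 1 column
dim(kept.drop)    # 3 rows, 1 column
```

Either form hands a one-column matrix to the downstream function, so the single-column bin is treated like every other bin.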