zachmayer / caretEnsemble

caret models all the way down :turtle:

Running svmRadial in parallel in caretStack? #202

Closed sparcycram closed 8 years ago

sparcycram commented 8 years ago

svmRadial appears not to run in parallel in caretStack. In the Windows resource monitor, svmRadial appears to use only 1 CPU, even with allowParallel = TRUE. The other models in caretStack (pam, parRF, dnn) are running on multiple CPUs.

Is this a known issue?

Thanks

zachmayer commented 8 years ago

allowParallel = TRUE is an argument to caret::trainControl, which is then passed to caret::train, so this is really a question for the caret GitHub page.

pam and dnn both use your BLAS, so if you have OpenBLAS, Accelerate, or another optimized BLAS library, those algorithms will run in parallel regardless of whether you set allowParallel = TRUE or allowParallel = FALSE. Similarly, parRF creates (and then destroys) its own parallel cluster.

svmRadial, on the other hand, is inherently serial. There's no way to parallelize it.

allowParallel therefore doesn't affect the base model at all. It just controls whether the caret-based grid search and cross-validation are done serially or in parallel. If you wish to run in parallel, you will need to install the doParallel package and read the directions for how to use it. Once you know what you're doing, you can start a parallel cluster, run caret or caretEnsemble code with allowParallel=TRUE, and do grid search in parallel.
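
For reference, a minimal sketch of that workflow with plain caret might look like the following. The worker count, the iris data set, and the resampling settings are assumptions chosen for illustration and are not from the original report. The same trainControl object should also be usable with caretEnsemble.

```r
library(caret)
library(doParallel)  # also attaches foreach and parallel

# Start a cluster and register it as the foreach backend.
# Two workers is an arbitrary choice for this example; adjust for your machine.
cl <- parallel::makeCluster(2)
registerDoParallel(cl)

# allowParallel = TRUE lets caret use the registered backend for its
# resampling and tuning-grid loop.
ctrl <- trainControl(method = "cv", number = 5, allowParallel = TRUE)

# Each individual svmRadial fit still runs on a single core; only the
# cross-validation folds and tuning candidates are spread across workers.
fit <- train(Species ~ ., data = iris, method = "svmRadial",
             trControl = ctrl, tuneLength = 3)

# Shut the workers down and return foreach to sequential execution.
parallel::stopCluster(cl)
foreach::registerDoSEQ()
```

With this setup, the resource monitor should show several R worker processes busy during tuning, even though each one is fitting a serial svmRadial model.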