PhilippPro / OOBCurve

Random Forest OOB Curves for any performance measure of mlr

Parallelize? #2

Open ck37 opened 6 years ago

ck37 commented 6 years ago

Hello,

Do you know if there's a way to parallelize the OOBCurve analysis? I am analyzing a large dataset and see that it uses only one core. Having reviewed the source code and the relevant mlr methods, I haven't found where I might specify a parallelization option or use future, parallelMap, etc.

Cheers, Chris

PhilippPro commented 6 years ago

Hi Chris,

There is currently no option for parallelization.

How many trees are you using? How many observations/classes do you have? Are you using OOBCurve or OOBCurvePars?

There are several things that could be parallelized:

I could include an option to parallelize this, but I have to think a bit about it.
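
For illustration only, here is a minimal sketch of one way such a computation might be spread over cores using base R's parallel package. It is not OOBCurve's implementation and does not call OOBCurve or mlr. The matrices `pred` and `oob` are simulated stand-ins, and `cum_oob_prob` is a hypothetical helper. The assumption is that the running OOB prediction of each observation is independent of the others, so blocks of observations can be processed on separate workers and stacked afterwards.

```r
library(parallel)

# Simulated stand-ins (assumptions, not OOBCurve internals):
# pred[i, t] = tree t's predicted probability of class 1 for observation i
# oob[i, t]  = TRUE if observation i was out-of-bag for tree t
set.seed(1)
n <- 200
n_trees <- 50
y <- rbinom(n, 1, 0.5)
pred <- matrix(runif(n * n_trees), nrow = n)
oob <- matrix(runif(n * n_trees) < 0.37, nrow = n)

# Hypothetical helper: running OOB probability for a block of rows, i.e. the
# cumulative sum of OOB predictions divided by the cumulative OOB tree count.
cum_oob_prob <- function(rows, pred, oob) {
  o <- oob[rows, , drop = FALSE]
  p <- pred[rows, , drop = FALSE] * o
  num <- t(apply(p, 1, cumsum))
  den <- t(apply(o, 1, cumsum))
  num / den  # NaN until an observation has at least one OOB tree
}

# Split the observations into consecutive blocks, one per worker.
chunks <- splitIndices(n, 2)

cl <- makeCluster(2)
blocks <- parLapply(cl, chunks, cum_oob_prob, pred = pred, oob = oob)
stopCluster(cl)

# Stack the blocks back in the original row order, then compute the
# misclassification rate for each number of trees.
prob <- do.call(rbind, blocks)
oob_mmce <- colMeans((prob > 0.5) != y, na.rm = TRUE)
```

Passing `pred` and `oob` as arguments copies both matrices to every worker, so with something like 125k observations and 3,000 trees the memory cost of this approach would need checking before relying on it.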

ck37 commented 6 years ago

Thanks, Philipp. For this particular project I'm using 3,000 trees and have 125k observations (2 classes), and I'm using OOBCurve. I actually hadn't noticed OOBCurvePars before; I'll have to try that one out.

PhilippPro commented 6 years ago

OOBCurvePars is just for finding ideal values for hyperparameters like mtry, so the purpose is quite different. ;)