Question from @MelvinDunn that I'm documenting here:
Had one question while I was looking at the ol' machina:
-How does this machine determine the starting points for hyperparams?
-How does it determine the validation size? (Couldn't find it)
Sorry, I was interested, and while I know I could easily just look at the
code myself, I thought you would know off the top of your head.
I'm extremely interested in AutoML, and I think this machine is, well,
wonderful.
Thanks again,
Melvin
My response:
I love curiosity! Thanks for continuing to ask questions!
We use RandomizedSearchCV to find the optimal hyperparameters. It samples parameter values randomly from the distributions we give it; those distributions are defined in pySetup/parameterMakers.
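To make the random-sampling idea concrete, here is a minimal pure-Python sketch of what RandomizedSearchCV does with a set of parameter distributions. The parameter names and ranges below are illustrative stand-ins, not the actual distributions from pySetup/parameterMakers.

```python
import random

# Hypothetical parameter distributions, analogous in spirit to what lives in
# pySetup/parameterMakers (these names and ranges are made up for illustration).
param_distributions = {
    "n_estimators": lambda: random.randint(50, 500),
    "max_depth": lambda: random.randint(2, 12),
    "learning_rate": lambda: 10 ** random.uniform(-3, -1),  # log-uniform draw
}

def sample_params(distributions, n_iter, seed=0):
    """Draw n_iter random hyperparameter candidates, the way a randomized
    search samples one candidate per iteration from each distribution."""
    random.seed(seed)
    return [
        {name: draw() for name, draw in distributions.items()}
        for _ in range(n_iter)
    ]

# Each candidate would then be cross-validated, and the best one kept.
candidates = sample_params(param_distributions, n_iter=5)
for candidate in candidates:
    print(candidate)
```

In the real pipeline, scikit-learn evaluates each sampled candidate with cross-validation and keeps the best scorer; this sketch only shows the sampling half.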
Right now the validation size is hard-coded, and it's a fairly large split. I've experimented with different values; it's somewhere around 20-40%, depending on the size of the input data. The exception is data like Numer.ai's, which comes with its own validationSplit column. That column must be flagged in the dataDescription row (where we specify what type of data each column holds), and then we just use that validation split directly.
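The two-path logic above can be sketched in a few lines of Python. The exact size thresholds here are guesses for illustration only; the real hard-coded values live in the project source. What the sketch shows is the shape of the decision: honor a validationSplit column when the data provides one, otherwise fall back to a size-dependent fraction in the 20-40% range.

```python
def validation_fraction(n_rows):
    """Illustrative size-dependent split in the 20-40% range described above.
    These thresholds are assumptions, not the project's actual values."""
    if n_rows < 1_000:
        return 0.40   # small data: hold out more for a stabler estimate
    if n_rows < 100_000:
        return 0.30
    return 0.20       # large data: a smaller fraction is still plenty of rows

def split_rows(rows, validation_split=None):
    """If the dataset carries its own validationSplit column (as Numer.ai
    data does), honor it; otherwise fall back to a size-based holdout."""
    if validation_split is not None:
        train = [r for r, flag in zip(rows, validation_split) if not flag]
        val = [r for r, flag in zip(rows, validation_split) if flag]
        return train, val
    cut = int(len(rows) * (1 - validation_fraction(len(rows))))
    return rows[:cut], rows[cut:]
```

For example, a 100-row dataset without a validationSplit column would hold out the last 40 rows under these made-up thresholds, while a dataset with the column would be split exactly as its flags dictate.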
Keep the questions coming!