Closed hengji-liu closed 6 years ago
Hi,
Grid search with BaselineOnly can be done in exactly the same way as with other algorithms, e.g.:
param_grid = {'bsl_options':[{'method': 'als'}, {'method': 'sgd'}]}
Nicolas
Hi, Sorry for the late reply. Suppose I'm using SGD and I want to do a cross-validation on reg, learning_rate and n_epochs. It looks like I have to enumerate these 3 parameters to form different bsl_options and put these bsl_options into param_grid. To illustrate,
param_grid = {'bsl_options': [
    {'reg': 0.1, 'learning_rate': 0.1, 'n_epochs': 100},
    {'reg': 0.1, 'learning_rate': 0.1, 'n_epochs': 200},
    {'reg': 0.1, 'learning_rate': 0.2, 'n_epochs': 100},
    {'reg': 0.1, 'learning_rate': 0.2, 'n_epochs': 200},
    # the list goes on, just to enumerate the params manually
]}
I feel the ideal way is
param_grid = {'reg': [0.1, 0.2], 'learning_rate': [0.1, 0.2], 'n_epochs': [100, 200]}
But apparently this won't work given the current design. It gets more confusing when the predictor takes other parameters: some of them live inside the options dictionary, while others are plain parameters of the predictor class. The same applies to the KNN methods, where sim_options is used. I just wanted to check with you whether I'm using the library the wrong way, because doing it this way seems troublesome and unintuitive. In any case, my current workaround is to generate the bsl_options first using extra code.
Oh, indeed, I didn't think it through.
So basically with dictionary parameters with multiple keys, we currently have to enumerate all the combinations by hand. Looking at the current implementation of GridSearch, I can't think of an easy or clean way to overcome this. Would you have any suggestion? Also, could you please show me your current workaround?
Sorry for closing the issue and thanks for pointing that out! Nicolas
My current workaround is: (take knn as an example)
names = ('msd', 'cosine', 'pearson')
user_baseds = (True,)
min_supports = (1, 2, 3, 4, 5, 10, 15, 20, 25)
options = list()
# fill options with dictionaries
for name in names:
    for user_based in user_baseds:
        for min_support in min_supports:
            d = dict()
            d['name'] = name
            d['user_based'] = user_based
            d['min_support'] = min_support
            options.append(d)
# make options a value of 'sim_options'
param_grid = {
    'k': [4, 6, 8, 10, 12],
    'min_k': [1, 2, 3],
    'sim_options': options
}
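For readers following along, the nested loops in the workaround above can be condensed with itertools.product from the standard library. This is only a sketch of the same enumeration, using the same values as the workaround; it is not part of the library.

```python
from itertools import product

names = ('msd', 'cosine', 'pearson')
user_baseds = (True,)
min_supports = (1, 2, 3, 4, 5, 10, 15, 20, 25)

# One dictionary per (name, user_based, min_support) combination,
# in the same order the nested loops would produce them.
options = [
    {'name': name, 'user_based': user_based, 'min_support': min_support}
    for name, user_based, min_support in product(names, user_baseds, min_supports)
]

# Same grid as in the workaround: 'sim_options' is a list of complete dicts.
param_grid = {
    'k': [4, 6, 8, 10, 12],
    'min_k': [1, 2, 3],
    'sim_options': options,
}
```

The enumeration still has to be done before the grid is passed to GridSearch; only the amount of hand-written code changes.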
Hey, I just pushed a fix for this. You can now use GridSearch in a more natural way as follows:
param_grid = {'k': [10, 20],
              'sim_options': {'name': ['msd', 'cosine'],
                              'min_support': [1, 5],
                              'user_based': [False]}
              }
I added this to the (latest) docs as a note.
Thanks for raising the issue! Nicolas
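To make it concrete which combinations such a nested grid describes, here is a minimal sketch of one way to expand it. The helper expand_grid is hypothetical and written only for illustration; it is not necessarily how the fix is implemented in the library.

```python
from itertools import product


def expand_grid(param_grid):
    """Expand a grid whose values are lists, or dicts of lists, into a
    list of concrete parameter dictionaries (hypothetical helper)."""
    flat = {}
    for name, values in param_grid.items():
        if isinstance(values, dict):
            # Dictionary parameter: build one inner dict per combination
            # of its own keys.
            keys = list(values)
            flat[name] = [dict(zip(keys, combo))
                          for combo in product(*(values[k] for k in keys))]
        else:
            flat[name] = list(values)
    names = list(flat)
    # Cartesian product over all top-level parameters.
    return [dict(zip(names, combo))
            for combo in product(*(flat[n] for n in names))]


param_grid = {'k': [10, 20],
              'sim_options': {'name': ['msd', 'cosine'],
                              'min_support': [1, 5],
                              'user_based': [False]}}
combinations = expand_grid(param_grid)
```

With the grid above, this yields 2 values of k times 4 sim_options dictionaries, i.e. 8 parameter sets in total.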
Thank you, @NicolasHug
The BaselineOnly class takes bsl_options as a parameter rather than separate reg or learning_rate arguments. How can I perform a GridSearch on a BaselineOnly model?