[x] task_0_generate_features.sl -> output is feature.csv (has all the necessary features)
[x] task_1_find_best_classifier.sl (this will run the gridsearch in parallel) -> output is a pickled best clf
[x] bring the data generated by task_0 onto my local machine to prototype
[x] import ml_tools inside the project
[x] set up the loading and filtering of the dataset properly depending on parameters
[x] set up the gridsearch properly with the clf mentioned in the abstract
[x] capture the best parameters and the best clf and save them
[x] make the script iterate over all 20 classifications {2 for type, 2 for epochs, 5 for parameter groupings}. See how to run 20 different tasks on 20 different nodes, each with 40 cores, on Stack Overflow
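The last item (20 independent tasks, one node each, 40 cores per node) maps naturally onto a SLURM job array. A minimal sketch, assuming the worker script, log directory, and option names are placeholders (only the `.sl` task naming comes from the notes above):

```shell
#!/bin/bash
# Hypothetical SLURM job-array sketch: 20 array tasks, one node per task,
# 40 cores each. Script name and paths are assumptions, not from the project.
#SBATCH --job-name=find_best_clf
#SBATCH --array=0-19
#SBATCH --nodes=1
#SBATCH --ntasks=1
#SBATCH --cpus-per-task=40
#SBATCH --output=logs/clf_%A_%a.out

# SLURM_ARRAY_TASK_ID (0..19) selects which of the 20 classifications
# {2 types x 2 epochs x 5 parameter groupings} this task handles.
srun python find_best_classifier.py --task-id "${SLURM_ARRAY_TASK_ID}"
```

Each array element lands on its own node, so the per-task gridsearch can use all 40 local cores without any cross-node coordination.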
For these, check out: joblib
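A sketch of how joblib covers both needs here, the parallel gridsearch and persisting the best clf. This assumes scikit-learn; the classifier and parameter grid are illustrative, not the ones from the abstract:

```python
# Hypothetical gridsearch + persistence step using scikit-learn and joblib.
from joblib import dump, load
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import GridSearchCV

# Stand-in data; the real run would load and filter feature.csv instead.
X, y = make_classification(n_samples=200, n_features=10, random_state=0)

param_grid = {"n_estimators": [50, 100], "max_depth": [3, None]}

# n_jobs=-1 makes GridSearchCV fan the grid out over every available core
# via joblib, so a 40-core node is saturated with no extra plumbing.
search = GridSearchCV(RandomForestClassifier(random_state=0),
                      param_grid, cv=3, n_jobs=-1)
search.fit(X, y)

print(search.best_params_)                       # capture the best parameters
dump(search.best_estimator_, "best_clf.joblib")  # save the pickled best clf
clf = load("best_clf.joblib")                    # reload it for prototyping
```

`joblib.dump`/`load` is the persistence route scikit-learn itself recommends for fitted estimators, so one library handles both the parallelism and the output artifact.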
Not too bad: I don't really need that many cores, since 1000 iterations don't take very long on 40 cores.