windowshopr opened 4 years ago
Now that I've typed that out, I may have figured it out, haha. Basically, I just reworked the network.py file to look like this:
```python
import os
import sys
sys.path.append(os.path.dirname(os.path.realpath(__file__)))
import random
from train import train_and_score


class Network:
    def __init__(self, nn_param_choices):
        self.nn_param_choices = nn_param_choices
        self.accuracy = 0
        self.current_accuracy = []  # one score per trained fold
        self.network = {}

    def create_random(self):
        """Create a random network."""
        for key in self.nn_param_choices:
            self.network[key] = random.choice(self.nn_param_choices[key])

    def create_set(self, network):
        """Set network properties.

        :param network: dict with network parameters
        """
        self.network = network

    def train(self, x_train, y_train, x_test, y_test):
        """Train on one fold and record that fold's score."""
        self.accuracy = train_and_score(self.network, x_train, y_train, x_test, y_test)
        self.current_accuracy.append(self.accuracy)

    def average_the_training_scores(self):
        """Replace self.accuracy with the mean score across all folds."""
        self.accuracy = sum(self.current_accuracy) / len(self.current_accuracy)
```
So basically, I keep `self.accuracy` as a number, but append each fold's score to a NEW list that `average_the_training_scores()` uses to compute the mean; then I update `self.accuracy` with that average and we're good to go!
Now you have a genetic algorithm that searches a DNN grid space while also performing stratified k-fold cross-validation during training :) Thanks for letting me work that one out haha!
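The per-fold bookkeeping described above can be sketched end-to-end with toy numbers. Everything here is an illustrative stand-in: the three fold scores are made up, and `train()` takes a score directly instead of calling the real `train_and_score`:

```python
# Minimal sketch of the per-fold accuracy bookkeeping described above.
# The fold scores are invented; in the real code each comes from train_and_score.

class Network:
    def __init__(self):
        self.accuracy = 0           # single fitness number the GA reads
        self.current_accuracy = []  # one entry per trained fold

    def train(self, fold_score):
        # Stand-in for train_and_score(self.network, x_train, y_train, ...).
        self.accuracy = fold_score
        self.current_accuracy.append(self.accuracy)

    def average_the_training_scores(self):
        # Collapse the per-fold scores into one mean accuracy.
        self.accuracy = sum(self.current_accuracy) / len(self.current_accuracy)


net = Network()
for score in (0.80, 0.90, 0.85):  # pretend results from 3 CV folds
    net.train(score)
net.average_the_training_scores()
print(round(net.accuracy, 4))     # mean of the three fold scores
```

The key point is that `self.accuracy` stays a plain number throughout, so the rest of the genetic algorithm (sorting and selecting by fitness) works unchanged.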
@windowshopr Cool! Thanks! Sorry I was offline for a few days. Let me find time to review and implement that.
In general, that was a one-weekend project built with training purposes in mind rather than production value. As you can see, there are no tests : ). So I would warn you against relying on that package in your production projects. You might take a look at AutoKeras for similar functionality.
Cheers!
No prob, still love the script. I have worked with AutoKeras and a few other AutoML packages, but some of them didn't offer ALL of the hyperparams I wanted to tune, plus they were unstable when I tried to use them, out of date, etc. So I thought I'd take a stab at doing it all myself. Then I found your script and thought, "this is what I'm after right here!" Haha. Just added the CV to it, and that was basically it 👍 awesome work!
Love the script, and loved the article. This is exactly what I've been looking for as I manually made my own random grid search script using Keras a while ago, but have always wanted an implementation of a genetic algorithm to help steer the grid search in the right direction.
This is more of an upgrade/improvement request, and I will show you what I have so far to help. Really, the only thing I want to add to your script is k-fold cross-validation while training: after all folds have been trained, use the average accuracy score across the folds as the `self.accuracy` number to report when it's done.

The way I envisioned this working would be to change the `_train_networks()` function in `evolution.py` to something like this:

You can see I've added `cv_folds` as another input to the function for the user to define. The function then folds the training dataset appropriately and trains the same network across all folds. I've also added the function `network.average_the_training_scores()`, which sits at the bottom of the `network.py` file. The new bottom of the `network.py` file looks like this:

You'll see that I use `.append()` as an attempt to add the current fold's score to a list. So at the top of `network.py`, I also changed `self.accuracy = 0` to `self.accuracy = []`.

This is what I have so far, but I know I'm not doing it correctly. When it runs now, it'll do one full generation (of 20 runs), but when it goes to start the next run, I get:
So how could I potentially implement this in your code? I have a feeling I'm close, just need some guidance. Thanks for the awesome work!
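Since the actual snippets aren't shown above, here is a rough, hypothetical sketch of what a `train_networks`-style loop with a `cv_folds` parameter could look like. It is dependency-free on purpose: a naive unstratified index splitter stands in for scikit-learn's `StratifiedKFold`, and the `Network` stub fakes its score instead of calling `train_and_score`:

```python
# Hypothetical sketch of a train_networks-style loop with k-fold CV.
# The splitter and scorer below are illustrative stand-ins, not the real code.

def kfold_indices(n_samples, n_folds):
    """Yield (train_idx, test_idx) pairs for simple (unstratified) k-fold CV."""
    fold_size = n_samples // n_folds
    indices = list(range(n_samples))
    for f in range(n_folds):
        test = indices[f * fold_size:(f + 1) * fold_size]
        train = indices[:f * fold_size] + indices[(f + 1) * fold_size:]
        yield train, test


class Network:
    def __init__(self):
        self.accuracy = 0
        self.current_accuracy = []  # one score per fold

    def train(self, x_train, y_train, x_test, y_test):
        # The real code calls train_and_score(...); this fakes a fold score.
        self.accuracy = len(x_test) / (len(x_train) + len(x_test))
        self.current_accuracy.append(self.accuracy)

    def average_the_training_scores(self):
        self.accuracy = sum(self.current_accuracy) / len(self.current_accuracy)


def train_networks(networks, x, y, cv_folds):
    """Train each network on every fold, then average its fold scores."""
    for network in networks:
        for train_idx, test_idx in kfold_indices(len(x), cv_folds):
            x_tr = [x[i] for i in train_idx]
            y_tr = [y[i] for i in train_idx]
            x_te = [x[i] for i in test_idx]
            y_te = [y[i] for i in test_idx]
            network.train(x_tr, y_tr, x_te, y_te)
        network.average_the_training_scores()


nets = [Network()]
x = list(range(10))
y = [0, 1] * 5
train_networks(nets, x, y, cv_folds=5)
print(nets[0].accuracy)  # the averaged fold score
```

Note that `self.accuracy` must remain a number here: swapping it for a list (as tried above) breaks the genetic algorithm's fitness sorting on the next generation, which is why appending to a separate `current_accuracy` list works.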