joeddav / devol

Genetic neural architecture search with Keras
MIT License

possible contributions & question about population scoring method #13

Closed leaprovenzano closed 7 years ago

leaprovenzano commented 7 years ago

Hi! This library is a great idea and has a lot of potential for model selection. Thank you!

I have been adding a couple of enhancements to my fork. Essentially I just wanted to use validation loss as a metric and ended up adding lots of other stuff as well. My revisions are still messy and obviously contain some breaking changes, but I'm happy to test and submit individual pull requests per change if any of the changes below are helpful:

  1. GenomeHandler is passed a metric (default "accuracy") which sets a maximizing or minimizing objective for DEvol, so we can optimize validation loss instead of accuracy if we choose. (In fact this could be passed directly to DEvol on init if it were not referenced in the GenomeHandler in best_genome, but that would probably not make sense.)

  2. I added a flatten_layers param to GenomeHandler so that we include global max and average pooling in the search. The parameter name flatten_layers is horrible, though; I am trying to come up with something better.

  3. Added an arg for DEvol to fit verbosely. It was something I wanted, and I noticed it was in the projects section. It just calls fit with verbose=1 and doesn't look so hot alongside the tqdm progress bar, so maybe not what you're looking for.

  4. I added nadam, vanilla sgd, and sgd with momentum, nesterov, & decay to the optimizers. I'm not sure this is really a necessity, since users could add these themselves, but it's nice to have sgd, especially to compare against newer (possibly less stable) optimizers.

  5. Added ELU to the list of activations. As above, it could be argued that it doesn't need to be a default (& it gives mixed results with batchnorm, so...).
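For item 1, here is a minimal pure-Python sketch of how a metric name could select the optimization direction. The names metric, METRIC_OBJECTIVES, and best_genome are hypothetical and only illustrate the idea, not DEvol's actual API:

```python
# Hypothetical sketch: pick the optimization direction from a metric name.
# All names here are illustrative, not DEvol's real interface.
METRIC_OBJECTIVES = {"accuracy": max, "loss": min}

def best_genome(results, metric="accuracy"):
    """Return the (genome, score) pair that is best under the given metric.

    `results` is a list of (genome, score) pairs; "accuracy" maximizes the
    score while "loss" minimizes it.
    """
    objective = METRIC_OBJECTIVES[metric]
    return objective(results, key=lambda pair: pair[1])
```

With (genome, validation_loss) pairs and metric="loss", the pair with the smallest loss wins instead of the largest.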

I also just wanted to ask about the score method of Population class:


    def score(self, fitness):
        return (fitness * 100)**4

I was just a little confused as to why the values need to be so large when we are already min/max scaling the fitnesses. I have left this out (using only the scaling) and am not seeing a difference. I thought I must be missing something, so I wanted to ask.
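To make the question concrete, here is a small standalone comparison (my own illustration, not DEvol's actual code path) of selection weights computed from min/max-scaled fitnesses with and without the quartic score:

```python
# Standalone illustration: how the (fitness * 100)**4 transform reshapes
# selection probabilities compared to min/max scaling alone.
def minmax_scale(fitnesses):
    lo, hi = min(fitnesses), max(fitnesses)
    return [(f - lo) / (hi - lo) for f in fitnesses]

def score(fitness):
    # The transform under discussion from Population.score
    return (fitness * 100) ** 4

def probabilities(weights):
    total = sum(weights)
    return [w / total for w in weights]

scaled = minmax_scale([0.990, 0.992, 0.994])          # -> [0.0, 0.5, 1.0]
plain = probabilities(scaled)                          # ~[0.00, 0.33, 0.67]
quartic = probabilities([score(s) for s in scaled])    # ~[0.00, 0.06, 0.94]
```

So the quartic does change the selection pressure (it concentrates probability on the very best genomes), even after scaling; whether that matters in practice depends on the fitness spread of the population.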

joeddav commented 7 years ago

Thanks for your feedback. Adding the option to optimize loss vs accuracy is something I've been wanting to do.

  1. As mentioned above, I've been wanting to do this, and a PR would be welcome. I think it'd actually make the most sense to pass this as a param to the run function, as it has more to do with the genetic evolution than with the genomic representation, but I'd be curious to see your code and how you did it from the genome handler. It'd also make sense to put this in the same PR as a revision of the scoring function.
  2. Great. Just follow the model of including it as a default param that the user can choose to include or exclude, and I think that's a great addition.
  3. Yeah, this is what I struggled with and that's why I didn't include that flag. I would like some way to show training of individual models in a clean way that also allows for an overall progress bar (tqdm not essential), but just throwing in a verbose=1 is very messy. Let's keep that out until we come up with a better way to do it.
  4. That's fine. Optimizers aren't expensive territory to explore, so that'd be fine to add if you'd like. As you mentioned, though, the user can also add these if they'd like. What might be more useful is a param for learning rates on these optimizers.
  5. Same as above. The more useful thing eventually would be adding advanced activations like LeakyReLU, etc., but that's a touch more work as it requires additional parameters. Feel free to PR yours as is.
  6. This is the ugliest part of our code, and it's only in place because it worked best on our models when testing with MNIST, which got ultra-high accuracies. It allowed our model to treat 99.4% as significantly different from 99.2%, but it should not have been left in there like that. That's why I added the option for a custom scoring function, but I would definitely change the default to something more reasonable. PR please!
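On the learning-rate idea in item 4, one possible shape for it is a small registry of optimizer builders parameterized by rate. This is purely illustrative (the helper name and defaults are mine, not DEvol's API), and it uses the modern tf.keras keyword learning_rate rather than the lr of 2017-era Keras:

```python
# Illustrative only: optimizer candidates with a configurable learning rate,
# the kind of param suggested above. Not part of DEvol.
from tensorflow.keras.optimizers import SGD, Adam, Nadam

def make_optimizer(name, learning_rate=0.01):
    builders = {
        "sgd": lambda: SGD(learning_rate=learning_rate,
                           momentum=0.9, nesterov=True),
        "adam": lambda: Adam(learning_rate=learning_rate),
        "nadam": lambda: Nadam(learning_rate=learning_rate),
    }
    return builders[name]()
```

A genome could then encode both the optimizer name and an index into a small grid of candidate rates.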
leaprovenzano commented 7 years ago

OK, so I have created a pull request for parts 1 and 6, so only changes to DEvol and Population.
Let me know if there are any problems or you want anything changed.

Also ...

I have changed the GenomeHandler on my fork to use the functional API, plus a couple of build functions I commonly use for blocks and layers. This allows me to use PReLU and LeakyReLU with their default params (for now) and makes decode shorter and (for me) much easier to read. Plus it would be quite easy to extend to using blocks of x convs, factorized convolutions, etc.
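A rough sketch of the kind of block builder the functional API enables. This is not the fork's actual code; the shapes, filter counts, and tf.keras imports are my own assumptions:

```python
# Sketch: a conv "block" builder using the Keras functional API, which lets
# advanced activations like LeakyReLU/PReLU slot in as standalone layers.
from tensorflow.keras.layers import (Input, Conv2D, BatchNormalization,
                                     LeakyReLU)
from tensorflow.keras.models import Model

def conv_block(x, filters, use_batchnorm=True):
    """One block: conv -> (optional) batchnorm -> advanced activation."""
    x = Conv2D(filters, (3, 3), padding="same")(x)
    if use_batchnorm:
        x = BatchNormalization()(x)
    # The activation is its own layer, so swapping in PReLU etc. is trivial.
    return LeakyReLU()(x)

inputs = Input(shape=(28, 28, 1))
outputs = conv_block(conv_block(inputs, 32), 64)
model = Model(inputs, outputs)
```

Because each block is just a function from tensor to tensor, extending it to multi-conv or factorized-conv blocks is a matter of composing more layer calls inside the function.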

Shall I include this in another request?

joeddav commented 7 years ago

As I mentioned in PR #13, I think switching to the functional API is a good idea. That'd make another good PR.

leaprovenzano commented 7 years ago

Cool, I will close this now and take it up in future PRs...

wenbostar commented 6 years ago

@leaprovenzano, @joeddav, it looks like the functional API is really useful. May I ask when the functional API will be available?

joeddav commented 6 years ago

@wenbostar what exactly is it you're trying to do? @leaprovenzano was referring to changing the underlying code to utilize the functional API, but that wouldn't affect you from a user perspective because Keras' API is all abstracted away anyway.

wenbostar commented 6 years ago

@joeddav, I want to try some new activation functions, such as LeakyReLU.

joeddav commented 6 years ago

Yeah, unfortunately with the way Keras' "advanced activations" work there's not really an easy way to encode them into the genome with the current setup.
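For illustration, here is the asymmetry being described, as a hedged tf.keras sketch (layer sizes are arbitrary): simple activations fit in a per-layer string, while advanced activations are standalone, parameterized layers:

```python
from tensorflow.keras.layers import Input, Dense, LeakyReLU
from tensorflow.keras.models import Model

inp = Input(shape=(10,))
# A simple activation is just a string attribute of the layer, which is
# easy to store in a flat genome encoding:
h = Dense(64, activation="relu")(inp)
# An "advanced activation" like LeakyReLU is a separate layer with its own
# parameters, so a one-string-per-layer genome can't represent it directly:
h = Dense(64)(h)
out = LeakyReLU()(h)
model = Model(inp, out)
```

Supporting these would mean the genome encoding a layer (and its parameters) rather than just an activation name, which is the extra work mentioned earlier in the thread.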