HunterMcGushion / hyperparameter_hunter

Easy hyperparameter optimization and automatic result saving across machine learning algorithms and libraries
MIT License

Support for various advanced functionality? #176

Closed ben-arnao closed 5 years ago

ben-arnao commented 5 years ago

A few questions regarding functionality/support.

Thanks, and sorry if some of these have already been answered!

HunterMcGushion commented 5 years ago

Thanks for your questions! I'm going to answer them in separate comments as I can. Sorry for the delay!

Regarding your last question (on optimizer lr), you can define lr inside one of Keras' optimizer classes for an Experiment.

So instead of the below line in examples/keras_examples/experiment_example.py: https://github.com/HunterMcGushion/hyperparameter_hunter/blob/ff81ca33e14735e2a4d8cd41738b5d6e6d430265/examples/keras_examples/experiment_example.py#L20

... we can import a Keras optimizer, and modify the above model.compile call like so:

from keras.optimizers import Adam

...  # Everything up to line 20, linked above

model.compile(
   optimizer=Adam(lr=0.01),
   loss="binary_crossentropy",
   metrics=["accuracy"],
)
return model

...  # Everything after `build_fn` definition

This definitely needs to be documented or added to a Keras example script, so thank you for bringing it up.

Also, it isn't yet possible to optimize parameters inside Keras optimizers defined like this, so `Adam(lr=Real(0.0001, 0.1))` won't work. A separate issue should be made to track progress on this, but I have some ideas if you (or anyone else) are interested in taking a shot at it with a PR.

... More answers to come...

HunterMcGushion commented 5 years ago

Regarding your first question (advanced activations), yes, they can be used. Using examples/keras_examples/experiment_example.py again, we can do the following:

from keras.layers.advanced_activations import LeakyReLU

Then, instead of https://github.com/HunterMcGushion/hyperparameter_hunter/blob/ff81ca33e14735e2a4d8cd41738b5d6e6d430265/examples/keras_examples/experiment_example.py#L13

use two separate lines, like so:

Dense(100, kernel_initializer="uniform", input_shape=input_shape),
LeakyReLU(),

One thing to be aware of here (noted in the first Keras question in the README's FAQs) is that if you start using separate activation layers like this, you'll want to be consistent even when using normal activations. Details can be found in the above-linked FAQ, but for optimization to correctly match with older Experiments, you would need to build all your models using separate activation layers, rather than the activation kwarg of Dense (or any other layer).

This is inconvenient, I know (sorry), and we should open another issue to get this fixed up.
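The consistency caveat above can be illustrated with a minimal sketch. The `Dense` and `Activation` classes below are stand-ins for the real `keras.layers` classes, used only to show how the two styles record differently-shaped architectures:

```python
# Stand-ins for keras.layers.Dense / Activation -- illustration only
class Dense:
    def __init__(self, units, activation=None):
        self.units, self.activation = units, activation

class Activation:
    def __init__(self, name):
        self.name = name

# Style 1: activation fused into Dense -- recorded as one layer
fused = [Dense(100, activation="relu")]

# Style 2: separate activation layer -- recorded as two layers
separate = [Dense(100), Activation("relu")]

# The recorded architectures differ, so Experiments built one way
# will not be matched against Experiments built the other way
assert len(fused) == 1 and len(separate) == 2
```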

HunterMcGushion commented 5 years ago

Could you expand on questions 2 and 4, and provide a reproducible example for each, please?

Regarding question 3, are you trying to use an OptPro to determine which columns of all your input columns you should apply StandardScaler to, for example?

ben-arnao commented 5 years ago

Hi, thanks for the response.

In regards to your comments on learning rate and custom activation functions: now that I think about it, a way I've gotten past similar limitations in my own custom hyperparameter optimizer is to use a wrapper function, where the appropriate layer is returned by string key. For example:

def actv_custom_wrapper_func(activation):
    if activation == 'lrelu':
        return LeakyReLU()
    if activation == 'relu':
        return Activation('relu')

model.add(actv_custom_wrapper_func(Categorical(['relu', 'lrelu'])))

And I think this should solve my issue. I could use a similar method to get a variable optimizer and learning rate; my wrapper function would just need to take two arguments:

model.compile(optimizer=opt_custom_wrapper_func(Categorical(['adam', 'nadam', 'adamax']), Real(0.0001, 0.1)))

I can test this out and see if it works, but before I do, is there a better way to do this?
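A sketch of what that two-argument wrapper might look like. The optimizer classes here are stand-ins for `keras.optimizers.Adam`/`Nadam`/`Adamax` so the pattern is self-contained, and `opt_custom_wrapper_func` is the hypothetical helper named above, not an existing API:

```python
# Stand-ins for keras.optimizers.Adam / Nadam / Adamax -- illustration only
class Adam:
    def __init__(self, lr=0.001):
        self.lr = lr

class Nadam:
    def __init__(self, lr=0.002):
        self.lr = lr

class Adamax:
    def __init__(self, lr=0.002):
        self.lr = lr

def opt_custom_wrapper_func(optimizer_name, lr):
    # Dispatch from string key to optimizer class, mirroring the
    # activation wrapper sketched earlier
    optimizers = {"adam": Adam, "nadam": Nadam, "adamax": Adamax}
    return optimizers[optimizer_name](lr=lr)

opt = opt_custom_wrapper_func("adam", 0.01)
assert isinstance(opt, Adam) and opt.lr == 0.01
```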

As for the layers question, if I do something like the following:

for x in range(Integer(1, 5)):
    # add a layer, e.g.
    model.add(Dense(1))

I get an error, which I assume is caused by an inconsistent number of parameters across runs, which is understandable. Is there a way to make the number of layers a variable, or is that incompatible with this type of parameter optimization?
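For what it's worth, one immediate failure mode (sketched with a stand-in `Integer` class, assuming the real dimension class is not interpretable as an int) is that `range()` itself rejects the dimension object before any optimization logic runs:

```python
class Integer:
    """Stand-in for a search-space dimension: it describes a range of
    values but is not itself an int (illustration only)."""
    def __init__(self, low, high):
        self.low, self.high = low, high

failed = False
try:
    for x in range(Integer(1, 5)):  # range() needs a real int (or __index__)
        pass
except TypeError:
    failed = True

assert failed
```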

For the error about params on callbacks, I think any time you use a float for a callback param you get an error. For example:

model_extra_params=dict(
    callbacks=[ReduceLROnPlateau(monitor='loss', patience=Integer(5, 25), min_delta=Real(0.01, 0.0001), verbose=0)]
)

I believe that inside the ReduceLROnPlateau callback there are comparison operators for min_delta, like greater-than, that cause an error when it tries to compare a Real object to a float.

Lastly, to your follow-up on question 3: yes, I am trying to find out if there is a good way to select a random number of columns to scale. For example, a selection entity like Selection(0, 1000, 30) would select 30 random columns in the range 0 to 1000. I'm not sure if this is feasible given how your program works, but I think it would be an important feature to have.
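A rough sketch of what that hypothetical Selection entity could mean. The `selection` helper below is illustrative only, not an existing hyperparameter_hunter API:

```python
import random

def selection(low, high, k, seed=None):
    """Pick `k` distinct column indices from [low, high), mimicking the
    hypothetical Selection(0, 1000, 30) described above."""
    rng = random.Random(seed)
    return sorted(rng.sample(range(low, high), k))

cols = selection(0, 1000, 30, seed=0)
assert len(cols) == 30
assert all(0 <= c < 1000 for c in cols)
```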

Thanks again.

HunterMcGushion commented 5 years ago

I’m sorry, but I’m having some trouble tracking which of your five questions we’re talking about haha. I know it’s inconvenient, but it’d be very helpful if you could split these up into separate issues for all the questions that haven’t been answered yet. Doing this will also make it easier to keep track of any bug fixes or new features we make that relate to your questions. Does this sound ok to you?

I want to be careful here because all of your questions are great, and I want to make sure they're all addressed clearly. Then we can migrate parts of our conversation here to the appropriate new issues.

Am I correct in saying that I’ve at least answered your first question?

> Is there support for activation functions not called by name? (lrelu for example?)

Or have I just embarrassed myself by not answering anything at all? Hahaha

HunterMcGushion commented 5 years ago

I may also be able to answer your fourth point:

For the error about params on callbacks, i think any time you use a float for a callback param you get an error. For example:

   model_extra_params=dict(
      callbacks=[
         ReduceLROnPlateau(
            monitor='loss', 
            patience=Integer(5, 25), 
            min_delta=Real(0.01, 0.0001),
            verbose=0
         )
      ]
   )

I believe that inside the ReduceLROnPlateau callback function there is comparison operators for min_delta like greater than that causes an error when it tries to compare a Real object to a float.

I received the following error using the above ReduceLROnPlateau configuration:

ValueError: Lower bound (0.01) must be less than the upper bound (0.0001)

This is because Real expects the lower bound to be the first argument, followed by the upper bound. So just switching your min_delta value from Real(0.01, 0.0001) to Real(0.0001, 0.01) did the trick for me.
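The check can be illustrated with a stand-in; hyperparameter_hunter's Real performs an analogous bound validation, and the class below is just a sketch of that behavior:

```python
class Real:
    """Stand-in sketch of a real-valued search dimension that validates
    its bounds the way the error message above describes."""
    def __init__(self, low, high):
        if low >= high:
            raise ValueError(
                "Lower bound ({}) must be less than the upper bound ({})".format(low, high)
            )
        self.low, self.high = low, high

Real(0.0001, 0.01)  # ok: lower bound comes first

try:
    Real(0.01, 0.0001)  # reversed bounds reproduce the ValueError
except ValueError:
    pass
```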

Would you mind seeing if that solves your fourth issue, as well?

ben-arnao commented 5 years ago

> I’m having some trouble tracking which of your five questions we’re talking about ... it’d be very helpful if you could split these up into separate issues for all the questions that haven’t been answered yet.

Sure, no problem! I can definitely split these up and try to clarify a little bit more.

HunterMcGushion commented 5 years ago

Sorry to comment on a closed issue. Thanks for splitting this into #181 and #182! Just wanted to clarify that using Real in ReduceLROnPlateau is working for you. I think there may have been a copy/paste mishap in the response as I’m seeing the same thing twice:

> Would you mind seeing if that solves your fourth issue, as well?
>
> Sure no problem i can definitely split these up. And try to clarify a little bit more.

I also wanted to check on your third question:

> Is there a good way to select a random number of columns to feature engineer on? Ie. let’s say the optimal way to scale my data would be to only scale columns 1 and 10.

Were you able to get this working, or is this still an issue?

ben-arnao commented 5 years ago

Thanks for following up! Yes, the callback issue was resolved. I must have been doing something wrong before; sorry for the false alarm.

As for the question of scaling optimization on a per-feature basis, I think this is a much bigger, more fundamental question as to how it could be done and whether it is worth it. For now it is probably not something worth getting into. Feature scaling is fairly intuitive, so I wouldn't say it's really necessary to include at the moment; one could pick the right scalings themselves for most problems.

Maybe something to think about in the future.