maxpumperla / hyperas

Keras + Hyperopt: A very simple wrapper for convenient hyperparameter optimization
http://maxpumperla.com/hyperas/
MIT License

Error when choosing between sets of advanced activation layers #32

Open ismaeIfm opened 8 years ago

ismaeIfm commented 8 years ago

Hi, I am optimizing the number of neurons and the activation function of each layer of a two-hidden-layer network on the MNIST dataset, but when I try to choose from a set of advanced activations from Keras, I get the following error:

ValueError: Incompatible shapes for broadcasting: (?, 128) and (8,)

Here is my model's definition:

from keras.models import Sequential
from keras.layers import Dense, Activation
from keras.layers import advanced_activations

model = Sequential()

# tune the width of each hidden layer and its advanced activation
model.add(Dense({{choice([8, 128])}}, input_dim=784))
model.add({{choice([advanced_activations.ThresholdedReLU(), advanced_activations.SReLU()])}})

model.add(Dense({{choice([8, 128])}}))
model.add({{choice([advanced_activations.ThresholdedReLU(), advanced_activations.SReLU()])}})

model.add(Dense(10))
model.add(Activation('softmax'))

I think it is an issue with how hyperas manages the space of parameters to tune.

maxpumperla commented 8 years ago

Hi @ismaeIfm, that is a very interesting finding, thanks for that. In fact, I translated the example into pure keras + hyperopt and the problem persists. As hyperas just bridges the two, it's not directly a hyperas issue, but I still want to understand it better. It seems shape inference fails for some reason, but right now I'm not sure why.
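For reference, a rough sketch of what such a pure keras + hyperopt translation could look like (my reconstruction, not Max's actual script; data loading and fitting are stubbed out). The instantiated layers sit directly inside the search space, so every evaluation that samples a branch receives the very same layer object, and the run eventually crashes with the same broadcasting error once a reused layer meets a new input size:

from hyperopt import hp, fmin, tpe, STATUS_OK
from keras.models import Sequential
from keras.layers import Dense, Activation
from keras.layers.advanced_activations import ThresholdedReLU, SReLU

# the layer objects are created once, when the space is defined
space = {
    'units1': hp.choice('units1', [8, 128]),
    'act1': hp.choice('act1', [ThresholdedReLU(), SReLU()]),
    'units2': hp.choice('units2', [8, 128]),
    'act2': hp.choice('act2', [ThresholdedReLU(), SReLU()]),
}

def objective(params):
    model = Sequential()
    model.add(Dense(params['units1'], input_dim=784))
    model.add(params['act1'])  # possibly a layer already built for another shape
    model.add(Dense(params['units2']))
    model.add(params['act2'])
    model.add(Dense(10))
    model.add(Activation('softmax'))
    model.compile(loss='categorical_crossentropy', optimizer='adam')
    # fitting on MNIST omitted; the broadcast error already surfaces
    # while the graph is built, not during training
    return {'loss': 0.0, 'status': STATUS_OK}

fmin(objective, space, algo=tpe.suggest, max_evals=10)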

ismaeIfm commented 8 years ago

@maxpumperla maybe I'm wrong, but I noticed that layers are recycled, i.e., if one hyperas evaluation has initialized advanced layer i as SReLU as a result of choice(), and in a later evaluation layer i gets SReLU again, the memory address of layer i (the SReLU instance) is the same in both evaluations.

So if I'm right, I'm guessing the error comes from recycling the layer, because the layer's input size differs between evaluations.
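This recycling can be confirmed outside of Keras entirely. Below is a minimal sketch (my own illustration, with plain objects standing in for layers) showing that the options handed to hyperopt's choice are instantiated once, when the space is defined, and then returned by reference on every evaluation:

from hyperopt import hp, fmin, tpe, Trials

# the two objects are created here, exactly once, when the space is built
space = hp.choice('act', [object(), object()])

seen = []

def objective(obj):
    seen.append(id(obj))  # record the identity of the sampled option
    return 0.0

fmin(objective, space, algo=tpe.suggest, max_evals=10, trials=Trials())
print(len(set(seen)))  # at most 2: the same instances are reused every time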

ghost commented 7 years ago

@maxpumperla @ismaeIfm I ran into this issue as well... I will dig into it more to find the cause...

maxpumperla commented 7 years ago

@ismaeIfm that makes perfect sense, but I currently have no idea how to circumvent it. Might have to dig deeper into hyperopt for that.

ghost commented 7 years ago

Could you please make hyperas support advanced layers? They are very helpful for certain kinds of problems...

chleibig commented 7 years ago

Hi, and thanks for this handy wrapper. You can actually get around the layer recycling (for me this solves the issue) by taking control of the layer instantiation yourself. Instead of choosing between already-instantiated layers, define a function that switches between the different layers and let "choice" act on names only. For the example above this would mean:

from keras.layers import advanced_activations

def activation(name):
    # a fresh layer instance on every call, so nothing is shared across evaluations
    if name == 'ThresholdedReLU':
        return advanced_activations.ThresholdedReLU()
    if name == 'SReLU':
        return advanced_activations.SReLU()
    raise ValueError('Unknown activation: {}'.format(name))

The two lines of the form

model.add({{choice([advanced_activations.ThresholdedReLU(), advanced_activations.SReLU()])}})

would then each be replaced by:

model.add(activation({{choice(['ThresholdedReLU', 'SReLU'])}}))
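Putting the pieces together, the model definition from the top of the thread would look roughly like the sketch below (my own assembly, under a couple of assumptions: x_train etc. are the usual placeholders supplied by a hyperas data() function, and the helper is defined inside the model function so that hyperas's source extraction picks it up):

def model(x_train, y_train, x_test, y_test):
    from hyperopt import STATUS_OK
    from keras.models import Sequential
    from keras.layers import Dense, Activation
    from keras.layers import advanced_activations

    def activation(name):
        # fresh instance per call -- nothing is shared between evaluations
        if name == 'ThresholdedReLU':
            return advanced_activations.ThresholdedReLU()
        return advanced_activations.SReLU()

    model = Sequential()
    model.add(Dense({{choice([8, 128])}}, input_dim=784))
    model.add(activation({{choice(['ThresholdedReLU', 'SReLU'])}}))
    model.add(Dense({{choice([8, 128])}}))
    model.add(activation({{choice(['ThresholdedReLU', 'SReLU'])}}))
    model.add(Dense(10))
    model.add(Activation('softmax'))

    model.compile(loss='categorical_crossentropy', optimizer='adam',
                  metrics=['accuracy'])
    model.fit(x_train, y_train, batch_size=128, nb_epoch=1, verbose=0)
    score = model.evaluate(x_test, y_test, verbose=0)
    return {'loss': -score[1], 'status': STATUS_OK, 'model': model}

Each trial now builds its own fresh activation layers, so no state leaks between evaluations.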

maxpumperla commented 7 years ago

@chleibig cool, that's immensely helpful. Thanks! A lot of people seem to have problems with that.

@ErmiaAzarkhalili does that work for you, dude?

ghost commented 7 years ago

Yes, thanks Max, it was awesome...