autonomio / talos

Hyperparameter Experiments with TensorFlow and Keras
https://autonom.io
MIT License

[FEATURE REQUEST] add support for custom layers in `best_model()` #421

Closed bjtho08 closed 2 years ago

bjtho08 commented 4 years ago

Overview

I built a model in Keras using the functional API. I also use the keras_contrib and keras_radam libraries to add activations (Swish) and optimizers (RAdam) not yet implemented in Keras. Talos initializes and trains all iterations of the model without issue, but recalling the best model or deploying the model fails with an error from keras.utils.generic_utils.deserialize_keras_object(). The error in question is ValueError: Unknown layer: Swish.

Prerequisites

>>> talos.__version__
'0.6.0'

Expected behavior

scan_object.best_model(metric="acc") should result in a new instance of the best performing model.

Actual behavior

---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
<ipython-input-29-2e96fabca5ec> in <module>
      1 #ta.Deploy(t,"U-nets", metric="acc")
----> 2 scan_object.best_model(metric="acc")

~/.pyenv/versions/miniconda3-4.3.30/envs/tf_gpu/lib/python3.6/site-packages/talos/scan/scan_addon.py in func_best_model(scan_object, metric, asc)
     12     from ..utils.best_model import best_model, activate_model
     13     model_no = best_model(scan_object, metric, asc)
---> 14     out = activate_model(scan_object, model_no)
     15 
     16     return out

~/.pyenv/versions/miniconda3-4.3.30/envs/tf_gpu/lib/python3.6/site-packages/talos/utils/best_model.py in activate_model(self, model_id)
     18     '''Loads the model from the json that is stored in the Scan object'''
     19 
---> 20     model = model_from_json(self.saved_models[model_id])
     21     model.set_weights(self.saved_weights[model_id])
     22 

~/.pyenv/versions/miniconda3-4.3.30/envs/tf_gpu/lib/python3.6/site-packages/keras/engine/saving.py in model_from_json(json_string, custom_objects)
    490     config = json.loads(json_string)
    491     from ..layers import deserialize
--> 492     return deserialize(config, custom_objects=custom_objects)
    493 
    494 

~/.pyenv/versions/miniconda3-4.3.30/envs/tf_gpu/lib/python3.6/site-packages/keras/layers/__init__.py in deserialize(config, custom_objects)
     53                                     module_objects=globs,
     54                                     custom_objects=custom_objects,
---> 55                                     printable_module_name='layer')

~/.pyenv/versions/miniconda3-4.3.30/envs/tf_gpu/lib/python3.6/site-packages/keras/utils/generic_utils.py in deserialize_keras_object(identifier, module_objects, custom_objects, printable_module_name)
    143                     config['config'],
    144                     custom_objects=dict(list(_GLOBAL_CUSTOM_OBJECTS.items()) +
--> 145                                         list(custom_objects.items())))
    146             with CustomObjectScope(custom_objects):
    147                 return cls.from_config(config['config'])

~/.pyenv/versions/miniconda3-4.3.30/envs/tf_gpu/lib/python3.6/site-packages/keras/engine/network.py in from_config(cls, config, custom_objects)
   1020         # First, we create all layers and enqueue nodes to be processed
   1021         for layer_data in config['layers']:
-> 1022             process_layer(layer_data)
   1023         # Then we process nodes in order of layer depth.
   1024         # Nodes that cannot yet be processed (if the inbound node

~/.pyenv/versions/miniconda3-4.3.30/envs/tf_gpu/lib/python3.6/site-packages/keras/engine/network.py in process_layer(layer_data)
   1006 
   1007             layer = deserialize_layer(layer_data,
-> 1008                                       custom_objects=custom_objects)
   1009             created_layers[layer_name] = layer
   1010 

~/.pyenv/versions/miniconda3-4.3.30/envs/tf_gpu/lib/python3.6/site-packages/keras/layers/__init__.py in deserialize(config, custom_objects)
     53                                     module_objects=globs,
     54                                     custom_objects=custom_objects,
---> 55                                     printable_module_name='layer')

~/.pyenv/versions/miniconda3-4.3.30/envs/tf_gpu/lib/python3.6/site-packages/keras/utils/generic_utils.py in deserialize_keras_object(identifier, module_objects, custom_objects, printable_module_name)
    136             if cls is None:
    137                 raise ValueError('Unknown ' + printable_module_name +
--> 138                                  ': ' + class_name)
    139         if hasattr(cls, 'from_config'):
    140             custom_objects = custom_objects or {}

ValueError: Unknown layer: Swish

Model details

MWE

```python
import keras
from keras.models import Model
from keras.layers import Input, Conv2D, BatchNormalization
from keras.layers.advanced_activations import ReLU
import talos as ta


def u_net(shape, nb_filters=64, conv_size=3, init="glorot_uniform", activation=ReLU, output_channels=5):
    i = Input(shape, name="input_layer")

    n = Conv2D(nb_filters, conv_size, padding="same", kernel_initializer=init, name="block1_conv1")(i)
    n = activation(name="block1_{}1".format(activation.__name__))(n)
    n = BatchNormalization(name="block1_bn1")(n)
    n = Conv2D(nb_filters, conv_size, padding="same", kernel_initializer=init, name="block1_conv2")(n)
    n = activation(name="block1_{}2".format(activation.__name__))(n)
    n = BatchNormalization(name="block1_bn2")(n)

    o = Conv2D(output_channels, 1, activation="softmax", name="conv_out")(n)
    return Model(inputs=i, outputs=o)


def talos_model():
    model = u_net(SHAPE, nb_filters=p["nb_filters"], activation=p["act"])
    model.compile(optimizer=p["opt"](lr=1e-4))
    history = model.fit(x=X, y=Y)
    return model, history


scan_object = ta.Scan(x=X, y=Y, model=talos_model, params=p)
```

parameter dictionary

```python
# fit params
from keras.optimizers import Adam
from keras.layers.advanced_activations import ReLU
from keras_radam import RAdam
from keras_contrib.layers.advanced_activations.swish import Swish

p = {
    "nb_filters": [12, 16, 32],
    "act": [Swish, ReLU],
    "opt": [RAdam, Adam]
}
```

I chose to leave out sample data because it is not relevant to the issue at hand. For the same reason, I chose to create a Minimal Working Example rather than pasting the entire model, which is quite complicated and does not help to locate the issue.

mikkokotila commented 4 years ago

Thanks for taking the time to create a well-structured issue. I now understand what the problem is: Keras does not allow loading a model from JSON if the name of an object (e.g. a layer) is unknown to it.
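To make the failure mode concrete, here is a minimal sketch of the kind of name lookup the traceback ends in. It is a simplified stand-in and not the Keras source: `resolve_layer_class`, `BUILTIN_LAYERS`, and the placeholder `ReLU` and `Swish` classes are all hypothetical names used only for illustration.

```python
# Simplified illustration of a class-name lookup during deserialization.
# NOT the Keras implementation; every name here is a hypothetical stand-in.


class ReLU:
    """Placeholder for a layer class the framework knows about."""


class Swish:
    """Placeholder for a third-party layer class."""


# Names the (imaginary) framework can resolve on its own.
BUILTIN_LAYERS = {"ReLU": ReLU}


def resolve_layer_class(class_name, custom_objects=None):
    """Return the class registered under class_name, or raise ValueError."""
    custom_objects = custom_objects or {}
    if class_name in custom_objects:
        return custom_objects[class_name]
    if class_name in BUILTIN_LAYERS:
        return BUILTIN_LAYERS[class_name]
    raise ValueError("Unknown layer: " + class_name)


# Without a custom_objects mapping the third-party name cannot be resolved.
try:
    resolve_layer_class("Swish")
except ValueError as err:
    error_message = str(err)

# With the mapping supplied, the same name resolves to the class.
resolved = resolve_layer_class("Swish", custom_objects={"Swish": Swish})
```

The point of the sketch is that the JSON stores only the class name as a string, so something has to map that string back to a Python class before the layer can be rebuilt.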

It is nevertheless possible to find the best models from the experiment and "recover" them. Because this kind of case can come up for several reasons, I've added a utility for doing it in a straightforward manner. You can access it by upgrading to v0.6.4 with:

pip install git+https://github.com/autonomio/talos

It will work on the experiment log from any previous Talos version as well. I've created a code-complete notebook that outlines the process you can follow to get cross-validated results for n models, as well as the models themselves. It does involve re-training n models, which, depending on your architecture and system configuration, may take longer than merely evaluating the best models from the scan_object would.

Regarding the underlying problem, I'm not sure why Keras is doing a string check, as the meaningful part of the model is stored in the weights and not in the JSON. It might be a good idea to post this issue separately to Keras in case someone there warms up to the idea of making this go away.

bjtho08 commented 4 years ago

Thank you for the comprehensive response. The new feature is definitely a nice addition but because of the size of my network architecture, it would take a few days to re-train n models. I see that the model_from_json() accepts a parameter custom_objects, a dict of 3rd party objects used in the model. Do you think it is possible for Talos to run a check for custom objects before calling model_from_json() and add the objects along with their names to a dict that can get passed along with the model_id?
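A rough sketch of what such a check could look like, assuming the custom classes are passed through the parameter dictionary as in the example above. `collect_custom_objects` and `known_names` are hypothetical and not part of Talos or Keras, and the placeholder classes stand in for the real layer and optimizer classes so that the snippet runs without either library installed.

```python
import inspect


def collect_custom_objects(params, known_names):
    """Build a {class name: class} dict from classes found in params.

    params is a Talos-style parameter dictionary (name -> list of options).
    known_names is the set of class names that need no custom mapping.
    Hypothetical helper for illustration only.
    """
    custom = {}
    for options in params.values():
        if not isinstance(options, (list, tuple)):
            continue
        for option in options:
            # Only classes are candidates; numbers and strings are skipped.
            if inspect.isclass(option) and option.__name__ not in known_names:
                custom[option.__name__] = option
    return custom


# Placeholder classes standing in for the real ones from the issue.
class Swish: ...
class ReLU: ...
class RAdam: ...
class Adam: ...


p = {
    "nb_filters": [12, 16, 32],
    "act": [Swish, ReLU],
    "opt": [RAdam, Adam],
}

custom_objects = collect_custom_objects(p, known_names={"ReLU", "Adam"})
```

The resulting dict could then be passed as the custom_objects argument of model_from_json(). This assumes the name stored in the model JSON matches the class's `__name__`, and that extra entries such as an optimizer are simply ignored when only layers are deserialized; both assumptions would need checking against the Keras version in use.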

mikkokotila commented 4 years ago

@bjtho08 definitely, that's a great idea :) I'll change the title of the issue to correspond to this scope.

bjtho08 commented 4 years ago

Awesome! I'll be looking forward to seeing the development and maybe pitching in, if I manage to find the time between finishing my PhD dissertation and getting my data out of my models ;)

ruhollah2 commented 3 years ago

@bjtho08, how did you end up solving this problem? I have exactly the same issue with scan_object.best_model() in talos, as my model has a custom layer.

What I tried: first, I analyzed the scan_object to find the round with the best result using rounds2high. Say we store the best model number in model_no. Then I tried the following:

ruh_best_model = talos.utils.best_model.activate_model({scan_object,{'myCsutomLayer': myCsutomLayer}} , model_no)

but it failed with a syntax error:

SyntaxError: positional argument follows keyword argument

Python 3.7.6
talos.__version__
'1.0.0'

bjtho08 commented 3 years ago

> What I tried: first, I analyzed the scan_object to find the round with the best result using rounds2high. Say we store the best model number in model_no. Then I tried the following:
>
> ruh_best_model = talos.utils.best_model.activate_model({scan_object,{'myCsutomLayer': myCsutomLayer}} , model_no)
>
> but it failed with a syntax error:

You have quite a few errors in that single line of code. First, best_model() and activate_model() are two functions from the same module, so you can't chain them together as you did here; that would only work if activate_model() were a method of a class instance named best_model. Second, activate_model() takes one positional argument. I also don't know how you got that specific syntax error, because you are supplying two arguments: the first is a set containing a Scan object and a dict with a custom layer as its item, and the second is the model number.
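For reference, the two kinds of error involved can be reproduced in plain Python. This is only an illustrative sketch with a placeholder object; it does not use Talos, and `FakeScan` and the string `"f(a=1, 2)"` are made up for the demonstration.

```python
class FakeScan:
    """Placeholder standing in for a Scan object."""


scan_object = FakeScan()

# A set literal that contains a dict is valid syntax but fails at runtime,
# because dicts are not hashable and so cannot be set members.
try:
    first_argument = {scan_object, {"MyLayer": object}}
except TypeError as err:
    set_error = str(err)

# "positional argument follows keyword argument" is raised at compile time
# when a call puts a positional argument after a keyword argument.
try:
    compile("f(a=1, 2)", "<example>", "eval")
except SyntaxError as err:
    syntax_message = err.msg
```

Since the quoted line contains no keyword argument, the reported SyntaxError most likely came from a slightly different call than the one posted, though that cannot be confirmed from the thread.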

Now, I don't have direct access to my own code at the moment, but if I remember correctly, my workaround for this particular issue was simply to use the underlying code of the various methods. For instance, to get the best model and then reload it, I used

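from keras.models import model_from_json
from myswish import Swish

# Map the class names stored in the model JSON to the actual classes
my_activations = {
    'Swish': Swish,
}

# Index label of the round with the highest validation accuracy
best = scan_object.data.sort_values('val_acc', ascending=False).iloc[0].name

# Rebuild the architecture; use scan_object rather than self, since this runs outside the Scan class
model = model_from_json(scan_object.saved_models[best], custom_objects=my_activations)
# Restore the trained weights, as activate_model() does in the traceback above
model.set_weights(scan_object.saved_weights[best])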

Then you just substitute whatever your dict of custom objects is named and change the sorting column if needed.