autonomio / talos

Hyperparameter Experiments with TensorFlow and Keras
https://autonom.io
MIT License

ValueError: setting an array element with a sequence #429

Closed DavideRutigliano closed 4 years ago

DavideRutigliano commented 4 years ago

Hi All,

I'm using talos for hyperparameter optimization, and everything was working fine. Now I get an error (probably related to the model or the history). Here is the full output:

`
  0%|          | 0/16 [00:00<?, ?it/s]
{'weights': [2.0, 1.0, 1.0, 1.0, 1.0, 1.0], 'dropout_ratio': 0.25, 'batch_size': 5, 'optimizer': 'adam', 'init_p': 0.01, 'ratio': 1, 'dense_layer_size': 32, 'input_shape': (3, 224, 224), 'lr': 0.0001, 'epochs': 3, 'base_model': 'resnet50', 'gamma': 0.5, 'alpha': 3}
Epoch 1/3
Epoch 1/3

Epoch 00001: val_f1_score improved from -inf to 0.68685, saving model to model_2_weights.h5
1/1 [==============================] - 4s 4s/step - loss: 0.0427 - log_loss: 0.0088 - accuracy: 0.2000 - f1_score: 0.9582 - val_loss: 1.4823 - val_log_loss: 0.3988 - val_accuracy: 0.0000e+00 - val_f1_score: 0.6869
Epoch 2/3

Epoch 00002: val_f1_score improved from 0.68685 to 0.68702, saving model to model_2_weights.h5
1/1 [==============================] - 2s 2s/step - loss: 0.0386 - log_loss: 0.0079 - accuracy: 0.0000e+00 - f1_score: 0.9621 - val_loss: 1.4839 - val_log_loss: 0.3993 - val_accuracy: 0.0000e+00 - val_f1_score: 0.6870
Epoch 3/3

Epoch 00003: val_f1_score improved from 0.68702 to 0.68718, saving model to model_2_weights.h5
1/1 [==============================] - 2s 2s/step - loss: 0.0424 - log_loss: 0.0087 - accuracy: 0.0000e+00 - f1_score: 0.9586 - val_loss: 1.4838 - val_log_loss: 0.3993 - val_accuracy: 0.0000e+00 - val_f1_score: 0.6872

ValueError                                Traceback (most recent call last)

in
     34     reduction_metric='val_f1_score',
     35     experiment_name="rsna",
---> 36     print_params=True)

~/.local/lib/python3.5/site-packages/talos/scan/Scan.py in __init__(self, x, y, params, model, experiment_name, x_val, y_val, val_split, random_method, seed, performance_target, fraction_limit, round_limit, time_limit, boolean_limit, reduction_method, reduction_interval, reduction_window, reduction_threshold, reduction_metric, minimize_loss, disable_progress_bar, print_params, clear_session, save_weights)
    194         # start runtime
    195         from .scan_run import scan_run
--> 196         scan_run(self)

~/.local/lib/python3.5/site-packages/talos/scan/scan_run.py in scan_run(self)
     24         # otherwise proceed with next permutation
     25         from .scan_round import scan_round
---> 26         self = scan_round(self)
     27         self.pbar.update(1)
     28

~/.local/lib/python3.5/site-packages/talos/scan/scan_round.py in scan_round(self)
     22     # handle logging of results
     23     from ..logging.logging_run import logging_run
---> 24     self = logging_run(self, round_start, start, self.model_history)
     25
     26     # apply reductions

~/.local/lib/python3.5/site-packages/talos/logging/logging_run.py in logging_run(self, round_start, start, model_history)
     40
     41         from .results import save_result
---> 42         save_result(self)
     43
     44     # return the Scan() self

~/.local/lib/python3.5/site-packages/talos/logging/results.py in save_result(self)
     34                self.result,
     35                fmt='%s',
---> 36                delimiter=',')
     37
     38

1 frames
/usr/local/lib/python3.5/site-packages/numpy/lib/npyio.py in savetxt(fname, X, fmt, delimiter, newline, header, footer, comments, encoding)
   1328
   1329         try:
-> 1330             X = np.asarray(X)
   1331
   1332         # Handle 1-dimensional arrays

/usr/local/lib/python3.5/site-packages/numpy/core/numeric.py in asarray(a, dtype, order)
    499
    500     """
--> 501     return array(a, dtype, copy=False, order=order)
    502
    503

ValueError: setting an array element with a sequence
`

I don't understand which "sequence" the error is referring to.
Here is the code for building and training the model:

`
def build_model(X, y, X_val, y_val, params):
    global train_dir
    global classes

    input_shape = params['input_shape']
    weights = params['weights']
    num_classes = len(classes)

    train_data = pd.DataFrame(y, index=X, columns=classes)
    validation_data = pd.DataFrame(y_val, index=X_val, columns=classes)

    ratio = params['ratio']
    epochs = params['epochs']
    batch_size = params['batch_size']
    lr = params['lr']
    dense_layer_size = params['dense_layer_size']
    dropout_ratio = params['dropout_ratio']
    alpha = params['alpha']
    gamma = params['gamma']
    init_p = params['init_p']

    if params['base_model'] == 'resnet50':
        base_model = ResNet50(weights='imagenet',
                              include_top=False,
                              input_shape=input_shape)
    elif params['base_model'] == 'efficientnet-b0':
        base_model = EfficientNetB0(weights='imagenet',
                                    include_top=False,
                                    input_shape=input_shape)
    else:
        return None

    for layer in base_model.layers:
        layer.trainable = False
        if layer.name.startswith('bn'):
            layer.call(layer.input, training=False)

    input = Input(shape=input_shape)
    out = base_model(input)
    out = GlobalAveragePooling2D()(out)
    out = Dense(dense_layer_size)(out)
    out = Activation("relu")(out)
    out = Dropout(dropout_ratio)(out)
    out = Dense(num_classes,
                use_bias=True,
                bias_initializer=bias_init(init_p))(out)
    out = Activation("sigmoid")(out)

    model = Model(inputs=input, outputs=out)

    if params['optimizer'] == 'adam':
        optimizer = Adam(lr=lr)
    elif params['optimizer'] == 'sgd':
        optimizer = SGD(lr=lr)
    else:
        return None

    optimizer = tf.train.experimental.enable_mixed_precision_graph_rewrite(optimizer)

    model.compile(optimizer=optimizer,
                  loss=weighted_focal_loss(weights=weights, alpha=alpha, gamma=gamma),
                  metrics=[weighted_log_loss(weights=weights),
                           'accuracy',
                           weighted_f1_score(weights=weights)])

    # model.summary()

    train_generator = DataGenerator(train_dir,
                                    ids=X,
                                    labels=y,  # train_data.copy(),
                                    num_classes=num_classes,
                                    batch_size=batch_size,
                                    input_shape=input_shape,
                                    resample=True,
                                    ratio=ratio)
    train_steps = train_generator.n

    validation_generator = DataGenerator(train_dir,
                                         ids=X_val,
                                         labels=y_val,  # validation_data.copy(),
                                         num_classes=num_classes,
                                         batch_size=batch_size,
                                         input_shape=input_shape,
                                         resample=True,
                                         ratio=ratio)
    validation_steps = validation_generator.n

    # calculate_scores = CalculateScores(validation_generator, weights=weights)
    early_stop = EarlyStopping(monitor='val_f1_score', mode='max',
                               patience=2, verbose=1)
    checkpoint = ModelCheckpoint(model.name + '_weights.h5',
                                 monitor='val_f1_score', mode='max',
                                 save_best_only=True, verbose=1)
    reduce_lr = ReduceLROnPlateau(monitor='val_f1_score', mode="max",
                                  min_lr=1e-15, factor=0.5,
                                  patience=1, verbose=1)
    terminate = TerminateOnNaN()

    callbacks = [terminate, early_stop, checkpoint, reduce_lr]  # , calculate_scores]

    score = model.fit_generator(train_generator,
                                steps_per_epoch=train_steps,
                                epochs=epochs,
                                validation_data=validation_generator,
                                validation_steps=validation_steps,
                                callbacks=callbacks,
                                use_multiprocessing=True,
                                max_queue_size=90,
                                workers=mp.cpu_count())

    return score, model
`

And this is the code that calls talos:

`
params = {
    'base_model': ['resnet50', 'efficientnet-b0'],
    'lr': [1e-4, 1e-6],
    'optimizer': ['adam', 'sgd'],
    'gamma': [.5, 2],
    'epochs': [3],
    'batch_size': [5],
    'input_shape': [(3, 224, 224)],
    'ratio': [1],
    'dense_layer_size': [32],
    'dropout_ratio': [.25],
    'alpha': [3],
    'init_p': [0.01],
    'weights': [[2., 1., 1., 1., 1., 1.]],
}

X = train_data.index.tolist()
y = train_data.values.tolist()
X_val = validation_data.index.tolist()
y_val = validation_data.values.tolist()

scan = talos.Scan(x=X[:5],
                  y=y[:5],
                  x_val=X_val[:5],
                  y_val=y_val[:5],
                  params=params,
                  model=build_model,
                  reduction_metric='val_f1_score',
                  experiment_name="rsna",
                  print_params=True)
`

I checked the talos source code that uses the "score" and "model" variables, and mine look as they should, so I don't even know which one is causing the problem. Does anyone have an idea of what's going on and how to fix it? Thanks in advance.

github-actions[bot] commented 4 years ago

Welcome to Talos community! Thanks so much for creating your first issue :)

DavideRutigliano commented 4 years ago

Code from "logging_run.py", where the error is thrown:

`

    # capture the history keys for later
    self._all_keys = list(model_history.history.keys())
    self._metric_keys = [k for k in self._all_keys if 'val_' not in k]
    self._val_keys = [k for k in self._all_keys if 'val_' in k]

    # create a header column for output
    _results_header = ['round_epochs'] + self._all_keys + self._param_dict_keys
    self.result.append(_results_header)

    # save the results
    from .results import save_result
    save_result(self)

`

When save_result is called, something is wrong with "self.result":

`
def save_result(self):
    import numpy as np

    np.savetxt(self._experiment_log,
               self.result,
               fmt='%s',
               delimiter=',')
`

My ValueError is thrown at `-> 1330 X = np.asarray(X)` inside np.savetxt, where X is self.result from the calling method.
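One way to narrow this down is to look for cells in the rows that are themselves lists or tuples, since a row that mixes scalars with nested sequences is a plausible trigger for this error. The sketch below assumes self.result is a plain list of rows (the header built above plus one row per round). `find_sequence_cells` and the sample row are hypothetical and not part of talos.

```python
def find_sequence_cells(rows):
    """Return (row_index, column_index, value) for every cell that is a list or tuple."""
    hits = []
    for r, row in enumerate(rows):
        for c, cell in enumerate(row):
            if isinstance(cell, (list, tuple)):
                hits.append((r, c, cell))
    return hits


# Hypothetical rows shaped like the header and one round of results.
header = ['round_epochs', 'loss', 'input_shape', 'weights']
round_row = [3, 0.0424, (3, 224, 224), [2.0, 1.0, 1.0, 1.0, 1.0, 1.0]]

for r, c, value in find_sequence_cells([header, round_row]):
    # Report which column holds a nested sequence instead of a scalar.
    print(header[c], '->', value)
```

If this reports any columns, those are the parameters worth changing to scalar values first.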

mikkokotila commented 4 years ago

It might be because of your parameter dictionary. Try changing this so that each value is a separate parameter instead of being part of a tuple:

'input_shape' : [(3, 224, 224)],

This will most likely not work:

'weights' : [[2., 1., 1., 1., 1., 1.]],
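A minimal sketch of that suggestion, assuming the tuple and the list are split into scalar parameters and reassembled inside the model function. The key names (`channels`, `height`, `width`, `weight_0` ...) and the `rebuild_sequences` helper are made up for illustration, and the thread does not confirm that this resolves the error.

```python
NUM_CLASSES = 6  # assumption: matches the six class weights in the original params

# Every candidate value is now a scalar, so each logged cell is a scalar too.
params = {
    'channels': [3],
    'height': [224],
    'width': [224],
    'weight_0': [2.0],
    'weight_1': [1.0],
    'weight_2': [1.0],
    'weight_3': [1.0],
    'weight_4': [1.0],
    'weight_5': [1.0],
}


def rebuild_sequences(round_params):
    """Reassemble input_shape and weights from the scalar parameters of one round."""
    input_shape = (round_params['channels'],
                   round_params['height'],
                   round_params['width'])
    weights = [round_params['weight_%d' % i] for i in range(NUM_CLASSES)]
    return input_shape, weights


# Inside the model function, a single round's parameters hold one value per key.
one_round = {key: values[0] for key, values in params.items()}
input_shape, weights = rebuild_sequences(one_round)
```

The model function would then use `input_shape` and `weights` exactly as before, while the parameter dictionary passed to talos contains only scalars.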