NTMC-Community / MatchZoo

Facilitating the design, comparison and sharing of deep text matching models.

Using callbacks for early stopping in DSSM #837

Open saekomdalkom opened 3 years ago

saekomdalkom commented 3 years ago


Hello, I'm trying to run the DSSM code and I want to use keras.callbacks.EarlyStopping with it. I ran the DSSM tutorial, and the only thing I changed was the last few lines.

The original code looked like this:

train_generator = mz.DataGenerator(train_pack_processed, mode='pair', num_dup=1, num_neg=4, batch_size=32, shuffle=True)
len(train_generator)

history = model.fit_generator(train_generator, epochs=20, callbacks=[evaluate], workers=5, use_multiprocessing=False)
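
For context, evaluate here is the per-epoch evaluation callback from the DSSM tutorial. If I remember the tutorial correctly it is constructed roughly like this (the variable names and arguments are from my recollection, so treat them as assumptions):

pred_x, pred_y = test_pack_processed.unpack()
# evaluates all of the task's metrics (NDCG@3, NDCG@5, MAP) at the end of each epoch
evaluate = mz.callbacks.EvaluateAllMetrics(model, x=pred_x, y=pred_y, batch_size=len(pred_y))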

And this is what I changed it to:

train_generator = mz.DataGenerator(train_pack_processed, mode='pair', num_dup=1, num_neg=4, batch_size=32, shuffle=True)
len(train_generator)

from keras.callbacks import EarlyStopping
from keras.callbacks import ModelCheckpoint
from keras.models import load_model

es = EarlyStopping(monitor='val_loss', mode='min', verbose=1)

history = model.fit_generator(train_generator, epochs=2000, callbacks=[evaluate, es], workers=5, use_multiprocessing=False)

And I got the following error:

Epoch 1/2000
17/17 [==============================] - 2s 95ms/step - loss: 1.5384
Validation: normalized_discounted_cumulative_gain@3(0.0): 0.021622386820576125 - normalized_discounted_cumulative_gain@5(0.0): 0.029349502492551117 - mean_average_precision(0.0): 0.0341616525519229
---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
<ipython-input-17-5e8d82ce978c> in <module>()
----> 1 history = model.fit_generator(train_generator, epochs=2000, callbacks=[evaluate, es], workers=5, use_multiprocessing=False)

/usr/local/lib/python3.6/dist-packages/matchzoo/engine/base_model.py in fit_generator(self, generator, epochs, verbose, **kwargs)
    274             generator=generator,
    275             epochs=epochs,
--> 276             verbose=verbose, **kwargs
    277         )
    278 

/usr/local/lib/python3.6/dist-packages/keras/legacy/interfaces.py in wrapper(*args, **kwargs)
     89                 warnings.warn('Update your `' + object_name + '` call to the ' +
     90                               'Keras 2 API: ' + signature, stacklevel=2)
---> 91             return func(*args, **kwargs)
     92         wrapper._original_function = func
     93         return wrapper

/usr/local/lib/python3.6/dist-packages/keras/engine/training.py in fit_generator(self, generator, steps_per_epoch, epochs, verbose, callbacks, validation_data, validation_steps, validation_freq, class_weight, max_queue_size, workers, use_multiprocessing, shuffle, initial_epoch)
   1730             use_multiprocessing=use_multiprocessing,
   1731             shuffle=shuffle,
-> 1732             initial_epoch=initial_epoch)
   1733 
   1734     @interfaces.legacy_generator_methods_support

/usr/local/lib/python3.6/dist-packages/keras/engine/training_generator.py in fit_generator(model, generator, steps_per_epoch, epochs, verbose, callbacks, validation_data, validation_steps, validation_freq, class_weight, max_queue_size, workers, use_multiprocessing, shuffle, initial_epoch)
    258                     break
    259 
--> 260             callbacks.on_epoch_end(epoch, epoch_logs)
    261             epoch += 1
    262             if callbacks.model.stop_training:

/usr/local/lib/python3.6/dist-packages/keras/callbacks/callbacks.py in on_epoch_end(self, epoch, logs)
    150         logs = logs or {}
    151         for callback in self.callbacks:
--> 152             callback.on_epoch_end(epoch, logs)
    153 
    154     def on_train_batch_begin(self, batch, logs=None):

/usr/local/lib/python3.6/dist-packages/keras/callbacks/callbacks.py in on_epoch_end(self, epoch, logs)
    814 
    815     def on_epoch_end(self, epoch, logs=None):
--> 816         current = self.get_monitor_value(logs)
    817         if current is None:
    818             return

/usr/local/lib/python3.6/dist-packages/keras/callbacks/callbacks.py in get_monitor_value(self, logs)
    844                 'Early stopping conditioned on metric `%s` '
    845                 'which is not available. Available metrics are: %s' %
--> 846                 (self.monitor, ','.join(list(logs.keys()))), RuntimeWarning
    847             )
    848         return monitor_value

TypeError: sequence item 1: expected str instance, NormalizedDiscountedCumulativeGain found
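
If I read the traceback correctly, the problem is not early stopping itself. Since I don't pass any validation_data, there is no val_loss in logs, so EarlyStopping falls into the warning branch of get_monitor_value and tries to join the available log keys into a message. The evaluate callback, however, seems to write the MatchZoo metric objects themselves into logs as keys, and joining a non-string key is what raises the TypeError. A minimal reproduction of just that failing join (the key type is my assumption from the error message):

import matchzoo as mz

# what logs seems to contain after the evaluate callback has run:
# one string key from Keras plus MatchZoo metric objects as keys
logs = {'loss': 1.5384,
        mz.metrics.NormalizedDiscountedCumulativeGain(k=3): 0.0216}
','.join(list(logs.keys()))
# TypeError: sequence item 1: expected str instance,
# NormalizedDiscountedCumulativeGain found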

For reference, I attach this sample code for early stopping:

# mlp overfit on the moons dataset with simple early stopping
from sklearn.datasets import make_moons
from keras.models import Sequential
from keras.layers import Dense
from keras.callbacks import EarlyStopping 
from matplotlib import pyplot
# generate 2d classification dataset
X, y = make_moons(n_samples=100, noise=0.2, random_state=1)
# split into train and test
n_train = 30
trainX, testX = X[:n_train, :], X[n_train:, :]
trainy, testy = y[:n_train], y[n_train:]
# define model
model = Sequential()
model.add(Dense(500, input_dim=2, activation='relu'))
model.add(Dense(1, activation='sigmoid'))
model.compile(loss='binary_crossentropy', optimizer='adam', metrics=['accuracy'])
# simple early stopping
es = EarlyStopping(monitor='val_loss', mode='min', verbose=1) 
# fit model
history = model.fit(trainX, trainy, validation_data=(testX, testy), epochs=4000, verbose=1, callbacks=[es]) 
# evaluate the model
_, train_acc = model.evaluate(trainX, trainy, verbose=0)
_, test_acc = model.evaluate(testX, testy, verbose=0)
print('Train: %.3f, Test: %.3f' % (train_acc, test_acc))
# plot training history
pyplot.plot(history.history['loss'], label='train')
pyplot.plot(history.history['val_loss'], label='test')
pyplot.legend()
pyplot.show()

When I use keras.callbacks.EarlyStopping I need to set the monitor argument, the criterion by which the callback decides whether training should stop. In the sample code above I compile the model with the 'accuracy' metric, so each epoch reports loss, accuracy, val_loss, and val_accuracy, and I can pick 'val_loss' as the monitor. In the DSSM code, however, each epoch only reports loss, plus the two NDCG scores and the MAP score printed in the Validation line.

So I want to ask: how can I monitor these scores with the early stopping callback? And is there any other way you use to stop training at the right time? I'm not very experienced with deep learning yet. Thank you in advance.
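
P.S. One workaround I can think of, but have not verified: re-report the MatchZoo scores into logs under string keys, and point EarlyStopping at one of those keys. The sketch below assumes that model.evaluate returns a dict keyed by metric objects whose str() matches the names printed in the Validation line (e.g. 'mean_average_precision(0.0)'); the callback name and the patience value are my own choices.

import keras

class EvaluateToLogs(keras.callbacks.Callback):
    # Sketch (untested): evaluate the MatchZoo model each epoch and copy the
    # scores into `logs` under string keys so EarlyStopping can monitor them.
    # Order matters: this callback must come before EarlyStopping in the list.
    def __init__(self, mz_model, x, y, batch_size=128):
        super().__init__()
        self._mz_model = mz_model
        self._x, self._y = x, y
        self._batch_size = batch_size

    def on_epoch_end(self, epoch, logs=None):
        if logs is None:
            return
        scores = self._mz_model.evaluate(self._x, self._y, batch_size=self._batch_size)
        for metric, value in scores.items():
            logs[str(metric)] = value  # e.g. logs['mean_average_precision(0.0)'] = 0.034

es = keras.callbacks.EarlyStopping(monitor='mean_average_precision(0.0)',
                                   mode='max', patience=5, verbose=1)
history = model.fit_generator(train_generator, epochs=2000,
                              callbacks=[EvaluateToLogs(model, pred_x, pred_y), es],
                              workers=5, use_multiprocessing=False)

Alternatively, since MatchZoo's fit_generator seems to forward extra keyword arguments straight to Keras (the **kwargs in the traceback above), passing validation_data=(pred_x, pred_y) might make val_loss available directly, though I have not tested that either.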