BShakhovsky / PolyphonicPianoTranscription

Recurrent Neural Network for generating piano MIDI-files from audio (MP3, WAV, etc.)
https://magenta.tensorflow.org/onsets-frames

KeyError: 'val_loss' #5

Closed Glooring closed 2 years ago

Glooring commented 3 years ago

I'm running the code from '2 Training, Validation, Testing.ipynb' using a smaller part of the MAESTRO dataset.

After I run this:

def HistoryGraph(acc, title):
    plt.figure(figsize=(16, 16))

    ax1 = plt.subplot(2, 1, 1)
    plt.plot(range(1, len(hist['loss']) + 1), hist[         acc], color='#FF8000', linewidth=3, label='Training')
    plt.plot(range(1, len(hist['loss']) + 1), hist['val_' + acc], color='r', linewidth=3, label='Validation')
    plt.ylim(min(hist[acc][:-1] + hist['val_' + acc][:-1]), max(hist[acc] + hist['val_' + acc]))
    plt.title(title)

    ax2 = plt.subplot(2, 1, 2)
    plt.plot(range(1, len(hist['loss']) + 1), hist[    'loss'], color='g', linewidth=3, label='Training')
    plt.plot(range(1, len(hist['loss']) + 1), hist['val_loss'], color='b', linewidth=3, label='Validation')
    plt.ylim(min(hist['loss'] + hist['val_loss']), max(hist['loss'][:-1] + hist['val_loss'][:-1]))
    plt.title('Loss history')

    minValLoss, maxValAcc = min(hist['val_loss']), max(hist['val_' + acc])
    lossInd, accInd = hist['val_loss'].index(minValLoss), hist['val_' + acc].index(maxValAcc)
    accTrain, accVal = hist[acc][lossInd], hist['val_' + acc][lossInd]
    for a in [ax1, ax2]:
        a.vlines(lossInd + 1, 0, 12, 'b', linewidth=3,
                   label='Min validation loss, validation {} = {:.2%}'.format(acc, hist['val_' + acc][lossInd]))
        a.vlines(accInd + 1, 0, 12, 'r', linewidth=3,
                   label=                 'Max validation {} = {:.2%}'.format(acc, maxValAcc))
        a.legend()
        a.set_xlabel('Epoch')
        a.set_xlim(1, len(hist['loss']))
        a.grid()

    print('Maximum validation {0} = {1:.2%}, but the chosen ones are at the minimum validation loss:\n'
          'Train = {2:.2%}, Validation = {3:.2%}'.format(acc, maxValAcc, accTrain, accVal))
    return accTrain, accVal

accTrain, accVal = HistoryGraph('Dixon', 'Dixon Accuracy ("stricter than F1-score")')

I get this error:

---------------------------------------------------------------------------
KeyError                                  Traceback (most recent call last)
<ipython-input-43-944ec576df2e> in <module>()
----> 1 accTrain, accVal = HistoryGraph('Dixon', 'Dixon Accuracy ("stricter than F1-score")')

<ipython-input-42-5c9e4e9dbe31> in HistoryGraph(acc, title)
     10     ax2 = plt.subplot(2, 1, 2)
     11     plt.plot(range(1, len(hist['loss']) + 1), hist[    'loss'], color='g', linewidth=3, label='Training')
---> 12     plt.plot(range(1, len(hist['loss']) + 1), hist['val_loss'], color='b', linewidth=3, label='Validation')
     13     plt.ylim(min(hist['loss'] + hist['val_loss']), max(hist['loss'][:-1] + hist['val_loss'][:-1]))
     14     plt.title('Loss history')

KeyError: 'val_loss'

I noticed that when I print hist dictionary I get this:

{'Dixon': [0.020296886563301086, 0.01974172331392765],
 'loss': [2.3790948390960693, 2.51298451423645],
 'val_Dixon': [0.019699351862072945, 0.021455474197864532]}

I don't know why val_loss is not in the dictionary.

In my case, val_loss is not reported at the end of any epoch either.

2365/2365 [==============================] - 442s 163ms/step - loss: 2.2771 - Dixon: 0.0228 - val_Dixon: 0.0197
2365/2365 [==============================] - 402s 170ms/step - loss: 2.5130 - Dixon: 0.0197 - val_Dixon: 0.0215

I searched on Google for the error but I couldn't figure out how to fix it.

BShakhovsky commented 3 years ago

Oops, my mistake. I suppressed val_loss in the progress output so that each epoch fits on one line. Previously it was possible to hide it from the output while it was still stored in the history dictionary. With the updated libraries, I do not know how to do that.

So, to fix this error, remove callbacks=[ValDixon_ProgbarLogger()] from model.fit(...). There are two calls to model.fit(...) in the TrainAndSave function. I have just updated the template as well.
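A defensive check can make this kind of failure easier to spot before any plotting happens. The following is a minimal sketch, not part of the notebook: the helper name missing_history_keys is hypothetical, and the sample dictionary is the one printed in the report above.

```python
def missing_history_keys(hist, acc):
    """Return the keys HistoryGraph reads that are absent from hist."""
    # HistoryGraph indexes these four series directly, so a missing one raises KeyError.
    required = [acc, 'val_' + acc, 'loss', 'val_loss']
    return [key for key in required if key not in hist]


# The history printed in the report above: val_loss was never recorded.
hist = {'Dixon': [0.020296886563301086, 0.01974172331392765],
        'loss': [2.3790948390960693, 2.51298451423645],
        'val_Dixon': [0.019699351862072945, 0.021455474197864532]}

print(missing_history_keys(hist, 'Dixon'))  # ['val_loss']
```

Calling such a check at the top of HistoryGraph would turn the KeyError into a readable message about which series the training run failed to record.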

Glooring commented 3 years ago

Thank you!

BShakhovsky commented 3 years ago

Hi Dan.

I cannot find your comment about the number of epochs. You have probably already found out that you can simply run the cell again, and another epoch will begin. Alternatively, you can add the epochs=… parameter to the model.fit function with the number of epochs you need. I set it up so that training stops if the loss becomes higher than in the previous epoch.
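The stopping rule described above can be sketched in plain Python. This only illustrates the rule and is not the notebook's code: the function name first_worse_epoch is hypothetical, and whether the notebook compares training loss or validation loss is not stated here.

```python
def first_worse_epoch(losses):
    """Return the 1-based epoch at which the loss first exceeds the previous epoch's, or None."""
    for index in range(1, len(losses)):
        if losses[index] > losses[index - 1]:
            return index + 1  # convert the 0-based index to a 1-based epoch number
    return None


# Training losses from the two-epoch run reported earlier in this thread.
print(first_worse_epoch([2.3790948390960693, 2.51298451423645]))  # 2
```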

By the way, I found and corrected another mistake: the categorical_crossentropy loss function does not work for me anymore, and model training gets stuck at high loss and low accuracy. If it is the same in your case, use the binary_crossentropy loss function instead.
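For reference, here is a minimal pure-Python sketch of what binary cross-entropy computes for one frame of a multi-label target, where several keys may sound at once. It is an illustration only: it does not use Keras, it does not explain why categorical_crossentropy stopped working, and the four-key frame is made-up example data.

```python
import math


def binary_crossentropy(y_true, y_pred, eps=1e-7):
    """Mean binary cross-entropy over the labels of one frame."""
    total = 0.0
    for t, p in zip(y_true, y_pred):
        p = min(max(p, eps), 1.0 - eps)  # clip to avoid log(0)
        total -= t * math.log(p) + (1 - t) * math.log(1.0 - p)
    return total / len(y_true)


# Hypothetical frame with four keys, two of them active.
target = [1, 0, 1, 0]
good = binary_crossentropy(target, [0.9, 0.1, 0.8, 0.2])
bad = binary_crossentropy(target, [0.9, 0.9, 0.8, 0.9])  # false positives on inactive keys
print(good < bad)  # True
```

Each key is scored independently, so a confident prediction for a key that is not sounding raises the loss.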

Glooring commented 3 years ago

I realized that I could increase the batch size, which I initially thought was the problem. For my dataset, taken from the 2015 folder, I changed the batch size to 64 and the number of epochs to 10. I got loss = 0.0058 and Dixon = 0.719.

Epoch 1/10
148/148 [==============================] - 191s 879ms/step - loss: 0.0943 - Dixon: 9.8013e-04 - val_loss: 0.0220 - val_Dixon: 0.0623
Epoch 2/10
148/148 [==============================] - 125s 844ms/step - loss: 0.0180 - Dixon: 0.2742 - val_loss: 0.0128 - val_Dixon: 0.4034
Epoch 3/10
148/148 [==============================] - 125s 844ms/step - loss: 0.0113 - Dixon: 0.5190 - val_loss: 0.0101 - val_Dixon: 0.4997
Epoch 4/10
148/148 [==============================] - 125s 844ms/step - loss: 0.0092 - Dixon: 0.5961 - val_loss: 0.0082 - val_Dixon: 0.6180
Epoch 5/10
148/148 [==============================] - 125s 845ms/step - loss: 0.0083 - Dixon: 0.6275 - val_loss: 0.0085 - val_Dixon: 0.5910
Epoch 6/10
148/148 [==============================] - 125s 845ms/step - loss: 0.0075 - Dixon: 0.6554 - val_loss: 0.0073 - val_Dixon: 0.6697
Epoch 7/10
148/148 [==============================] - 125s 845ms/step - loss: 0.0072 - Dixon: 0.6707 - val_loss: 0.0072 - val_Dixon: 0.6820
Epoch 8/10
148/148 [==============================] - 125s 844ms/step - loss: 0.0068 - Dixon: 0.6849 - val_loss: 0.0071 - val_Dixon: 0.6628
Epoch 9/10
148/148 [==============================] - 125s 846ms/step - loss: 0.0064 - Dixon: 0.6989 - val_loss: 0.0069 - val_Dixon: 0.6750
Epoch 10/10
148/148 [==============================] - 125s 846ms/step - loss: 0.0061 - Dixon: 0.7069 - val_loss: 0.0068 - val_Dixon: 0.6886
148/148 [==============================] - 126s 849ms/step - loss: 0.0058 - Dixon: 0.7192 - val_loss: 0.0070 - val_Dixon: 0.6867

The categorical_crossentropy didn't work for me either on my last run.

Thanks a lot!

I have another error. When I run this cell:

offsetsModel = GetModel('Offsets', {'Dixon': Dixon}, accTrain, accVal, melsVal, onsetsVal, 32, True, lossMetric)[0]
offsetsPredTrain, offsetsPredVal = map(lambda x: offsetsModel.predict(x, 32, 1), [melsTrain, melsVal])
'{} {:.3f} {:.1e}    {} {:.3f} {:.1e}'.format(offsetsPredTrain.min(), offsetsPredTrain.mean(), offsetsPredTrain.max(),
                                              offsetsPredVal.min(),   offsetsPredVal.mean(),   offsetsPredVal.max())

I get this error:

---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
<ipython-input-40-7408d5fdddff> in <module>()
----> 1 offsetsModel = GetModel('Offsets', {'Dixon': Dixon}, accTrain, accVal, melsVal, onsetsVal, 32, True, lossMetric)[0]
      2 offsetsPredTrain, offsetsPredVal = map(lambda x: offsetsModel.predict(x, 32, 1), [melsTrain, melsVal])
      3 '{} {:.3f} {:.1e}    {} {:.3f} {:.1e}'.format(offsetsPredTrain.min(), offsetsPredTrain.mean(), offsetsPredTrain.max(),
      4                                               offsetsPredVal.min(),   offsetsPredVal.mean(),   offsetsPredVal.max())

2 frames
<ipython-input-22-514c3c8625ed> in GetModel(name, accs, accTrain, accVal, xVal, yVal, evalBatchSize, withLstm, loss)
     10     if exists('{}/Training {} Model {:.2f} {:.2f}.hdf5'.format(modelFolder, name, accTrain * 100, accVal * 100))             and exists('{}/Training {} History {:.2f} {:.2f}.npy'.format(modelFolder, name, accTrain * 100, accVal * 100)):
     11         hist = np.load('{}/Training {} History {:.2f} {:.2f}.npy'.format(modelFolder, name, accTrain * 100, accVal * 100)
---> 12                        , 'r', allow_pickle=True
     13                       )[0]
     14         print('Loading pre-trained {} model...'.format(name))

/usr/local/lib/python3.7/dist-packages/numpy/lib/npyio.py in load(file, mmap_mode, allow_pickle, fix_imports, encoding)
    435             # .npy file
    436             if mmap_mode:
--> 437                 return format.open_memmap(file, mode=mmap_mode)
    438             else:
    439                 return format.read_array(fid, allow_pickle=allow_pickle,

/usr/local/lib/python3.7/dist-packages/numpy/lib/format.py in open_memmap(filename, mode, dtype, shape, fortran_order, version)
    856             if dtype.hasobject:
    857                 msg = "Array can't be memory-mapped: Python objects in dtype."
--> 858                 raise ValueError(msg)
    859             offset = fp.tell()
    860 

ValueError: Array can't be memory-mapped: Python objects in dtype.

BShakhovsky commented 3 years ago

OK, the problem is the 'r' argument in the np.load call. If you remove it, the problem should go away:

hist = np.load('{}/Training {} History {:.2f} {:.2f}.npy'.format(modelFolder, name, accTrain * 100, accVal * 100)
--->           , 'r', allow_pickle=True  # This 'r' argument results in ValueError: Array can't be memory-mapped: Python objects in dtype.
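The behaviour can be reproduced in isolation. The sketch below assumes the history is stored the way the traceback suggests, as a dictionary wrapped in a one-element object array; the file name and the sample values are placeholders, not the notebook's.

```python
import os
import tempfile

import numpy as np

# Stand-in history dictionary; the values are illustrative only.
hist = {'loss': [0.0943, 0.0180], 'val_loss': [0.0220, 0.0128]}

with tempfile.TemporaryDirectory() as folder:
    path = os.path.join(folder, 'history.npy')
    # A dict can only be stored as an object array, which NumPy pickles on save.
    np.save(path, np.array([hist], dtype=object))

    # Without mmap_mode, the pickled object array loads normally.
    loaded = np.load(path, allow_pickle=True)[0]

    # Passing 'r' positionally sets mmap_mode, and object arrays cannot be memory-mapped.
    try:
        np.load(path, 'r', allow_pickle=True)
        mmap_error = None
    except ValueError as error:
        mmap_error = str(error)

print(loaded == hist)
print(mmap_error)
```

The second positional parameter of np.load is mmap_mode, which is why the stray 'r' switches NumPy into memory-mapping mode.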

Glooring commented 3 years ago

Thanks!

Glooring commented 3 years ago

Hi, Boris

I got another error. When I try to load the pre-trained Volumes model:

def VolAcc(yTrue, yPred):
    onsets = K.cast(yTrue > K.epsilon(), float)
    yPredOnsets, numNotes = yPred * onsets, K.sum(onsets)

    # Linear regression:
    sumX, sumY = map(K.sum, (yPredOnsets, yTrue))
    m = (numNotes * K.sum(yPredOnsets * yTrue) - sumX * sumY) / (numNotes * K.sum(yPredOnsets ** 2) - sumX ** 2)
    yPredOnsets = (m * yPredOnsets + (sumY - m * sumX) / numNotes) * onsets

    return (numNotes - K.sum(K.cast(K.abs(yPredOnsets - yTrue) > .1, float))) / numNotes

def VolLoss(yTrue, yPred): return mean_squared_error(yTrue, yPred * K.cast(yTrue > K.epsilon(), float))

model, hist = GetModel('Volumes', {'VolAcc': VolAcc}, 98.12 / 100, 97.01 / 100, melsVal, volumesVal, 32, False, VolLoss)
PlotModel()

I get this error:

Loading pre-trained Volumes model...
---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
<ipython-input-45-15cd2a0fe631> in <module>()
     12 def VolLoss(yTrue, yPred): return mean_squared_error(yTrue, yPred * K.cast(yTrue > K.epsilon(), float))
     13 
---> 14 model, hist = GetModel('Volumes', {'VolAcc': VolAcc}, 98.12 / 100, 97.01 / 100, melsVal, volumesVal, 32, False, VolLoss)
     15 PlotModel()
     16 

6 frames
<ipython-input-24-7da32d8b1a0d> in GetModel(name, accs, accTrain, accVal, xVal, yVal, evalBatchSize, withLstm, loss)
     13                       )[0]
     14         print('Loading pre-trained {} model...'.format(name))
---> 15         model = load_model('{}/Training {} Model {:.2f} {:.2f}.hdf5'.format(modelFolder, name, accTrain * 100, accVal * 100), accs)
     16         print('Spent {} epochs, current validation loss and {} are:'.format(len(hist['loss']), list(accs.keys())[0]))
     17         #print(model.evaluate(xVal, yVal, evalBatchSize, 1))

/usr/local/lib/python3.7/dist-packages/keras/saving/save.py in load_model(filepath, custom_objects, compile, options)
    200             (isinstance(filepath, h5py.File) or h5py.is_hdf5(filepath))):
    201           return hdf5_format.load_model_from_hdf5(filepath, custom_objects,
--> 202                                                   compile)
    203 
    204         filepath = path_to_string(filepath)

/usr/local/lib/python3.7/dist-packages/keras/saving/hdf5_format.py in load_model_from_hdf5(filepath, custom_objects, compile)
    197       # Compile model.
    198       model.compile(**saving_utils.compile_args_from_training_config(
--> 199           training_config, custom_objects), from_serialized=True)
    200       saving_utils.try_build_compiled_arguments(model)
    201 

/usr/local/lib/python3.7/dist-packages/keras/saving/saving_utils.py in compile_args_from_training_config(training_config, custom_objects)
    210     loss_config = training_config.get('loss', None)
    211     if loss_config is not None:
--> 212       loss = _deserialize_nested_config(losses.deserialize, loss_config)
    213 
    214     # Recover metrics.

/usr/local/lib/python3.7/dist-packages/keras/saving/saving_utils.py in _deserialize_nested_config(deserialize_fn, config)
    251     return None
    252   if _is_single_object(config):
--> 253     return deserialize_fn(config)
    254   elif isinstance(config, dict):
    255     return {

/usr/local/lib/python3.7/dist-packages/keras/losses.py in deserialize(name, custom_objects)
   2022       module_objects=globals(),
   2023       custom_objects=custom_objects,
-> 2024       printable_module_name='loss function')
   2025 
   2026 

/usr/local/lib/python3.7/dist-packages/keras/utils/generic_utils.py in deserialize_keras_object(identifier, module_objects, custom_objects, printable_module_name)
    701             'https://www.tensorflow.org/guide/keras/save_and_serialize'
    702             '#registering_the_custom_object for details.'
--> 703             .format(printable_module_name, object_name))
    704 
    705     # Classes passed by name are instantiated with no args, functions are

ValueError: Unknown loss function: VolLoss. Please ensure this object is passed to the `custom_objects` argument. See https://www.tensorflow.org/guide/keras/save_and_serialize#registering_the_custom_object for details.

BShakhovsky commented 3 years ago

It seems that loading the model with the compile=False argument and then compiling it again solves the issue:

def GetModel(…):
    …
    model = load_model(…, compile=False)
    model.compile(…)
    …

I updated the template.
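An alternative that follows the hint in the error message is to pass the custom loss to load_model alongside the metrics. This is a sketch under assumptions, not the template's code: the helper name build_custom_objects is hypothetical, the VolAcc and VolLoss bodies below are placeholders, and it assumes Keras stored the loss under its function name, as the "Unknown loss function: VolLoss" message suggests.

```python
def build_custom_objects(accs, loss):
    """Merge the metric dictionary with the custom loss, keyed by function name."""
    custom_objects = dict(accs)           # copy so the caller's dictionary is untouched
    custom_objects[loss.__name__] = loss  # register the loss under the name in the error
    return custom_objects


# Stand-ins for the notebook's functions; the bodies are placeholders.
def VolAcc(yTrue, yPred):
    return 0.0


def VolLoss(yTrue, yPred):
    return 0.0


custom_objects = build_custom_objects({'VolAcc': VolAcc}, VolLoss)
print(sorted(custom_objects))  # ['VolAcc', 'VolLoss']

# Assumed usage inside GetModel (not executed here; model_path is a placeholder):
# model = load_model(model_path, custom_objects=custom_objects)
```

Compared with compile=False followed by model.compile(…), this keeps the compile settings saved in the file, at the cost of having to know every custom function the model was saved with.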

Glooring commented 3 years ago

Thank you!