enriqueav / lstm_lyrics

LSTM text generation by word. Used to generate lyrics from a corpus of a music genre.
https://medium.com/@enriqueav/word-level-lstm-text-generator-creating-automatic-song-lyrics-with-neural-networks-b8a1617104fb
MIT License
80 stars 27 forks

lstm_train_embedded stops after X epochs #11

Open slyons opened 5 years ago

slyons commented 5 years ago

Following the examples in the readme but using my own input set results in the script just ending after 24 completed epochs. No output or errors.

Epoch 24/100
123/123 [==============================] - 4s 35ms/step - loss: 0.0356 - acc: 0.9992 - val_loss: 7.1317 - val_acc: 0.1250
root@tf2-6d4dd96d7b-klz2t:/notebooks/

The first time I ran the script I had encoding issues; now it's just ending.

slyons commented 5 years ago

Just tried again and it ended after 6 epochs.

slyons commented 5 years ago

Running with python3 -i results in:

Epoch 13/100
123/123 [==============================] - 19s 153ms/step - loss: 3.2521 - acc: 0.2736 - val_loss: 5.9371 - val_acc: 0.0312
Epoch 14/100
123/123 [==============================] - 20s 163ms/step - loss: 3.0039 - acc: 0.3107 - val_loss: 6.0639 - val_acc: 0.0521
>>> dir()
['Activation', 'BATCH_SIZE', 'Bidirectional', 'Dense', 'Dropout', 'EarlyStopping', 'LSTM', 'LambdaCallback', 'MIN_WORD_FREQUENCY', 'ModelCheckpoint', 'SEQUENCE_LEN', 'STEP', 'Sequential', '__builtins__', '__cached__', '__doc__', '__loader__', '__name__', '__package__', '__spec__', 'callbacks_list', 'checkpoint', 'corpus', 'early_stopping', 'examples', 'examples_file', 'f', 'file_path', 'generator', 'get_model', 'i', 'ignored', 'ignored_words', 'indices_word', 'io', 'k', 'model', 'next_words', 'next_words_test', 'np', 'on_epoch_end', 'os', 'print_callback', 'print_function', 'sample', 'sentences', 'sentences_test', 'shuffle_and_split_training_set', 'sys', 'text', 'text_in_words', 'v', 'word', 'word_freq', 'word_indices', 'words']

enriqueav commented 5 years ago

This is possibly caused by the EarlyStopping callback (https://keras.io/callbacks/#earlystopping). Try running the script with this line commented out:

early_stopping = EarlyStopping(monitor='val_acc', patience=20)

And remove early_stopping from the callbacks list, which currently reads:

callbacks_list = [checkpoint, print_callback, early_stopping]
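
To see why a patience-based callback can end a run without any error message, here is a minimal pure-Python sketch of the idea. It is a simplified model and not Keras's actual implementation; the function name first_stop_epoch and the sample history are hypothetical.

```python
def first_stop_epoch(val_acc_history, patience):
    """Return the 1-based epoch at which a patience rule would stop, or None.

    Simplified model: training stops once the monitored value has failed to
    improve on its best value for `patience` consecutive epochs.
    """
    best = float("-inf")
    wait = 0  # consecutive epochs without improvement
    for epoch, val_acc in enumerate(val_acc_history, start=1):
        if val_acc > best:
            best = val_acc
            wait = 0
        else:
            wait += 1
            if wait >= patience:
                return epoch
    return None


# Hypothetical history: val_acc peaks at epoch 2 and never improves again.
history = [0.10, 0.125] + [0.05] * 25
print(first_stop_epoch(history, patience=20))
```

Under this model a run with patience=20 cannot stop before epoch 21, so a run that ends after only a handful of epochs likely has some other cause.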

slyons commented 5 years ago

That seems to have let things progress further, but I'm still running into the same Unicode errors that I have with other RNN type examples. Every source in my project is produced with .decode("utf-8") and opened in plain w mode, no binary.

    examples_file.write('----- Generating with seed:\n"' + ' '.join(sentence) + '"\n')
UnicodeEncodeError: 'ascii' codec can't encode character '\u2019' in position 68: ordinal not in range(128)

enriqueav commented 5 years ago

Try creating examples_file with codecs. Add

import codecs

Then change

examples_file = open(examples, "w")

to

examples_file = codecs.open(examples, 'w', encoding='utf8')

And leave the rest as is. This may fix your problem.
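
On Python 3, the built-in open also accepts an encoding argument, so the same effect can be had without codecs. The sketch below assumes the error comes from the file being opened with an ASCII default encoding, which is common in minimal containers. The helper name write_seed and the sample sentence are hypothetical.

```python
import os
import tempfile


def write_seed(path, sentence):
    """Write a seed line like the one in the traceback, forcing UTF-8."""
    # An explicit encoding avoids depending on the locale's default,
    # which may be ASCII when no locale is configured.
    with open(path, "w", encoding="utf-8") as examples_file:
        examples_file.write(
            '----- Generating with seed:\n"' + " ".join(sentence) + '"\n'
        )


sentence = ["don\u2019t", "stop"]  # contains U+2019, as in the error above

# ASCII cannot represent U+2019, which is what the traceback reports.
try:
    " ".join(sentence).encode("ascii")
except UnicodeEncodeError as err:
    print("ASCII encode failed:", err)

# Writing with an explicit UTF-8 encoding succeeds.
with tempfile.TemporaryDirectory() as tmp_dir:
    out_path = os.path.join(tmp_dir, "examples.txt")
    write_seed(out_path, sentence)
    with open(out_path, encoding="utf-8") as f:
        print(f.read())
```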