Text summarization using seq2seq and encoder-decoder recurrent networks in Keras
The following neural network models are implemented and studied for text summarization:
The seq2seq model encodes the content of an article (the encoder input) together with one character of the summary so far (the decoder input) to predict the next character of the summary.
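This character-by-character scheme can be sketched in plain Python. The function and the tab/newline start/end markers below are illustrative assumptions, not the library's actual API:

```python
START, END = '\t', '\n'  # conventional start/end-of-sequence markers

def make_training_triples(article, headline):
    """Build (encoder_input, decoder_input, decoder_target) triples for
    teacher forcing: at each step the decoder sees one character of the
    headline and must predict the next one."""
    target = START + headline + END
    triples = []
    for i in range(len(target) - 1):
        triples.append((article, target[i], target[i + 1]))
    return triples

triples = make_training_triples('some article text', 'Hi')
for enc_in, dec_in, dec_target in triples:
    print(repr(dec_in), '->', repr(dec_target))
```

Each triple pairs the full article with a single decoder input character and its expected next character, which is exactly the supervision signal the description above implies.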
The implementation can be found in keras_text_summarization/library/seq2seq.py
There are three variants of the seq2seq model implemented for text summarization.
There are also three other encoder-decoder recurrent models, based on some of the recommendations here
The implementation can be found in keras_text_summarization/library/rnn.py
The trained models are available in the demo/models folder
The demo below shows how to use seq2seq for training and prediction; the other models described above follow the same process.
To train a deep learning model, say Seq2SeqSummarizer, run the following commands:
pip install -r requirements.txt
cd demo
python seq2seq_train.py
The training code in seq2seq_train.py is quite straightforward and is illustrated below:
from __future__ import print_function
import pandas as pd
from sklearn.model_selection import train_test_split
from keras_text_summarization.library.utility.plot_utils import plot_and_save_history
from keras_text_summarization.library.seq2seq import Seq2SeqSummarizer
from keras_text_summarization.library.applications.fake_news_loader import fit_text
import numpy as np
LOAD_EXISTING_WEIGHTS = True
np.random.seed(42)
data_dir_path = './data'
report_dir_path = './reports'
model_dir_path = './models'
print('loading csv file ...')
df = pd.read_csv(data_dir_path + "/fake_or_real_news.csv")
print('extract configuration from input texts ...')
Y = df.title
X = df['text']
config = fit_text(X, Y)
summarizer = Seq2SeqSummarizer(config)
if LOAD_EXISTING_WEIGHTS:
    summarizer.load_weights(weight_file_path=Seq2SeqSummarizer.get_weight_file_path(model_dir_path=model_dir_path))
Xtrain, Xtest, Ytrain, Ytest = train_test_split(X, Y, test_size=0.2, random_state=42)
history = summarizer.fit(Xtrain, Ytrain, Xtest, Ytest, epochs=100)
history_plot_file_path = report_dir_path + '/' + Seq2SeqSummarizer.model_name + '-history.png'
if LOAD_EXISTING_WEIGHTS:
    history_plot_file_path = report_dir_path + '/' + Seq2SeqSummarizer.model_name + '-history-v' + str(summarizer.version) + '.png'
plot_and_save_history(history, summarizer.model_name, history_plot_file_path, metrics={'loss', 'acc'})
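In the script above, fit_text scans the articles and titles to build the model configuration before training. Its actual contents live in keras_text_summarization/library/applications/fake_news_loader.py; the sketch below is an assumption about the kind of vocabulary and sequence-length statistics such a step typically collects (fit_text_sketch and its field names are illustrative, not the library's):

```python
from collections import Counter

def fit_text_sketch(X, Y, max_input_len=500, max_target_len=50):
    """Illustrative sketch: build token->index mappings and record
    maximum sequence lengths from articles X and titles Y."""
    input_counter = Counter()
    target_counter = Counter()
    for text in X:
        input_counter.update(text.lower().split()[:max_input_len])
    for title in Y:
        target_counter.update(title.lower().split()[:max_target_len])
    return {
        'input_word2idx': {w: i + 2 for i, (w, _) in enumerate(input_counter.most_common())},
        'target_word2idx': {w: i + 1 for i, (w, _) in enumerate(target_counter.most_common())},
        'max_input_seq_length': min(max_input_len, max(len(t.split()) for t in X)),
        'max_target_seq_length': min(max_target_len, max(len(t.split()) for t in Y) + 2),
    }

cfg = fit_text_sketch(['the cat sat on the mat'], ['cat sits'])
print(cfg['max_input_seq_length'])  # prints 6
```

Because this configuration fixes the vocabulary and sequence lengths, the same config object must be saved with the weights and reloaded at prediction time, which is why the prediction script below loads it from the models folder.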
After training is completed, the trained model files are saved in the demo/models folder.
The following code demonstrates how to use the trained deep learning model to summarize an article:
from __future__ import print_function
import pandas as pd
from keras_text_summarization.library.seq2seq import Seq2SeqSummarizer
import numpy as np
np.random.seed(42)
data_dir_path = './data' # refers to the demo/data folder
model_dir_path = './models' # refers to the demo/models folder
print('loading csv file ...')
df = pd.read_csv(data_dir_path + "/fake_or_real_news.csv")
X = df['text']
Y = df.title
config = np.load(Seq2SeqSummarizer.get_config_file_path(model_dir_path=model_dir_path), allow_pickle=True).item()  # allow_pickle is required on NumPy >= 1.16.5 to load the pickled config dict
summarizer = Seq2SeqSummarizer(config)
summarizer.load_weights(weight_file_path=Seq2SeqSummarizer.get_weight_file_path(model_dir_path=model_dir_path))
print('start predicting ...')
for i in range(20):
    x = X[i]
    actual_headline = Y[i]
    headline = summarizer.summarize(x)
    print('Article: ', x)
    print('Generated Headline: ', headline)
    print('Original Headline: ', actual_headline)
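The summarize call above generates the headline one step at a time. As a rough illustration of the greedy decoding loop such a method typically runs (greedy_decode and the toy next_char_probs step below are hypothetical stand-ins for the model's decoder, not the library's API):

```python
START, END = '\t', '\n'

def greedy_decode(next_char_probs, encoder_state, max_len=50):
    """Greedy character-level decoding: feed the last generated character
    back into the decoder until the end marker or length cap is reached.
    next_char_probs(state, ch) stands in for a real model step and must
    return (probability-dict-over-chars, new-decoder-state)."""
    decoded, ch, state = [], START, encoder_state
    for _ in range(max_len):
        probs, state = next_char_probs(state, ch)
        ch = max(probs, key=probs.get)  # pick the most likely next character
        if ch == END:
            break
        decoded.append(ch)
    return ''.join(decoded)

def toy_step(state, ch):
    # toy "model": always emits the next letter of a fixed word, then END
    word = 'ok' + END
    return {word[state]: 1.0}, state + 1

print(greedy_decode(toy_step, 0))  # prints 'ok'
```

A real decoder would replace toy_step with a forward pass through the trained network; beam search is a common alternative to the greedy argmax shown here.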