I edited some of the code for my own usage. However, every epoch now trains very slowly (about 2 minutes per epoch), whereas without my edits an epoch takes less than 1 second. I have a numpy array x_train of shape (209, 20000) and a numpy array y_train of shape (209, 2). Could it be that my numpy array is too big?
What I changed is that I use my own preprocessed data instead of data_helpers.py. I read strings of length 20000 from 2 text files and put them into a list of length 209. Then, in train.py, I load the data, convert the list into a numpy array, and reshape it:
x = np.array(x_text)
x = np.reshape(x, (209, 20000))
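Roughly, the preprocessing looks like this (a sketch only: the file names and the character-to-integer mapping below are placeholders, only the final shapes match my data):

import numpy as np

x_text = []
for fname in ["file_a.txt", "file_b.txt"]:      # placeholder file names
    with open(fname) as f:
        for line in f:
            s = line.strip()                    # one string of length 20000
            x_text.append([ord(c) for c in s])  # placeholder string-to-int mapping

x = np.array(x_text)             # shape (209, 20000) once every row has 20000 entries
x = np.reshape(x, (209, 20000))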
The function train_step is the one that takes too much time, specifically this line:
_, step, summaries, loss, accuracy = sess.run(
    [train_op, global_step, train_summary_op, cnn.loss, cnn.accuracy],
    feed_dict)
def train_step(x_batch, y_batch):
    """
    A single training step
    """
    feed_dict = {
        cnn.input_x: x_batch,
        cnn.input_y: y_batch,
        cnn.dropout_keep_prob: FLAGS.dropout_keep_prob
    }
    # The slow part is the line below
    _, step, summaries, loss, accuracy = sess.run(
        [train_op, global_step, train_summary_op, cnn.loss, cnn.accuracy],
        feed_dict)
    time_str = datetime.datetime.now().isoformat()
    print("{}: step {}, loss {:g}, acc {:g}".format(time_str, step, loss, accuracy))
    train_summary_writer.add_summary(summaries, step)
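# Hypothetical helper, not part of the original train.py: wrapping train_step in a
# wall-clock timer makes it easy to check that the sess.run call above (which
# dominates the body of train_step) is really where the ~2 minutes per epoch go.
import time

def timed_train_step(x_batch, y_batch):
    start = time.time()
    train_step(x_batch, y_batch)
    print("train_step took {:.3f}s for a batch of {} examples".format(
        time.time() - start, len(x_batch)))
# e.g. call timed_train_step(x_batch, y_batch) in place of train_step in the loop below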
# Generate batches
batches = data_helpers.batch_iter(
    list(zip(x_train, y_train)), FLAGS.batch_size, FLAGS.num_epochs)
# Training loop. For each batch...
for batch in batches:
    x_batch, y_batch = zip(*batch)
    train_step(x_batch, y_batch)
    current_step = tf.train.global_step(sess, global_step)
    if current_step % FLAGS.evaluate_every == 0:
        print("\nEvaluation:")
        dev_step(x_dev, y_dev, writer=dev_summary_writer)
        print("")
    if current_step % FLAGS.checkpoint_every == 0:
        path = saver.save(sess, checkpoint_prefix, global_step=current_step)
        print("Saved model checkpoint to {}\n".format(path))