sudhaa123 opened 6 years ago
I had this problem too once my training data grew past 100k examples, and I fixed it like this in train.py:
# Generate batches
batches_train = data_helpers.batch_iter(
    list(zip(x_train, y_train)), FLAGS.batch_size, FLAGS.num_epochs)
# Training loop. For each batch_train...
for batch_train in batches_train:
    x_batch_train, y_batch_train = zip(*batch_train)
    train_step(x_batch_train, y_batch_train)
    current_step = tf.train.global_step(sess, global_step)
    if current_step % FLAGS.evaluate_every == 0:
        print("\nEvaluation: ")
        # Evaluation loop. For each batch_dev...
        batches_dev = data_helpers.batch_iter(
            list(zip(x_dev, y_dev)), FLAGS.batch_size, 1)
        for batch_dev in batches_dev:
            x_batch_dev, y_batch_dev = zip(*batch_dev)
            dev_step(x_batch_dev, y_batch_dev, writer=dev_summary_writer)
        print("")
    if current_step % FLAGS.checkpoint_every == 0:
        path = saver.save(sess, checkpoint_prefix, global_step=current_step)
        print("Saved model checkpoint to {}\n".format(path))
So what I did was run the dev/test evaluation in batches too, instead of feeding the whole dev set in a single step.
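For anyone unfamiliar with `data_helpers.batch_iter` from this repo: it is a generator that walks the data in fixed-size slices for a given number of epochs, optionally reshuffling each epoch. A minimal stdlib-only sketch of that behavior (the repo's real version uses NumPy, but the shape of the interface is the same) could look like:

```python
import random

def batch_iter(data, batch_size, num_epochs, shuffle=True):
    """Yield the data in fixed-size batches for the given number of epochs."""
    data = list(data)
    num_batches = (len(data) - 1) // batch_size + 1
    for _ in range(num_epochs):
        epoch_data = data[:]
        if shuffle:
            random.shuffle(epoch_data)  # reshuffle each epoch
        for i in range(num_batches):
            # the final batch may be shorter than batch_size
            yield epoch_data[i * batch_size:(i + 1) * batch_size]
```

Because the generator only materializes `batch_size` examples at a time, evaluating the dev set through it keeps peak memory bounded regardless of how large the dev split is.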
Hi @dennybritz,
I want to split the data into training and testing in a 70:30 ratio.
But when I change the value in tf.flags.DEFINE_float("dev_sample_percentage", .1, "Percentage of the training data to use for validation") from .1 to 0.3, it hits an error.
It runs smoothly for about 50 steps, but then crashes with "python stopped working" and writes nothing to the checkpoints folder.
Could you please help me fix this issue?
Thanks.
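One wrinkle with the batched evaluation above: printing per-batch statistics from dev_step isn't the same as a single dev-set accuracy, so the per-batch results need to be combined. A hypothetical sketch (assuming a `step_fn` that, unlike the repo's `dev_step`, returns `(loss, accuracy)` for its batch) of weighting each batch by its size so a short final batch doesn't skew the mean:

```python
def evaluate_in_batches(batches, step_fn):
    """Combine per-batch (loss, accuracy) results into dataset-level metrics.

    Each batch is weighted by its size, because the last batch is usually
    smaller than the rest and a plain mean would over-weight it.
    """
    total_loss = total_correct = total_examples = 0.0
    for x_batch, y_batch in batches:
        loss, accuracy = step_fn(x_batch, y_batch)  # hypothetical return value
        n = len(x_batch)
        total_loss += loss * n
        total_correct += accuracy * n
        total_examples += n
    return total_loss / total_examples, total_correct / total_examples
```

With this in place, the dev split can be as large as 30% of the data without the evaluation step ever holding more than one batch in memory.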