luheng / lsgn

Labeled Span Graph Networks
Apache License 2.0

Training always hangs at session.run() #7

Open scofield7419 opened 5 years ago

scofield7419 commented 5 years ago

I have tried training the model with singleton.py hundreds of times, and the process always hangs at: tf_loss, tf_global_step, _ = session.run([model.loss, model.global_step, model.train_op])

As the screenshot shows, the code runs without errors up to that point but then hangs at that call.

After some debugging, I believe the problem lies in: enqueue_thread = threading.Thread(target=_enqueue_loop), since execution never enters _enqueue_loop().

In other words, the FIFOQueue and its feeding thread fail to get scheduled.
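
For anyone trying to reproduce this kind of hang, here is a minimal sketch of the same producer/consumer pattern using only the Python standard library, not the TensorFlow queue from this repo. It assumes the symptom is a consumer blocking on an empty queue because the feeder thread never runs. The names make_feeder, dequeue_or_report and CAPACITY are hypothetical, chosen only for illustration.

```python
import queue
import threading

CAPACITY = 2  # illustrative value, mirroring a small bounded queue


def make_feeder(q, items):
    """Return a daemon thread that pushes items into q (hypothetical helper)."""
    def _enqueue_loop():
        for item in items:
            q.put(item)  # blocks while the queue already holds CAPACITY items
    return threading.Thread(target=_enqueue_loop, daemon=True)


def dequeue_or_report(q, timeout_s=0.5):
    """Take one item, or return None instead of hanging if nothing arrives."""
    try:
        return q.get(timeout=timeout_s)
    except queue.Empty:
        return None


# Case 1: the feeder thread is created but never started, so without the
# timeout the consumer would wait forever on an empty queue.
stalled_q = queue.Queue(maxsize=CAPACITY)
stalled_feeder = make_feeder(stalled_q, range(5))
stalled_result = dequeue_or_report(stalled_q, timeout_s=0.2)

# Case 2: the feeder is started, so items arrive in order.
live_q = queue.Queue(maxsize=CAPACITY)
live_feeder = make_feeder(live_q, range(5))
live_feeder.start()
live_results = [dequeue_or_report(live_q) for _ in range(5)]
live_feeder.join(timeout=1.0)

print("stalled:", stalled_result, "feeder alive:", stalled_feeder.is_alive())
print("live:", live_results)
```

The point of the timeout is to turn a silent hang into something observable. In TensorFlow 1.x, passing options=tf.RunOptions(timeout_in_ms=...) to session.run should play a similar role, though I have not checked that against this codebase.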

Any help with this would be appreciated.

scofield7419 commented 5 years ago

Solved! I changed the queue to: queue = tf.PaddingFIFOQueue(capacity=2, dtypes=dtypes, shapes=new_shapes). A very large capacity causes the bug.