NVIDIA / OpenSeq2Seq

Toolkit for efficient experimentation with Speech Recognition, Text2Speech and NLP
https://nvidia.github.io/OpenSeq2Seq
Apache License 2.0

Unable to Fire Openseq2seq Jasper training #496

Closed · pratapaprasanna closed this 4 years ago

pratapaprasanna commented 4 years ago

Hi all,

I have been trying to start a Jasper training run with OpenSeq2Seq, but the training does not start.

$ uname -a
Linux shaktimaan 4.18.0-24-generic #25~18.04.1-Ubuntu SMP Thu Jun 20 11:13:08 UTC 2019 x86_64 x86_64 x86_64 GNU/Linux

I have installed the drivers freshly, yet the training does not start at all.

It has been stuck here for almost 12 hours:

Colocation members, user-requested devices, and framework assigned devices, if any:
  ForwardPass/fully_connected_ctc_decoder/fully_connected/bias/Initializer/zeros (Const)
  ForwardPass/fully_connected_ctc_decoder/fully_connected/bias (VariableV2) /device:GPU:0
  ForwardPass/fully_connected_ctc_decoder/fully_connected/bias/Assign (Assign) /device:GPU:0
  ForwardPass/fully_connected_ctc_decoder/fully_connected/bias/read (Identity) /device:GPU:0
  Loss_Optimization/gradients/AddN (AddN) /device:GPU:0
  Loss_Optimization/FP32-master-copy/IsVariableInitialized_109 (IsVariableInitialized) /device:GPU:0
  Loss_Optimization/FP32-master-copy/cond_109/read/Switch (RefSwitch) /device:GPU:0
  Loss_Optimization/FP32-master-copy/cond_109/Switch_1 (Switch)
  Loss_Optimization/FP32-master-copy/ForwardPass/fully_connected_ctc_decoder/fully_connected/bias/IsVariableInitialized (IsVariableInitialized) /device:GPU:0
  Loss_Optimization/FP32-master-copy/ForwardPass/fully_connected_ctc_decoder/fully_connected/bias/cond/read/Switch (RefSwitch) /device:GPU:0
  Loss_Optimization/FP32-master-copy/ForwardPass/fully_connected_ctc_decoder/fully_connected/bias/cond/Switch_1 (Switch)
  Loss_Optimization/FP32-master-copy/cond_109/read/Switch_Loss_Optimization/FP32-master-copy/ForwardPass/fully_connected_ctc_decoder/fully_connected/bias (Switch)
  Loss_Optimization/cond_1/Assign_109/Switch (RefSwitch) /device:GPU:0
  Loss_Optimization/cond_1/Assign_109 (Assign) /device:GPU:0
  save/Assign (Assign) /device:GPU:0
  save_1/Assign (Assign) /device:GPU:0
  report_uninitialized_variables/IsVariableInitialized_541 (IsVariableInitialized) /device:GPU:0
  report_uninitialized_variables_1/IsVariableInitialized_541 (IsVariableInitialized) /device:GPU:0
  save_2/Assign_550 (Assign) /device:GPU:0

2019-09-05 20:24:08.016954: W tensorflow/compiler/jit/mark_for_compilation_pass.cc:1412] (One-time warning): Not using XLA:CPU for cluster because envvar TF_XLA_FLAGS=--tf_xla_cpu_global_jit was not set.  If you want XLA:CPU, either set that envvar, or use experimental_jit_scope to enable XLA:CPU.  To confirm that XLA is active, pass --vmodule=xla_compilation_cache=1 (as a proper command-line flag, not via TF_XLA_FLAGS) or set the envvar XLA_FLAGS=--xla_hlo_profile.
WARNING:tensorflow:From /home/vz/miniconda3/envs/gp_0_1/lib/python3.6/site-packages/tensorflow/python/training/saver.py:1066: get_checkpoint_mtimes (from tensorflow.python.training.checkpoint_management) is deprecated and will be removed in a future version.
Instructions for updating:
Use standard file utilities to get mtimes.

*** Running evaluation on a validation set:

Can anyone help me understand the issue?

When I run other trainings I can see that the GPU is being utilized, but I do not know why it is not working here with TensorFlow or OpenSeq2Seq.
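
As a sanity check (standard TensorFlow 1.x APIs, nothing OpenSeq2Seq-specific), the following confirms whether TensorFlow itself can see the GPUs at all:

$ python -c "from tensorflow.python.client import device_lib; print(device_lib.list_local_devices())"
$ python -c "import tensorflow as tf; print(tf.test.is_gpu_available())"

If /device:GPU:0 does not appear in the first listing, the problem sits below TensorFlow (driver or CUDA level) rather than in OpenSeq2Seq.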

I followed all the steps in the installation instructions.

Thank you.

Environment

$ pip freeze | grep tensorflow
tensorflow-estimator==1.14.0
tensorflow-gpu==1.14.0
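
For reference, tensorflow-gpu 1.14 is built against CUDA 10.0, so a quick cross-check of the driver and toolkit versions can rule out a mismatch (standard commands; the version.txt path assumes a default CUDA install location):

$ nvidia-smi
$ cat /usr/local/cuda/version.txt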
pratapaprasanna commented 4 years ago

Hi all,

It seems the issue was with my NVLink.

If your training is taking too long to start, please check that the NVLink links are up and not down.

The status can be queried with the following command:

$ nvidia-smi nvlink --status
GPU 0: GeForce RTX 2080 Ti (UUID: GPU-797d7153-ea28-d678-dc38-859b914d6dd7)
     Link 0: 25.781 GB/s
     Link 1: 25.781 GB/s
GPU 1: GeForce RTX 2080 Ti (UUID: GPU-8807c553-7571-582d-c2ee-02993527b0a6)
     Link 0: 25.781 GB/s
     Link 1: 25.781 GB/s
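
A link that is down does not report a bandwidth figure like the ones above. One possible workaround while the hardware issue is sorted out (assuming the multi-GPU run communicates through NCCL, as Horovod-based setups do) is to disable peer-to-peer transfers so traffic falls back to host memory; NCCL_P2P_DISABLE is a standard NCCL environment variable, and run.py with --config_file and --mode is the usual OpenSeq2Seq entry point:

$ NCCL_P2P_DISABLE=1 python run.py --config_file=<your_config> --mode=train_eval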

Thanks