Open etyhh opened 3 years ago
I also met the same problem. Have you found a solution?
I traced it to this line: loss = rnnt_loss(y_pred, y_true, spec_lengths, label_lengths). There, y_pred is a 'tensorflow.python.framework.ops.Tensor'. When I changed the RNN to a DNN, y_pred became a 'tensorflow.python.framework.ops.EagerTensor' and the segmentation fault disappeared. I'm now trying to keep the RNN and still get an EagerTensor.
@etyhh What version of TensorFlow are you using?
tensorflow-gpu==2.2.0
When I put print(tf.executing_eagerly()) right before loss = rnnt_loss(y_pred, y_true, spec_lengths, label_lengths), it printed False. In other words, eager mode is no longer active by the time the loss function is called.
I tried adding tf.config.experimental_run_functions_eagerly(True) at the beginning of run_rnnt.py and loss.py. With that change, print(tf.executing_eagerly()) placed before loss = rnnt_loss() returns True, but print(type(y_pred)) still returns 'tensorflow.python.framework.ops.Tensor'.
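For anyone trying to reproduce the eager vs. graph difference described above, here is a minimal sketch. It is not taken from run_rnnt.py: probe, traced_probe, and observations are hypothetical names made up for this illustration, and rnnt_loss is not called at all. It only shows that the same Python function sees an EagerTensor when called directly, and a symbolic graph tensor (with tf.executing_eagerly() returning False) while tf.function is tracing it.

```python
import tensorflow as tf

# Hypothetical helper names; nothing here comes from run_rnnt.py or loss.py.
observations = {}

def probe(tag, x):
    # Plain Python side effect: under tf.function this runs once, at trace time.
    observations[tag] = (tf.executing_eagerly(), type(x).__name__)
    return x * 2.0

@tf.function
def traced_probe(x):
    # While tracing, x is a symbolic graph tensor, not an EagerTensor.
    return probe("graph", x)

x = tf.constant([1.0, 2.0])
probe("eager", x)   # called directly, outside any tf.function
traced_probe(x)     # called through tf.function (graph tracing)

for tag, (is_eager, type_name) in observations.items():
    print(tag, "executing_eagerly =", is_eager, "type =", type_name)
```

The class name of the graph-mode tensor depends on the TensorFlow version, so the sketch only prints it rather than assuming a particular name. Whether the same mechanism explains the Tensor type seen under the distribution strategy in this issue is an assumption, not something the sketch proves.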
I hit the same error, and I assume it comes from rnnt_loss. I have tried several approaches, but none of them worked. Has anyone fixed it?
Changing tf.compat.v1.nn.rnn_cell.LSTMCell to tf.keras.layers.LSTMCell works for me. However, tf.keras.layers.LSTMCell doesn't support projection.
Hi, the error log is below:
Starting training.
Performing evaluation.
loss
Tensor("transducer/dense_1/BiasAdd:0", shape=(None, None, None, 3971), dtype=float32, device=/job:localhost/replica:0/task:0/device:GPU:0)
Tensor("dist_inputs_4:0", shape=(None, None), dtype=int32)
Tensor("Cast:0", shape=(None,), dtype=int32, device=/job:localhost/replica:0/task:0/device:GPU:0)
Tensor("dist_inputs_3:0", shape=(None,), dtype=int32)
Fatal Python error: Segmentation fault
Thread 0x00007f6989132700 (most recent call first):
File "/usr/lib64/python3.6/threading.py", line 295 in wait
File "/usr/lib64/python3.6/threading.py", line 551 in wait
File "/home/zhangqin/.local/lib/python3.6/site-packages/tensorflow/python/distribute/mirrored_strategy.py", line 978 in run
File "/usr/lib64/python3.6/threading.py", line 916 in _bootstrap_inner
File "/usr/lib64/python3.6/threading.py", line 884 in _bootstrap
Current thread 0x00007f6989933700 (most recent call first):
File "/home/zhangqin/.local/lib/python3.6/site-packages/tensorflow/python/framework/ops.py", line 1654 in _create_c_op
File "/home/zhangqin/.local/lib/python3.6/site-packages/tensorflow/python/framework/ops.py", line 1817 in init
File "/home/zhangqin/.local/lib/python3.6/site-packages/tensorflow/python/framework/ops.py", line 3327 in _create_op_internal
File "/home/zhangqin/.local/lib/python3.6/site-packages/tensorflow/python/framework/func_graph.py", line 595 in _create_op_internal
File "/home/zhangqin/.local/lib/python3.6/site-packages/tensorflow/python/framework/op_def_library.py", line 744 in _apply_op_helper
File "", line 81 in warp_rnnt
File "/home/zhangqin/.local/lib/python3.6/site-packages/tensorflow/python/autograph/impl/api.py", line 348 in _call_unconverted
File "/home/zhangqin/.local/lib/python3.6/site-packages/tensorflow/python/autograph/impl/api.py", line 534 in converted_call
File "/tmp/tmp7wdbpl1g.py", line 11 in tf__rnnt_loss
File "/home/zhangqin/.local/lib/python3.6/site-packages/tensorflow/python/autograph/impl/api.py", line 587 in converted_call
File "/tmp/tmpbbrruc7p.py", line 30 in tf___loss_fn
File "/home/zhangqin/.local/lib/python3.6/site-packages/tensorflow/python/autograph/impl/api.py", line 585 in converted_call
File "/tmp/tmp5y46mg16.py", line 25 in step_fn
File "/home/zhangqin/.local/lib/python3.6/site-packages/tensorflow/python/distribute/mirrored_strategy.py", line 998 in run
File "/usr/lib64/python3.6/threading.py", line 916 in _bootstrap_inner
File "/usr/lib64/python3.6/threading.py", line 884 in _bootstrap
Thread 0x00007f6d24153740 (most recent call first):
File "/usr/lib64/python3.6/threading.py", line 295 in wait
File "/usr/lib64/python3.6/threading.py", line 551 in wait
File "/home/zhangqin/.local/lib/python3.6/site-packages/tensorflow/python/distribute/mirrored_strategy.py", line 165 in _call_for_each_replica
File "/home/zhangqin/.local/lib/python3.6/site-packages/tensorflow/python/distribute/mirrored_strategy.py", line 770 in _call_for_each_replica
File "/home/zhangqin/.local/lib/python3.6/site-packages/tensorflow/python/distribute/distribute_lib.py", line 2290 in call_for_each_replica
File "/home/zhangqin/.local/lib/python3.6/site-packages/tensorflow/python/distribute/distribute_lib.py", line 951 in run
File "/home/zhangqin/.local/lib/python3.6/site-packages/tensorflow/python/autograph/impl/api.py", line 346 in _call_unconverted
File "/home/zhangqin/.local/lib/python3.6/site-packages/tensorflow/python/autograph/impl/api.py", line 492 in converted_call
File "/tmp/tmp5y46mg16.py", line 66 in tf__eval_step
File "/home/zhangqin/.local/lib/python3.6/site-packages/tensorflow/python/autograph/impl/api.py", line 585 in converted_call
File "/home/zhangqin/.local/lib/python3.6/site-packages/tensorflow/python/framework/func_graph.py", line 964 in wrapper
File "/home/zhangqin/.local/lib/python3.6/site-packages/tensorflow/python/eager/def_function.py", line 441 in wrapped_fn
File "/home/zhangqin/.local/lib/python3.6/site-packages/tensorflow/python/framework/func_graph.py", line 981 in func_graph_from_py_func
File "/home/zhangqin/.local/lib/python3.6/site-packages/tensorflow/python/eager/function.py", line 2667 in _create_graph_function
File "/home/zhangqin/.local/lib/python3.6/site-packages/tensorflow/python/eager/function.py", line 2777 in _maybe_define_function
File "/home/zhangqin/.local/lib/python3.6/site-packages/tensorflow/python/eager/function.py", line 2446 in _get_concrete_function_internal_garbage_collected
File "/home/zhangqin/.local/lib/python3.6/site-packages/tensorflow/python/eager/def_function.py", line 506 in _initialize
File "/home/zhangqin/.local/lib/python3.6/site-packages/tensorflow/python/eager/def_function.py", line 627 in _call
File "/home/zhangqin/.local/lib/python3.6/site-packages/tensorflow/python/eager/def_function.py", line 580 in call
File "run_rnnt.py", line 434 in run_evaluate
File "run_rnnt.py", line 312 in checkpoint_model
File "run_rnnt.py", line 347 in run_training
File "run_rnnt.py", line 547 in main
File "/home/zhangqin/.local/lib/python3.6/site-packages/absl/app.py", line 251 in _run_main
File "/home/zhangqin/.local/lib/python3.6/site-packages/absl/app.py", line 300 in run
File "run_rnnt.py", line 588 in
Segmentation fault (core dumped)
The code that caused the segmentation fault:
print(y_pred, y_true, spec_lengths, label_lengths)
loss = rnnt_loss(y_pred, y_true, spec_lengths, label_lengths)
print('l f')
Thanks