noahchalifour / rnnt-speech-recognition

End-to-end speech recognition using RNN Transducers in Tensorflow 2.0
MIT License

NaN loss #26

Closed: dyc3 closed this issue 4 years ago

dyc3 commented 4 years ago

I was training with this command:

python run_common_voice.py --mode train --data_dir data_p --batch_size 4 --steps_per_checkpoint 100 --eval_size 100

The loss became NaN in epoch 0, at batch 157:

Epoch: 0, Batch: 151, Global Step: 151, Step Time: 1.8173, Loss: -737867.4375
WARNING: Forward backward likelihood mismatch 0.500000
WARNING: Forward backward likelihood mismatch 1.500000
Epoch: 0, Batch: 152, Global Step: 152, Step Time: 1.6645, Loss: -765632.8125
WARNING: Forward backward likelihood mismatch 1.000000
WARNING: Forward backward likelihood mismatch 1.000000
Epoch: 0, Batch: 153, Global Step: 153, Step Time: 2.0563, Loss: -803095.3750
Epoch: 0, Batch: 154, Global Step: 154, Step Time: 1.8974, Loss: -299315232768.0000
Epoch: 0, Batch: 155, Global Step: 155, Step Time: 1.8908, Loss: -763515174912.0000
Epoch: 0, Batch: 156, Global Step: 156, Step Time: 1.6551, Loss: -1258791763968.0000
Epoch: 0, Batch: 157, Global Step: 157, Step Time: 1.9214, Loss: nan
Epoch: 0, Batch: 158, Global Step: 158, Step Time: 1.0894, Loss: nan
Epoch: 0, Batch: 159, Global Step: 159, Step Time: 1.7952, Loss: nan
Epoch: 0, Batch: 160, Global Step: 160, Step Time: 2.0707, Loss: nan
Epoch: 0, Batch: 161, Global Step: 161, Step Time: 2.0600, Loss: nan
Epoch: 0, Batch: 162, Global Step: 162, Step Time: 1.7275, Loss: nan

I'm not really sure what's going on, but could it be related to these warnings?

WARNING: Forward backward likelihood mismatch 0.187500
WARNING: Forward backward likelihood mismatch 0.187500
WARNING: Forward backward likelihood mismatch 0.125000
WARNING: Forward backward likelihood mismatch 0.375000
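
A quick way to locate where a run like this goes wrong is to scan the step lines for the first negative and the first non-finite loss. This is only a minimal sketch: the regular expression is inferred from the log lines quoted above, and `parse_steps` and `first_bad_steps` are hypothetical helper names, not part of the repository.

```python
import math
import re

# Pattern inferred from the step lines in the quoted training log (an assumption).
STEP_RE = re.compile(
    r"Epoch: (\d+), Batch: (\d+), Global Step: (\d+), "
    r"Step Time: ([\d.]+), Loss: (\S+)"
)


def parse_steps(log_text):
    """Yield (batch, loss) for every training-step line; other lines are skipped."""
    for line in log_text.splitlines():
        match = STEP_RE.search(line)
        if match:
            yield int(match.group(2)), float(match.group(5))


def first_bad_steps(log_text):
    """Return (first batch with a negative loss, first batch with a NaN/inf loss)."""
    first_negative = None
    first_nonfinite = None
    for batch, loss in parse_steps(log_text):
        if first_nonfinite is None and not math.isfinite(loss):
            first_nonfinite = batch
        if first_negative is None and math.isfinite(loss) and loss < 0:
            first_negative = batch
    return first_negative, first_nonfinite


# Lines copied from the log above.
SAMPLE = """\
Epoch: 0, Batch: 153, Global Step: 153, Step Time: 2.0563, Loss: -803095.3750
Epoch: 0, Batch: 154, Global Step: 154, Step Time: 1.8974, Loss: -299315232768.0000
Epoch: 0, Batch: 155, Global Step: 155, Step Time: 1.8908, Loss: -763515174912.0000
Epoch: 0, Batch: 156, Global Step: 156, Step Time: 1.6551, Loss: -1258791763968.0000
Epoch: 0, Batch: 157, Global Step: 157, Step Time: 1.9214, Loss: nan
"""

print(first_bad_steps(SAMPLE))
```

Run against a full log, the first element shows how early the loss was already negative, which is a separate symptom from the later NaN.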

noahchalifour commented 4 years ago

@dyc3 This is most likely caused by the warp-transducer module being built without GPU support while training runs on the GPU. When you compile the warp-transducer code, please make sure the build log shows a flag like GPU_ENABLED (I don't remember exactly what the flag looks like).

dyc3 commented 4 years ago

I tried rebuilding warp-transducer and training again, but I'm still getting the same problem. Is the loss supposed to be negative like that?
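
On the sign question: in the standard transducer formulation the loss is the negative log-likelihood of the label sequence, and a log-likelihood cannot exceed 0, so the loss should not be negative. The sketch below is a pure-Python reference for a single utterance, not the warp-transducer implementation. It assumes blank index 0 and random toy logits, and `rnnt_log_likelihoods` is a hypothetical name. It also assumes, going only by the wording of the warning, that "Forward backward likelihood mismatch" refers to the two quantities computed here disagreeing.

```python
import math
import random

NEG_INF = float("-inf")


def logsumexp(a, b):
    """Numerically stable log(exp(a) + exp(b))."""
    if a == NEG_INF:
        return b
    if b == NEG_INF:
        return a
    m = max(a, b)
    return m + math.log(math.exp(a - m) + math.exp(b - m))


def log_softmax(logits):
    """Turn raw scores into log-probabilities that sum to 1 in probability space."""
    m = max(logits)
    lse = m + math.log(sum(math.exp(x - m) for x in logits))
    return [x - lse for x in logits]


def rnnt_log_likelihoods(log_probs, labels, blank=0):
    """log_probs[t][u][k] = log P(k | frame t, u labels emitted so far).

    Returns the log-likelihood of `labels` computed by the forward and by the
    backward recursion; on a healthy lattice the two values agree.
    """
    T, U = len(log_probs), len(labels)

    alpha = [[NEG_INF] * (U + 1) for _ in range(T)]
    alpha[0][0] = 0.0
    for t in range(T):
        for u in range(U + 1):
            if t > 0:  # arrived by emitting blank at (t-1, u)
                alpha[t][u] = logsumexp(
                    alpha[t][u], alpha[t - 1][u] + log_probs[t - 1][u][blank])
            if u > 0:  # arrived by emitting label u at (t, u-1)
                alpha[t][u] = logsumexp(
                    alpha[t][u], alpha[t][u - 1] + log_probs[t][u - 1][labels[u - 1]])
    forward = alpha[T - 1][U] + log_probs[T - 1][U][blank]

    beta = [[NEG_INF] * (U + 1) for _ in range(T)]
    beta[T - 1][U] = log_probs[T - 1][U][blank]
    for t in reversed(range(T)):
        for u in reversed(range(U + 1)):
            if t < T - 1:  # leave by emitting blank
                beta[t][u] = logsumexp(
                    beta[t][u], beta[t + 1][u] + log_probs[t][u][blank])
            if u < U:  # leave by emitting the next label
                beta[t][u] = logsumexp(
                    beta[t][u], beta[t][u + 1] + log_probs[t][u][labels[u]])
    backward = beta[0][0]
    return forward, backward


# Toy example: 4 frames, vocabulary of 3 symbols (0 is blank), 2 target labels.
random.seed(0)
labels = [1, 2]
log_probs = [
    [log_softmax([random.uniform(-1, 1) for _ in range(3)])
     for _ in range(len(labels) + 1)]
    for _ in range(4)
]
forward, backward = rnnt_log_likelihoods(log_probs, labels)
loss = -forward
print("loss:", loss, "forward/backward gap:", abs(forward - backward))
```

With properly normalized log-probabilities the loss here is always non-negative and the forward/backward gap stays at floating-point noise, so a large negative loss together with mismatch warnings suggests the loss kernel is reading invalid inputs rather than the model diverging on its own.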

Output of CUDA_HOME=/usr/local/cuda ./scripts/build_rnnt.sh:

-- cuda found TRUE
-- Building shared library with GPU support
-- Configuring done
-- Generating done
-- Build files have been written to: /home/carson/Documents/code/other/rnnt-speech-recognition/warp-transducer/build
[ 14%] Built target warprnnt
[ 35%] Built target test_time_gpu
[ 57%] Built target test_gpu
[ 78%] Built target test_cpu
[100%] Built target test_time
2020-05-14 17:18:14.815321: W tensorflow/stream_executor/platform/default/dso_loader.cc:55] Could not load dynamic library 'libnvinfer.so.6'; dlerror: libnvinfer.so.6: cannot open shared object file: No such file or directory
2020-05-14 17:18:14.815401: W tensorflow/stream_executor/platform/default/dso_loader.cc:55] Could not load dynamic library 'libnvinfer_plugin.so.6'; dlerror: libnvinfer_plugin.so.6: cannot open shared object file: No such file or directory
2020-05-14 17:18:14.815412: W tensorflow/compiler/tf2tensorrt/utils/py_utils.cc:30] Cannot dlopen some TensorRT libraries. If you would like to use Nvidia GPU with TensorRT, please make sure the missing libraries mentioned above are installed properly.
setup.py:63: UserWarning: Assuming tensorflow was compiled without C++11 ABI. It is generally true if you are using binary pip package. If you compiled tensorflow from source with gcc >= 5 and didn't set -D_GLIBCXX_USE_CXX11_ABI=0 during compilation, you need to set environment variable TF_CXX11_ABI=1 when compiling this bindings. Also be sure to touch some files in src to trigger recompilation. Also, you need to set (or unsed) this environment variable if getting undefined symbol: _ZN10tensorflow... errors
  warnings.warn("Assuming tensorflow was compiled without C++11 ABI. "
running install
running bdist_egg
running egg_info
writing warprnnt_tensorflow.egg-info/PKG-INFO
writing dependency_links to warprnnt_tensorflow.egg-info/dependency_links.txt
writing top-level names to warprnnt_tensorflow.egg-info/top_level.txt
reading manifest file 'warprnnt_tensorflow.egg-info/SOURCES.txt'
writing manifest file 'warprnnt_tensorflow.egg-info/SOURCES.txt'
installing library code to build/bdist.linux-x86_64/egg
running install_lib
running build_py
running build_ext
creating build/bdist.linux-x86_64/egg
creating build/bdist.linux-x86_64/egg/warprnnt_tensorflow
copying build/lib.linux-x86_64-3.6/warprnnt_tensorflow/__init__.py -> build/bdist.linux-x86_64/egg/warprnnt_tensorflow
copying build/lib.linux-x86_64-3.6/warprnnt_tensorflow/kernels.cpython-36m-x86_64-linux-gnu.so -> build/bdist.linux-x86_64/egg/warprnnt_tensorflow
byte-compiling build/bdist.linux-x86_64/egg/warprnnt_tensorflow/__init__.py to __init__.cpython-36.pyc
creating stub loader for warprnnt_tensorflow/kernels.cpython-36m-x86_64-linux-gnu.so
byte-compiling build/bdist.linux-x86_64/egg/warprnnt_tensorflow/kernels.py to kernels.cpython-36.pyc
creating build/bdist.linux-x86_64/egg/EGG-INFO
copying warprnnt_tensorflow.egg-info/PKG-INFO -> build/bdist.linux-x86_64/egg/EGG-INFO
copying warprnnt_tensorflow.egg-info/SOURCES.txt -> build/bdist.linux-x86_64/egg/EGG-INFO
copying warprnnt_tensorflow.egg-info/dependency_links.txt -> build/bdist.linux-x86_64/egg/EGG-INFO
copying warprnnt_tensorflow.egg-info/top_level.txt -> build/bdist.linux-x86_64/egg/EGG-INFO
writing build/bdist.linux-x86_64/egg/EGG-INFO/native_libs.txt
zip_safe flag not set; analyzing archive contents...
warprnnt_tensorflow.__pycache__.__init__.cpython-36: module references __path__
warprnnt_tensorflow.__pycache__.kernels.cpython-36: module references __file__
creating 'dist/warprnnt_tensorflow-0.1-py3.6-linux-x86_64.egg' and adding 'build/bdist.linux-x86_64/egg' to it
removing 'build/bdist.linux-x86_64/egg' (and everything under it)
Processing warprnnt_tensorflow-0.1-py3.6-linux-x86_64.egg
removing '/home/carson/Documents/code/other/rnnt-speech-recognition/.env/lib/python3.6/site-packages/warprnnt_tensorflow-0.1-py3.6-linux-x86_64.egg' (and everything under it)
creating /home/carson/Documents/code/other/rnnt-speech-recognition/.env/lib/python3.6/site-packages/warprnnt_tensorflow-0.1-py3.6-linux-x86_64.egg
Extracting warprnnt_tensorflow-0.1-py3.6-linux-x86_64.egg to /home/carson/Documents/code/other/rnnt-speech-recognition/.env/lib/python3.6/site-packages
warprnnt-tensorflow 0.1 is already the active version in easy-install.pth

Installed /home/carson/Documents/code/other/rnnt-speech-recognition/.env/lib/python3.6/site-packages/warprnnt_tensorflow-0.1-py3.6-linux-x86_64.egg
Processing dependencies for warprnnt-tensorflow==0.1
Finished processing dependencies for warprnnt-tensorflow==0.1
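
To make the GPU check repeatable, the build output can be scanned for the two CMake status lines that appear at the top of the log above. This is a small sketch under the assumption that those two lines are the relevant markers; `confirmed_gpu_build` is a hypothetical helper, and a `False` result only means the markers were not found, not that the build is definitely CPU-only.

```python
# Status lines copied from the build output above (assumed to be the GPU markers).
GPU_MARKERS = (
    "-- cuda found TRUE",
    "-- Building shared library with GPU support",
)


def confirmed_gpu_build(build_log):
    """Return True only if every expected GPU marker line is present in the log."""
    lines = {line.strip() for line in build_log.splitlines()}
    return all(marker in lines for marker in GPU_MARKERS)


SAMPLE_BUILD_LOG = """\
-- cuda found TRUE
-- Building shared library with GPU support
-- Configuring done
-- Generating done
"""

print(confirmed_gpu_build(SAMPLE_BUILD_LOG))
```

For the build quoted here both markers are present, so the CMake stage at least was configured with GPU support.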

Output of python run_common_voice.py --mode train --data_dir data_p --batch_size 4 --steps_per_checkpoint 100 --eval_size 100:

2020-05-14 17:20:40.320626: W tensorflow/stream_executor/platform/default/dso_loader.cc:55] Could not load dynamic library 'libnvinfer.so.6'; dlerror: libnvinfer.so.6: cannot open shared object file: No such file or directory
2020-05-14 17:20:40.320683: W tensorflow/stream_executor/platform/default/dso_loader.cc:55] Could not load dynamic library 'libnvinfer_plugin.so.6'; dlerror: libnvinfer_plugin.so.6: cannot open shared object file: No such file or directory
2020-05-14 17:20:40.320692: W tensorflow/compiler/tf2tensorrt/utils/py_utils.cc:30] Cannot dlopen some TensorRT libraries. If you would like to use Nvidia GPU with TensorRT, please make sure the missing libraries mentioned above are installed properly.
2020-05-14 17:20:42.191102: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcudart.so.10.1
2020-05-14 17:20:42.234245: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcuda.so.1
2020-05-14 17:20:42.273638: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:981] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2020-05-14 17:20:42.274507: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1555] Found device 0 with properties:
pciBusID: 0000:2d:00.0 name: GeForce RTX 2080 SUPER computeCapability: 7.5
coreClock: 1.845GHz coreCount: 48 deviceMemorySize: 7.79GiB deviceMemoryBandwidth: 462.00GiB/s
2020-05-14 17:20:42.274535: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcudart.so.10.1
2020-05-14 17:20:42.275972: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcublas.so.10
2020-05-14 17:20:42.305621: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcufft.so.10
2020-05-14 17:20:42.312086: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcurand.so.10
2020-05-14 17:20:42.368172: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcusolver.so.10
2020-05-14 17:20:42.374235: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcusparse.so.10
2020-05-14 17:20:42.377479: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcudnn.so.7
2020-05-14 17:20:42.377619: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:981] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2020-05-14 17:20:42.379273: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:981] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2020-05-14 17:20:42.379680: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1697] Adding visible gpu devices: 0
2020-05-14 17:20:42.407546: I tensorflow/core/platform/cpu_feature_guard.cc:142] Your CPU supports instructions that this TensorFlow binary was not compiled to use: AVX2 FMA
2020-05-14 17:20:42.569885: I tensorflow/core/platform/profile_utils/cpu_utils.cc:94] CPU Frequency: 3800165000 Hz
2020-05-14 17:20:42.584633: I tensorflow/compiler/xla/service/service.cc:168] XLA service 0x607e260 initialized for platform Host (this does not guarantee that XLA will be used). Devices:
2020-05-14 17:20:42.584669: I tensorflow/compiler/xla/service/service.cc:176]   StreamExecutor device (0): Host, Default Version
2020-05-14 17:20:42.660175: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:981] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2020-05-14 17:20:42.660546: I tensorflow/compiler/xla/service/service.cc:168] XLA service 0x60e3b60 initialized for platform CUDA (this does not guarantee that XLA will be used). Devices:
2020-05-14 17:20:42.660566: I tensorflow/compiler/xla/service/service.cc:176]   StreamExecutor device (0): GeForce RTX 2080 SUPER, Compute Capability 7.5
2020-05-14 17:20:42.660730: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:981] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2020-05-14 17:20:42.661165: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1555] Found device 0 with properties:
pciBusID: 0000:2d:00.0 name: GeForce RTX 2080 SUPER computeCapability: 7.5
coreClock: 1.845GHz coreCount: 48 deviceMemorySize: 7.79GiB deviceMemoryBandwidth: 462.00GiB/s
2020-05-14 17:20:42.661204: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcudart.so.10.1
2020-05-14 17:20:42.661219: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcublas.so.10
2020-05-14 17:20:42.661233: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcufft.so.10
2020-05-14 17:20:42.661247: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcurand.so.10
2020-05-14 17:20:42.661261: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcusolver.so.10
2020-05-14 17:20:42.661274: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcusparse.so.10
2020-05-14 17:20:42.661287: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcudnn.so.7
2020-05-14 17:20:42.661348: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:981] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2020-05-14 17:20:42.661949: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:981] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2020-05-14 17:20:42.662348: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1697] Adding visible gpu devices: 0
2020-05-14 17:20:42.665408: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1096] Device interconnect StreamExecutor with strength 1 edge matrix:
2020-05-14 17:20:42.665427: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1102]      0
2020-05-14 17:20:42.665436: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1115] 0:   N
2020-05-14 17:20:42.683597: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:981] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2020-05-14 17:20:42.684998: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:981] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2020-05-14 17:20:42.685477: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1241] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:0 with 4342 MB memory) -> physical GPU (device: 0, name: GeForce RTX 2080 SUPER, pci bus id: 0000:2d:00.0, compute capability: 7.5)
INFO:tensorflow:Using MirroredStrategy with devices ('/job:localhost/replica:0/task:0/device:GPU:0',)
I0514 17:20:42.798862 139620787164992 mirrored_strategy.py:435] Using MirroredStrategy with devices ('/job:localhost/replica:0/task:0/device:GPU:0',)
WARNING:tensorflow:From /home/carson/Documents/code/other/rnnt-speech-recognition/model.py:57: LSTMCell.__init__ (from tensorflow.python.ops.rnn_cell_impl) is deprecated and will be removed in a future version.
Instructions for updating:
This class is equivalent as tf.keras.layers.LSTMCell, and will be replaced by that in Tensorflow 2.0.
W0514 17:20:42.840057 139620787164992 deprecation.py:323] From /home/carson/Documents/code/other/rnnt-speech-recognition/model.py:57: LSTMCell.__init__ (from tensorflow.python.ops.rnn_cell_impl) is deprecated and will be removed in a future version.
Instructions for updating:
This class is equivalent as tf.keras.layers.LSTMCell, and will be replaced by that in Tensorflow 2.0.
WARNING:tensorflow:<tensorflow.python.ops.rnn_cell_impl.LSTMCell object at 0x7efb680f84e0>: Note that this cell is not optimized for performance. Please use tf.contrib.cudnn_rnn.CudnnLSTM for better performance on GPU.
W0514 17:20:42.840408 139620787164992 rnn_cell_impl.py:904] <tensorflow.python.ops.rnn_cell_impl.LSTMCell object at 0x7efb680f84e0>: Note that this cell is not optimized for performance. Please use tf.contrib.cudnn_rnn.CudnnLSTM for better performance on GPU.
WARNING:tensorflow:From /home/carson/Documents/code/other/rnnt-speech-recognition/.env/lib/python3.6/site-packages/tensorflow_core/python/ops/rnn_cell_impl.py:958: Layer.add_variable (from tensorflow.python.keras.engine.base_layer) is deprecated and will be removed in a future version.
Instructions for updating:
Please use `layer.add_weight` method instead.
W0514 17:20:42.865542 139620787164992 deprecation.py:323] From /home/carson/Documents/code/other/rnnt-speech-recognition/.env/lib/python3.6/site-packages/tensorflow_core/python/ops/rnn_cell_impl.py:958: Layer.add_variable (from tensorflow.python.keras.engine.base_layer) is deprecated and will be removed in a future version.
Instructions for updating:
Please use `layer.add_weight` method instead.
WARNING:tensorflow:<tensorflow.python.ops.rnn_cell_impl.LSTMCell object at 0x7efb680f8a58>: Note that this cell is not optimized for performance. Please use tf.contrib.cudnn_rnn.CudnnLSTM for better performance on GPU.
W0514 17:20:49.188249 139620787164992 rnn_cell_impl.py:904] <tensorflow.python.ops.rnn_cell_impl.LSTMCell object at 0x7efb680f8a58>: Note that this cell is not optimized for performance. Please use tf.contrib.cudnn_rnn.CudnnLSTM for better performance on GPU.
WARNING:tensorflow:<tensorflow.python.ops.rnn_cell_impl.LSTMCell object at 0x7efb680575c0>: Note that this cell is not optimized for performance. Please use tf.contrib.cudnn_rnn.CudnnLSTM for better performance on GPU.
W0514 17:20:49.349238 139620787164992 rnn_cell_impl.py:904] <tensorflow.python.ops.rnn_cell_impl.LSTMCell object at 0x7efb680575c0>: Note that this cell is not optimized for performance. Please use tf.contrib.cudnn_rnn.CudnnLSTM for better performance on GPU.
WARNING:tensorflow:<tensorflow.python.ops.rnn_cell_impl.LSTMCell object at 0x7efb6054cdd8>: Note that this cell is not optimized for performance. Please use tf.contrib.cudnn_rnn.CudnnLSTM for better performance on GPU.
W0514 17:20:49.397319 139620787164992 rnn_cell_impl.py:904] <tensorflow.python.ops.rnn_cell_impl.LSTMCell object at 0x7efb6054cdd8>: Note that this cell is not optimized for performance. Please use tf.contrib.cudnn_rnn.CudnnLSTM for better performance on GPU.
WARNING:tensorflow:<tensorflow.python.ops.rnn_cell_impl.LSTMCell object at 0x7efb60534630>: Note that this cell is not optimized for performance. Please use tf.contrib.cudnn_rnn.CudnnLSTM for better performance on GPU.
W0514 17:20:49.491118 139620787164992 rnn_cell_impl.py:904] <tensorflow.python.ops.rnn_cell_impl.LSTMCell object at 0x7efb60534630>: Note that this cell is not optimized for performance. Please use tf.contrib.cudnn_rnn.CudnnLSTM for better performance on GPU.
WARNING:tensorflow:<tensorflow.python.ops.rnn_cell_impl.LSTMCell object at 0x7efb6058cdd8>: Note that this cell is not optimized for performance. Please use tf.contrib.cudnn_rnn.CudnnLSTM for better performance on GPU.
W0514 17:20:49.535753 139620787164992 rnn_cell_impl.py:904] <tensorflow.python.ops.rnn_cell_impl.LSTMCell object at 0x7efb6058cdd8>: Note that this cell is not optimized for performance. Please use tf.contrib.cudnn_rnn.CudnnLSTM for better performance on GPU.
WARNING:tensorflow:<tensorflow.python.ops.rnn_cell_impl.LSTMCell object at 0x7efb6046b278>: Note that this cell is not optimized for performance. Please use tf.contrib.cudnn_rnn.CudnnLSTM for better performance on GPU.
W0514 17:20:49.579947 139620787164992 rnn_cell_impl.py:904] <tensorflow.python.ops.rnn_cell_impl.LSTMCell object at 0x7efb6046b278>: Note that this cell is not optimized for performance. Please use tf.contrib.cudnn_rnn.CudnnLSTM for better performance on GPU.
WARNING:tensorflow:<tensorflow.python.ops.rnn_cell_impl.LSTMCell object at 0x7efb603d3828>: Note that this cell is not optimized for performance. Please use tf.contrib.cudnn_rnn.CudnnLSTM for better performance on GPU.
W0514 17:20:49.624474 139620787164992 rnn_cell_impl.py:904] <tensorflow.python.ops.rnn_cell_impl.LSTMCell object at 0x7efb603d3828>: Note that this cell is not optimized for performance. Please use tf.contrib.cudnn_rnn.CudnnLSTM for better performance on GPU.
WARNING:tensorflow:<tensorflow.python.ops.rnn_cell_impl.LSTMCell object at 0x7efb6032a8d0>: Note that this cell is not optimized for performance. Please use tf.contrib.cudnn_rnn.CudnnLSTM for better performance on GPU.
W0514 17:20:49.976295 139620787164992 rnn_cell_impl.py:904] <tensorflow.python.ops.rnn_cell_impl.LSTMCell object at 0x7efb6032a8d0>: Note that this cell is not optimized for performance. Please use tf.contrib.cudnn_rnn.CudnnLSTM for better performance on GPU.
WARNING:tensorflow:<tensorflow.python.ops.rnn_cell_impl.LSTMCell object at 0x7efb14097320>: Note that this cell is not optimized for performance. Please use tf.contrib.cudnn_rnn.CudnnLSTM for better performance on GPU.
W0514 17:20:50.025335 139620787164992 rnn_cell_impl.py:904] <tensorflow.python.ops.rnn_cell_impl.LSTMCell object at 0x7efb14097320>: Note that this cell is not optimized for performance. Please use tf.contrib.cudnn_rnn.CudnnLSTM for better performance on GPU.
I0514 17:20:50.346390 139620787164992 run_common_voice.py:440] Using word-piece encoder with vocab size: 4088
Model: "encoder"
_________________________________________________________________
Layer (type)                 Output Shape              Param #
=================================================================
input_1 (InputLayer)         [(None, None, 80)]        0
_________________________________________________________________
rnn (RNN)                    (None, None, 640)         7217152
_________________________________________________________________
layer_normalization (LayerNo (None, None, 640)         1280
_________________________________________________________________
rnn_1 (RNN)                  (None, None, 640)         11804672
_________________________________________________________________
layer_normalization_1 (Layer (None, None, 640)         1280
_________________________________________________________________
time_reduction (TimeReductio (None, None, 1280)        0
_________________________________________________________________
rnn_2 (RNN)                  (None, None, 640)         17047552
_________________________________________________________________
layer_normalization_2 (Layer (None, None, 640)         1280
_________________________________________________________________
rnn_3 (RNN)                  (None, None, 640)         11804672
_________________________________________________________________
layer_normalization_3 (Layer (None, None, 640)         1280
_________________________________________________________________
rnn_4 (RNN)                  (None, None, 640)         11804672
_________________________________________________________________
layer_normalization_4 (Layer (None, None, 640)         1280
_________________________________________________________________
rnn_5 (RNN)                  (None, None, 640)         11804672
_________________________________________________________________
layer_normalization_5 (Layer (None, None, 640)         1280
_________________________________________________________________
rnn_6 (RNN)                  (None, None, 640)         11804672
_________________________________________________________________
layer_normalization_6 (Layer (None, None, 640)         1280
_________________________________________________________________
rnn_7 (RNN)                  (None, None, 640)         11804672
_________________________________________________________________
layer_normalization_7 (Layer (None, None, 640)         1280
=================================================================
Total params: 95,102,976
Trainable params: 95,102,976
Non-trainable params: 0
_________________________________________________________________
Model: "prediction_network"
_________________________________________________________________
Layer (type)                 Output Shape              Param #
=================================================================
input_2 (InputLayer)         [(None, None)]            0
_________________________________________________________________
embedding (Embedding)        (None, None, 384)         1569792
_________________________________________________________________
rnn_8 (RNN)                  (None, None, 640)         9707520
_________________________________________________________________
layer_normalization_8 (Layer (None, None, 640)         1280
_________________________________________________________________
rnn_9 (RNN)                  (None, None, 640)         11804672
_________________________________________________________________
layer_normalization_9 (Layer (None, None, 640)         1280
=================================================================
Total params: 23,084,544
Trainable params: 23,084,544
Non-trainable params: 0
_________________________________________________________________
Model: "transducer"
__________________________________________________________________________________________________
Layer (type)                    Output Shape         Param #     Connected to
==================================================================================================
mel_specs (InputLayer)          [(None, None, 80)]   0
__________________________________________________________________________________________________
pred_inp (InputLayer)           [(None, None)]       0
__________________________________________________________________________________________________
encoder (Model)                 (None, None, 640)    95102976    mel_specs[0][0]
__________________________________________________________________________________________________
prediction_network (Model)      (None, None, 640)    23084544    pred_inp[0][0]
__________________________________________________________________________________________________
tf_op_layer_Shape (TensorFlowOp [(3,)]               0           prediction_network[1][0]
__________________________________________________________________________________________________
tf_op_layer_Shape_1 (TensorFlow [(3,)]               0           encoder[1][0]
__________________________________________________________________________________________________
tf_op_layer_strided_slice (Tens [()]                 0           tf_op_layer_Shape[0][0]
__________________________________________________________________________________________________
tf_op_layer_strided_slice_1 (Te [()]                 0           tf_op_layer_Shape_1[0][0]
__________________________________________________________________________________________________
tf_op_layer_ExpandDims (TensorF [(None, None, 1, 640 0           encoder[1][0]
__________________________________________________________________________________________________
tf_op_layer_stack_inp_enc (Tens [(4,)]               0           tf_op_layer_strided_slice[0][0]
__________________________________________________________________________________________________
tf_op_layer_ExpandDims_1 (Tenso [(None, 1, None, 640 0           prediction_network[1][0]
__________________________________________________________________________________________________
tf_op_layer_stack_pred_out (Ten [(4,)]               0           tf_op_layer_strided_slice_1[0][0]
__________________________________________________________________________________________________
tf_op_layer_Tile (TensorFlowOpL [(None, None, None,  0           tf_op_layer_ExpandDims[0][0]
                                                                 tf_op_layer_stack_inp_enc[0][0]
__________________________________________________________________________________________________
tf_op_layer_Tile_1 (TensorFlowO [(None, None, None,  0           tf_op_layer_ExpandDims_1[0][0]
                                                                 tf_op_layer_stack_pred_out[0][0]
__________________________________________________________________________________________________
concatenate (Concatenate)       (None, None, None, 1 0           tf_op_layer_Tile[0][0]
                                                                 tf_op_layer_Tile_1[0][0]
__________________________________________________________________________________________________
dense (Dense)                   (None, None, None, 6 819840      concatenate[0][0]
__________________________________________________________________________________________________
dense_1 (Dense)                 (None, None, None, 4 2620408     dense[0][0]
==================================================================================================
Total params: 121,627,768
Trainable params: 121,627,768
Non-trainable params: 0
__________________________________________________________________________________________________
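
As a sanity check on the three summaries above, the parameter counts can be reproduced by hand. The sketch below assumes each RNN layer is an LSTM with 2048 units and a 640-dimensional projection. That configuration is not stated anywhere in the log; it is inferred because it reproduces the printed counts. `lstm_params` is a hypothetical helper.

```python
def lstm_params(input_dim, units, proj):
    """Parameter count of an LSTM cell with an output projection."""
    kernel = input_dim * 4 * units      # input-to-gates weights
    recurrent = proj * 4 * units        # projected-state-to-gates weights
    bias = 4 * units                    # one bias per gate unit
    projection = units * proj           # output projection matrix
    return kernel + recurrent + bias + projection


UNITS, PROJ = 2048, 640                 # assumption, inferred from the counts
FEATURES, EMBED, VOCAB = 80, 384, 4088  # taken from the summaries and the log

layer_norm = 2 * PROJ                   # gamma and beta

encoder = (
    lstm_params(FEATURES, UNITS, PROJ)      # rnn
    + lstm_params(2 * PROJ, UNITS, PROJ)    # rnn_2, fed by time_reduction (1280)
    + 6 * lstm_params(PROJ, UNITS, PROJ)    # rnn_1, rnn_3 .. rnn_7
    + 8 * layer_norm
)
prediction = (
    VOCAB * EMBED                           # embedding
    + lstm_params(EMBED, UNITS, PROJ)       # rnn_8
    + lstm_params(PROJ, UNITS, PROJ)        # rnn_9
    + 2 * layer_norm
)
joint = (2 * PROJ * PROJ + PROJ) + (PROJ * VOCAB + VOCAB)  # dense, dense_1

print(encoder, prediction, encoder + prediction + joint)
```

Under that assumption the three totals come out equal to the 95,102,976, 23,084,544 and 121,627,768 reported by the summaries, which suggests the model itself was built as intended.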
Starting training.
2020-05-14 17:20:59.359055: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcublas.so.10
2020-05-14 17:20:59.689126: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcudnn.so.7
WARNING:tensorflow:Using MirroredStrategy eagerly has significant overhead currently. We will be working on improving this in the future, but for now please wrap `call_for_each_replica` or `experimental_run` or `experimental_run_v2` inside a tf.function to get the best performance.
W0514 17:21:02.873392 139620787164992 mirrored_strategy.py:692] Using MirroredStrategy eagerly has significant overhead currently. We will be working on improving this in the future, but for now please wrap `call_for_each_replica` or `experimental_run` or `experimental_run_v2` inside a tf.function to get the best performance.
INFO:tensorflow:Reduce to /job:localhost/replica:0/task:0/device:CPU:0 then broadcast to ('/job:localhost/replica:0/task:0/device:CPU:0',).
I0514 17:21:02.877112 139620787164992 cross_device_ops.py:439] Reduce to /job:localhost/replica:0/task:0/device:CPU:0 then broadcast to ('/job:localhost/replica:0/task:0/device:CPU:0',).
INFO:tensorflow:Reduce to /job:localhost/replica:0/task:0/device:CPU:0 then broadcast to ('/job:localhost/replica:0/task:0/device:CPU:0',).
I0514 17:21:02.878273 139620787164992 cross_device_ops.py:439] Reduce to /job:localhost/replica:0/task:0/device:CPU:0 then broadcast to ('/job:localhost/replica:0/task:0/device:CPU:0',).
INFO:tensorflow:Reduce to /job:localhost/replica:0/task:0/device:CPU:0 then broadcast to ('/job:localhost/replica:0/task:0/device:CPU:0',).
I0514 17:21:02.900741 139620787164992 cross_device_ops.py:439] Reduce to /job:localhost/replica:0/task:0/device:CPU:0 then broadcast to ('/job:localhost/replica:0/task:0/device:CPU:0',).
INFO:tensorflow:Reduce to /job:localhost/replica:0/task:0/device:CPU:0 then broadcast to ('/job:localhost/replica:0/task:0/device:CPU:0',).
I0514 17:21:02.901246 139620787164992 cross_device_ops.py:439] Reduce to /job:localhost/replica:0/task:0/device:CPU:0 then broadcast to ('/job:localhost/replica:0/task:0/device:CPU:0',).
Epoch: 0, Batch: 0, Global Step: 0, Step Time: 12.0405, Loss: -74.3107
INFO:tensorflow:Reduce to /job:localhost/replica:0/task:0/device:CPU:0 then broadcast to ('/job:localhost/replica:0/task:0/device:CPU:0',).
I0514 17:21:02.902410 139620787164992 cross_device_ops.py:439] Reduce to /job:localhost/replica:0/task:0/device:CPU:0 then broadcast to ('/job:localhost/replica:0/task:0/device:CPU:0',).
INFO:tensorflow:Reduce to /job:localhost/replica:0/task:0/device:CPU:0 then broadcast to ('/job:localhost/replica:0/task:0/device:CPU:0',).
I0514 17:21:02.902935 139620787164992 cross_device_ops.py:439] Reduce to /job:localhost/replica:0/task:0/device:CPU:0 then broadcast to ('/job:localhost/replica:0/task:0/device:CPU:0',).
WARNING:tensorflow:Using MirroredStrategy eagerly has significant overhead currently. We will be working on improving this in the future, but for now please wrap `call_for_each_replica` or `experimental_run` or `experimental_run_v2` inside a tf.function to get the best performance.
W0514 17:21:09.756167 139620787164992 mirrored_strategy.py:692] Using MirroredStrategy eagerly has significant overhead currently. We will be working on improving this in the future, but for now please wrap `call_for_each_replica` or `experimental_run` or `experimental_run_v2` inside a tf.function to get the best performance.
INFO:tensorflow:Reduce to /job:localhost/replica:0/task:0/device:CPU:0 then broadcast to ('/job:localhost/replica:0/task:0/device:CPU:0',).
I0514 17:21:09.758974 139620787164992 cross_device_ops.py:439] Reduce to /job:localhost/replica:0/task:0/device:CPU:0 then broadcast to ('/job:localhost/replica:0/task:0/device:CPU:0',).
INFO:tensorflow:Reduce to /job:localhost/replica:0/task:0/device:CPU:0 then broadcast to ('/job:localhost/replica:0/task:0/device:CPU:0',).
I0514 17:21:09.759661 139620787164992 cross_device_ops.py:439] Reduce to /job:localhost/replica:0/task:0/device:CPU:0 then broadcast to ('/job:localhost/replica:0/task:0/device:CPU:0',).
INFO:tensorflow:Reduce to /job:localhost/replica:0/task:0/device:CPU:0 then broadcast to ('/job:localhost/replica:0/task:0/device:CPU:0',).
I0514 17:21:09.760767 139620787164992 cross_device_ops.py:439] Reduce to /job:localhost/replica:0/task:0/device:CPU:0 then broadcast to ('/job:localhost/replica:0/task:0/device:CPU:0',).
INFO:tensorflow:Reduce to /job:localhost/replica:0/task:0/device:CPU:0 then broadcast to ('/job:localhost/replica:0/task:0/device:CPU:0',).
I0514 17:21:09.761307 139620787164992 cross_device_ops.py:439] Reduce to /job:localhost/replica:0/task:0/device:CPU:0 then broadcast to ('/job:localhost/replica:0/task:0/device:CPU:0',).
Epoch: 0, Batch: 1, Global Step: 1, Step Time: 6.8481, Loss: -384.7598
WARNING:tensorflow:Using MirroredStrategy eagerly has significant overhead currently. We will be working on improving this in the future, but for now please wrap `call_for_each_replica` or `experimental_run` or `experimental_run_v2` inside a tf.function to get the best performance.
W0514 17:21:17.034537 139620787164992 mirrored_strategy.py:692] Using MirroredStrategy eagerly has significant overhead currently. We will be working on improving this in the future, but for now please wrap `call_for_each_replica` or `experimental_run` or `experimental_run_v2` inside a tf.function to get the best performance.
Epoch: 0, Batch: 2, Global Step: 2, Step Time: 7.2676, Loss: -706.7275
WARNING:tensorflow:Using MirroredStrategy eagerly has significant overhead currently. We will be working on improving this in the future, but for now please wrap `call_for_each_replica` or `experimental_run` or `experimental_run_v2` inside a tf.function to get the best performance.
W0514 17:21:18.881206 139620787164992 mirrored_strategy.py:692] Using MirroredStrategy eagerly has significant overhead currently. We will be working on improving this in the future, but for now please wrap `call_for_each_replica` or `experimental_run` or `experimental_run_v2` inside a tf.function to get the best performance.
Epoch: 0, Batch: 3, Global Step: 3, Step Time: 1.8337, Loss: -1171.1047
WARNING:tensorflow:Using MirroredStrategy eagerly has significant overhead currently. We will be working on improving this in the future, but for now please wrap `call_for_each_replica` or `experimental_run` or `experimental_run_v2` inside a tf.function to get the best performance.
W0514 17:21:21.134955 139620787164992 mirrored_strategy.py:692] Using MirroredStrategy eagerly has significant overhead currently. We will be working on improving this in the future, but for now please wrap `call_for_each_replica` or `experimental_run` or `experimental_run_v2` inside a tf.function to get the best performance.
Epoch: 0, Batch: 4, Global Step: 4, Step Time: 2.2433, Loss: -1829.6791
Epoch: 0, Batch: 5, Global Step: 5, Step Time: 2.0327, Loss: -2597.6094
Epoch: 0, Batch: 6, Global Step: 6, Step Time: 1.9921, Loss: -3664.4426
Epoch: 0, Batch: 7, Global Step: 7, Step Time: 1.6810, Loss: -4379.8940
Epoch: 0, Batch: 8, Global Step: 8, Step Time: 1.7100, Loss: -5485.8438
Epoch: 0, Batch: 9, Global Step: 9, Step Time: 1.5717, Loss: -6664.4048
Epoch: 0, Batch: 10, Global Step: 10, Step Time: 1.7171, Loss: -8078.4673
Epoch: 0, Batch: 11, Global Step: 11, Step Time: 1.2996, Loss: -9137.0742
Epoch: 0, Batch: 12, Global Step: 12, Step Time: 1.8903, Loss: -11215.8027
Epoch: 0, Batch: 13, Global Step: 13, Step Time: 1.7695, Loss: -13523.7441
Epoch: 0, Batch: 14, Global Step: 14, Step Time: 1.0525, Loss: -14449.4160
Epoch: 0, Batch: 15, Global Step: 15, Step Time: 1.9671, Loss: -16717.8164
Epoch: 0, Batch: 16, Global Step: 16, Step Time: 1.6840, Loss: -18992.6074
Epoch: 0, Batch: 17, Global Step: 17, Step Time: 1.4586, Loss: -20612.7383
Epoch: 0, Batch: 18, Global Step: 18, Step Time: 1.1514, Loss: -22073.9102
Epoch: 0, Batch: 19, Global Step: 19, Step Time: 1.0293, Loss: -23262.2422
Epoch: 0, Batch: 20, Global Step: 20, Step Time: 1.1142, Loss: -24740.1406
Epoch: 0, Batch: 21, Global Step: 21, Step Time: 1.1537, Loss: -26559.7246
Epoch: 0, Batch: 22, Global Step: 22, Step Time: 1.5443, Loss: -28882.6094
Epoch: 0, Batch: 23, Global Step: 23, Step Time: 1.4967, Loss: -31348.6172
Epoch: 0, Batch: 24, Global Step: 24, Step Time: 1.7497, Loss: -34089.2578
Epoch: 0, Batch: 25, Global Step: 25, Step Time: 1.6188, Loss: -36955.6016
Epoch: 0, Batch: 26, Global Step: 26, Step Time: 1.9737, Loss: -40000.2422
Epoch: 0, Batch: 27, Global Step: 27, Step Time: 1.3382, Loss: -42761.7500
Epoch: 0, Batch: 28, Global Step: 28, Step Time: 1.4812, Loss: -45338.0820
Epoch: 0, Batch: 29, Global Step: 29, Step Time: 1.4887, Loss: -47364.6367

*TRUNCATED*

Epoch: 0, Batch: 166, Global Step: 166, Step Time: 1.3649, Loss: -636049.5000
WARNING: Forward backward likelihood mismatch 0.250000
WARNING: Forward backward likelihood mismatch 0.125000
WARNING: Forward backward likelihood mismatch 0.750000
WARNING: Forward backward likelihood mismatch 0.500000
Epoch: 0, Batch: 167, Global Step: 167, Step Time: 1.1865, Loss: -649422.8750
WARNING: Forward backward likelihood mismatch 0.500000
WARNING: Forward backward likelihood mismatch 0.500000
Epoch: 0, Batch: 168, Global Step: 168, Step Time: 1.4211, Loss: -666955.5000
WARNING: Forward backward likelihood mismatch 1.000000
WARNING: Forward backward likelihood mismatch 0.500000
WARNING: Forward backward likelihood mismatch 0.500000
WARNING: Forward backward likelihood mismatch 0.500000
Epoch: 0, Batch: 169, Global Step: 169, Step Time: 2.0053, Loss: -696473.4375
WARNING: Forward backward likelihood mismatch 8192.000000
Epoch: 0, Batch: 170, Global Step: 170, Step Time: 1.9966, Loss: -365572896.0000
Epoch: 0, Batch: 171, Global Step: 171, Step Time: 1.7759, Loss: nan
Epoch: 0, Batch: 172, Global Step: 172, Step Time: 1.4074, Loss: nan
Epoch: 0, Batch: 173, Global Step: 173, Step Time: 2.3243, Loss: nan
Epoch: 0, Batch: 174, Global Step: 174, Step Time: 1.5460, Loss: nan
Epoch: 0, Batch: 175, Global Step: 175, Step Time: 1.2894, Loss: nan
Epoch: 0, Batch: 176, Global Step: 176, Step Time: 1.4205, Loss: nan
Epoch: 0, Batch: 177, Global Step: 177, Step Time: 2.0478, Loss: nan
noahchalifour commented 4 years ago

@dyc3 Make sure you delete the build, dist, and warprnnt_tensorflow.egg-info directories inside warp-transducer/tensorflow_binding, then run it again and post the output here.
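
For anyone who wants to script the cleanup, here is a minimal sketch. The three directory names and the warp-transducer/tensorflow_binding path come from the comment above; the helper name `remove_stale_build_dirs` is made up for illustration, and the script assumes it is run from the repository root.

```python
import shutil
from pathlib import Path

# Directory names taken from the comment above.
STALE_DIRS = ("build", "dist", "warprnnt_tensorflow.egg-info")


def remove_stale_build_dirs(binding_dir):
    """Delete leftover build artifacts under binding_dir and return the removed paths."""
    removed = []
    for name in STALE_DIRS:
        path = Path(binding_dir) / name
        if path.is_dir():
            shutil.rmtree(path)
            removed.append(path)
    return removed


if __name__ == "__main__":
    # Assumption: the current working directory is the repository root.
    for path in remove_stale_build_dirs("warp-transducer/tensorflow_binding"):
        print("removed", path)
```

Directories that do not exist are skipped, so it is safe to run more than once.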

dyc3 commented 4 years ago

I deleted those directories, and ran make clean in the build directory. It appears to be working correctly now. We should set up some assertions so other people don't run into these issues.
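
One possible shape for such an assertion, as a rough sketch rather than anything that exists in the repo: the transducer loss is a negative log-likelihood, so it should never be negative, and the runs above went negative from the first batch, long before the NaN appeared. The function name `check_rnnt_loss` is hypothetical, and it assumes the loss can be converted with `float()`.

```python
import math


def check_rnnt_loss(loss, step):
    """Raise early if the loss is not a finite, non-negative number."""
    value = float(loss)
    if not math.isfinite(value):
        raise FloatingPointError(f"step {step}: loss is {value}")
    if value < 0.0:
        # A negative log-likelihood cannot be below zero, so this points at
        # a broken loss computation rather than at the model or the data.
        raise ValueError(
            f"step {step}: loss is {value}, but a negative log-likelihood "
            "cannot be negative; check how warp-transducer was built"
        )
    return value
```

Calling this after every training step would have stopped the failing run at batch 0 (loss -74.3107) instead of letting it continue until the loss became NaN.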

Output of CUDA_HOME=/usr/local/cuda ./scripts/build_rnnt.sh

-- cuda found TRUE
-- Building shared library with GPU support
-- Configuring done
-- Generating done
-- Build files have been written to: /home/carson/Documents/code/other/rnnt-speech-recognition/warp-transducer/build
[  7%] Building NVCC (Device) object CMakeFiles/warprnnt.dir/src/warprnnt_generated_rnnt_entrypoint.cu.o
[ 14%] Linking CXX shared library libwarprnnt.so
[ 14%] Built target warprnnt
[ 21%] Building NVCC (Device) object CMakeFiles/test_time_gpu.dir/tests/test_time_gpu_generated_test_time.cu.o
[ 28%] Building CXX object CMakeFiles/test_time_gpu.dir/tests/random.cpp.o
[ 35%] Linking CXX executable test_time_gpu
[ 35%] Built target test_time_gpu
[ 42%] Building NVCC (Device) object CMakeFiles/test_gpu.dir/tests/test_gpu_generated_test_gpu.cu.o
[ 50%] Building CXX object CMakeFiles/test_gpu.dir/tests/random.cpp.o
[ 57%] Linking CXX executable test_gpu
[ 57%] Built target test_gpu
[ 64%] Building CXX object CMakeFiles/test_cpu.dir/tests/test_cpu.cpp.o
[ 71%] Building CXX object CMakeFiles/test_cpu.dir/tests/random.cpp.o
[ 78%] Linking CXX executable test_cpu
[ 78%] Built target test_cpu
[ 85%] Building CXX object CMakeFiles/test_time.dir/tests/test_time.cpp.o
[ 92%] Building CXX object CMakeFiles/test_time.dir/tests/random.cpp.o
[100%] Linking CXX executable test_time
[100%] Built target test_time
2020-05-15 15:29:50.436150: W tensorflow/stream_executor/platform/default/dso_loader.cc:55] Could not load dynamic library 'libnvinfer.so.6'; dlerror: libnvinfer.so.6: cannot open shared object file: No such file or directory
2020-05-15 15:29:50.436233: W tensorflow/stream_executor/platform/default/dso_loader.cc:55] Could not load dynamic library 'libnvinfer_plugin.so.6'; dlerror: libnvinfer_plugin.so.6: cannot open shared object file: No such file or directory
2020-05-15 15:29:50.436245: W tensorflow/compiler/tf2tensorrt/utils/py_utils.cc:30] Cannot dlopen some TensorRT libraries. If you would like to use Nvidia GPU with TensorRT, please make sure the missing libraries mentioned above are installed properly.
setup.py:63: UserWarning: Assuming tensorflow was compiled without C++11 ABI. It is generally true if you are using binary pip package. If you compiled tensorflow from source with gcc >= 5 and didn't set -D_GLIBCXX_USE_CXX11_ABI=0 during compilation, you need to set environment variable TF_CXX11_ABI=1 when compiling this bindings. Also be sure to touch some files in src to trigger recompilation. Also, you need to set (or unsed) this environment variable if getting undefined symbol: _ZN10tensorflow... errors
  warnings.warn("Assuming tensorflow was compiled without C++11 ABI. "
running install
running bdist_egg
running egg_info
creating warprnnt_tensorflow.egg-info
writing warprnnt_tensorflow.egg-info/PKG-INFO
writing dependency_links to warprnnt_tensorflow.egg-info/dependency_links.txt
writing top-level names to warprnnt_tensorflow.egg-info/top_level.txt
writing manifest file 'warprnnt_tensorflow.egg-info/SOURCES.txt'
reading manifest file 'warprnnt_tensorflow.egg-info/SOURCES.txt'
writing manifest file 'warprnnt_tensorflow.egg-info/SOURCES.txt'
installing library code to build/bdist.linux-x86_64/egg
running install_lib
running build_py
creating build
creating build/lib.linux-x86_64-3.6
creating build/lib.linux-x86_64-3.6/warprnnt_tensorflow
copying warprnnt_tensorflow/__init__.py -> build/lib.linux-x86_64-3.6/warprnnt_tensorflow
running build_ext
building 'warprnnt_tensorflow.kernels' extension
creating build/temp.linux-x86_64-3.6
creating build/temp.linux-x86_64-3.6/src
x86_64-linux-gnu-gcc -pthread -DNDEBUG -g -fwrapv -O2 -Wall -g -fstack-protector-strong -Wformat -Werror=format-security -Wdate-time -D_FORTIFY_SOURCE=2 -fPIC -I/home/carson/Documents/code/other/rnnt-speech-recognition/.env/lib/python3.6/site-packages/tensorflow_core/include -I/home/carson/Documents/code/other/rnnt-speech-recognition/.env/lib/python3.6/site-packages/tensorflow_core -I/home/carson/Documents/code/other/rnnt-speech-recognition/warp-transducer/tensorflow_binding/../include -I/home/carson/Documents/code/other/rnnt-speech-recognition/.env/lib/python3.6/site-packages/tensorflow_core/include/external/nsync/public -I/usr/local/cuda/include -I/home/carson/Documents/code/other/rnnt-speech-recognition/warp-transducer/tensorflow_binding/include -I/usr/include/python3.6m -I/home/carson/Documents/code/other/rnnt-speech-recognition/.env/include/python3.6m -c src/warprnnt_op.cc -o build/temp.linux-x86_64-3.6/src/warprnnt_op.o -std=c++11 -fPIC -D_GLIBCXX_USE_CXX11_ABI=0 -Wno-return-type -I/home/carson/Documents/code/other/rnnt-speech-recognition/.env/lib/python3.6/site-packages/tensorflow_core/include -D_GLIBCXX_USE_CXX11_ABI=0 -DWARPRNNT_ENABLE_GPU
x86_64-linux-gnu-g++ -pthread -shared -Wl,-O1 -Wl,-Bsymbolic-functions -Wl,-Bsymbolic-functions -Wl,-z,relro -Wl,-Bsymbolic-functions -Wl,-z,relro -g -fstack-protector-strong -Wformat -Werror=format-security -Wdate-time -D_FORTIFY_SOURCE=2 build/temp.linux-x86_64-3.6/src/warprnnt_op.o -L../build -Wl,--enable-new-dtags,-R/home/carson/Documents/code/other/rnnt-speech-recognition/warp-transducer/build -lwarprnnt -o build/lib.linux-x86_64-3.6/warprnnt_tensorflow/kernels.cpython-36m-x86_64-linux-gnu.so -L/home/carson/Documents/code/other/rnnt-speech-recognition/.env/lib/python3.6/site-packages/tensorflow_core -l:libtensorflow_framework.so.2
creating build/bdist.linux-x86_64
creating build/bdist.linux-x86_64/egg
creating build/bdist.linux-x86_64/egg/warprnnt_tensorflow
copying build/lib.linux-x86_64-3.6/warprnnt_tensorflow/__init__.py -> build/bdist.linux-x86_64/egg/warprnnt_tensorflow
copying build/lib.linux-x86_64-3.6/warprnnt_tensorflow/kernels.cpython-36m-x86_64-linux-gnu.so -> build/bdist.linux-x86_64/egg/warprnnt_tensorflow
byte-compiling build/bdist.linux-x86_64/egg/warprnnt_tensorflow/__init__.py to __init__.cpython-36.pyc
creating stub loader for warprnnt_tensorflow/kernels.cpython-36m-x86_64-linux-gnu.so
byte-compiling build/bdist.linux-x86_64/egg/warprnnt_tensorflow/kernels.py to kernels.cpython-36.pyc
creating build/bdist.linux-x86_64/egg/EGG-INFO
copying warprnnt_tensorflow.egg-info/PKG-INFO -> build/bdist.linux-x86_64/egg/EGG-INFO
copying warprnnt_tensorflow.egg-info/SOURCES.txt -> build/bdist.linux-x86_64/egg/EGG-INFO
copying warprnnt_tensorflow.egg-info/dependency_links.txt -> build/bdist.linux-x86_64/egg/EGG-INFO
copying warprnnt_tensorflow.egg-info/top_level.txt -> build/bdist.linux-x86_64/egg/EGG-INFO
writing build/bdist.linux-x86_64/egg/EGG-INFO/native_libs.txt
zip_safe flag not set; analyzing archive contents...
warprnnt_tensorflow.__pycache__.__init__.cpython-36: module references __path__
warprnnt_tensorflow.__pycache__.kernels.cpython-36: module references __file__
creating dist
creating 'dist/warprnnt_tensorflow-0.1-py3.6-linux-x86_64.egg' and adding 'build/bdist.linux-x86_64/egg' to it
removing 'build/bdist.linux-x86_64/egg' (and everything under it)
Processing warprnnt_tensorflow-0.1-py3.6-linux-x86_64.egg
removing '/home/carson/Documents/code/other/rnnt-speech-recognition/.env/lib/python3.6/site-packages/warprnnt_tensorflow-0.1-py3.6-linux-x86_64.egg' (and everything under it)
creating /home/carson/Documents/code/other/rnnt-speech-recognition/.env/lib/python3.6/site-packages/warprnnt_tensorflow-0.1-py3.6-linux-x86_64.egg
Extracting warprnnt_tensorflow-0.1-py3.6-linux-x86_64.egg to /home/carson/Documents/code/other/rnnt-speech-recognition/.env/lib/python3.6/site-packages
warprnnt-tensorflow 0.1 is already the active version in easy-install.pth

Installed /home/carson/Documents/code/other/rnnt-speech-recognition/.env/lib/python3.6/site-packages/warprnnt_tensorflow-0.1-py3.6-linux-x86_64.egg
Processing dependencies for warprnnt-tensorflow==0.1
Finished processing dependencies for warprnnt-tensorflow==0.1
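
Two lines in the build output above show that GPU support was compiled in: the CMake message "Building shared library with GPU support" and the -DWARPRNNT_ENABLE_GPU flag on the gcc command line. A build check could look for both. This is only a sketch: the marker strings are copied from this one build and may differ in other versions of warp-transducer, and the function name `missing_gpu_markers` is made up.

```python
# Both marker strings are copied from the build output above; other
# versions of warp-transducer may print something different.
GPU_MARKERS = (
    "Building shared library with GPU support",
    "-DWARPRNNT_ENABLE_GPU",
)


def missing_gpu_markers(build_log):
    """Return the GPU markers that do not appear in the build log text."""
    return [marker for marker in GPU_MARKERS if marker not in build_log]
```

To use it, save the output of the build script to a file, pass the file contents to `missing_gpu_markers`, and treat a non-empty result as a sign that the build may be CPU-only.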

Output of python run_common_voice.py --mode train --data_dir data_p --batch_size 4 --steps_per_checkpoint 100 --eval_size 100

2020-05-15 15:31:43.586998: W tensorflow/stream_executor/platform/default/dso_loader.cc:55] Could not load dynamic library 'libnvinfer.so.6'; dlerror: libnvinfer.so.6: cannot open shared object file: No such file or directory
2020-05-15 15:31:43.587052: W tensorflow/stream_executor/platform/default/dso_loader.cc:55] Could not load dynamic library 'libnvinfer_plugin.so.6'; dlerror: libnvinfer_plugin.so.6: cannot open shared object file: No such file or directory
2020-05-15 15:31:43.587062: W tensorflow/compiler/tf2tensorrt/utils/py_utils.cc:30] Cannot dlopen some TensorRT libraries. If you would like to use Nvidia GPU with TensorRT, please make sure the missing libraries mentioned above are installed properly.
2020-05-15 15:31:45.429342: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcudart.so.10.1
2020-05-15 15:31:45.460205: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcuda.so.1
2020-05-15 15:31:45.533350: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:981] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2020-05-15 15:31:45.533895: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1555] Found device 0 with properties:
pciBusID: 0000:2d:00.0 name: GeForce RTX 2080 SUPER computeCapability: 7.5
coreClock: 1.845GHz coreCount: 48 deviceMemorySize: 7.79GiB deviceMemoryBandwidth: 462.00GiB/s
2020-05-15 15:31:45.533917: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcudart.so.10.1
2020-05-15 15:31:45.535389: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcublas.so.10
2020-05-15 15:31:45.566868: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcufft.so.10
2020-05-15 15:31:45.574169: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcurand.so.10
2020-05-15 15:31:45.631965: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcusolver.so.10
2020-05-15 15:31:45.639441: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcusparse.so.10
2020-05-15 15:31:45.642809: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcudnn.so.7
2020-05-15 15:31:45.642951: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:981] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2020-05-15 15:31:45.643656: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:981] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2020-05-15 15:31:45.644182: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1697] Adding visible gpu devices: 0
2020-05-15 15:31:45.656124: I tensorflow/core/platform/cpu_feature_guard.cc:142] Your CPU supports instructions that this TensorFlow binary was not compiled to use: AVX2 FMA
2020-05-15 15:31:45.815729: I tensorflow/core/platform/profile_utils/cpu_utils.cc:94] CPU Frequency: 3800130000 Hz
2020-05-15 15:31:45.817206: I tensorflow/compiler/xla/service/service.cc:168] XLA service 0x607e440 initialized for platform Host (this does not guarantee that XLA will be used). Devices:
2020-05-15 15:31:45.817246: I tensorflow/compiler/xla/service/service.cc:176]   StreamExecutor device (0): Host, Default Version
2020-05-15 15:31:45.896164: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:981] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2020-05-15 15:31:45.896609: I tensorflow/compiler/xla/service/service.cc:168] XLA service 0x60e3d40 initialized for platform CUDA (this does not guarantee that XLA will be used). Devices:
2020-05-15 15:31:45.896653: I tensorflow/compiler/xla/service/service.cc:176]   StreamExecutor device (0): GeForce RTX 2080 SUPER, Compute Capability 7.5
2020-05-15 15:31:45.896929: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:981] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2020-05-15 15:31:45.897430: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1555] Found device 0 with properties:
pciBusID: 0000:2d:00.0 name: GeForce RTX 2080 SUPER computeCapability: 7.5
coreClock: 1.845GHz coreCount: 48 deviceMemorySize: 7.79GiB deviceMemoryBandwidth: 462.00GiB/s
2020-05-15 15:31:45.897471: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcudart.so.10.1
2020-05-15 15:31:45.897487: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcublas.so.10
2020-05-15 15:31:45.897502: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcufft.so.10
2020-05-15 15:31:45.897516: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcurand.so.10
2020-05-15 15:31:45.897530: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcusolver.so.10
2020-05-15 15:31:45.897543: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcusparse.so.10
2020-05-15 15:31:45.897557: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcudnn.so.7
2020-05-15 15:31:45.897627: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:981] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2020-05-15 15:31:45.898153: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:981] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2020-05-15 15:31:45.898662: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1697] Adding visible gpu devices: 0
2020-05-15 15:31:45.900880: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1096] Device interconnect StreamExecutor with strength 1 edge matrix:
2020-05-15 15:31:45.900903: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1102]      0
2020-05-15 15:31:45.900915: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1115] 0:   N
2020-05-15 15:31:45.909934: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:981] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2020-05-15 15:31:45.910372: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:981] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2020-05-15 15:31:45.911201: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1241] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:0 with 3878 MB memory) -> physical GPU (device: 0, name: GeForce RTX 2080 SUPER, pci bus id: 0000:2d:00.0, compute capability: 7.5)
INFO:tensorflow:Using MirroredStrategy with devices ('/job:localhost/replica:0/task:0/device:GPU:0',)
I0515 15:31:45.991474 139620787164992 mirrored_strategy.py:435] Using MirroredStrategy with devices ('/job:localhost/replica:0/task:0/device:GPU:0',)
WARNING:tensorflow:From /home/carson/Documents/code/other/rnnt-speech-recognition/model.py:57: LSTMCell.__init__ (from tensorflow.python.ops.rnn_cell_impl) is deprecated and will be removed in a future version.
Instructions for updating:
This class is equivalent as tf.keras.layers.LSTMCell, and will be replaced by that in Tensorflow 2.0.
W0515 15:31:46.026490 139620787164992 deprecation.py:323] From /home/carson/Documents/code/other/rnnt-speech-recognition/model.py:57: LSTMCell.__init__ (from tensorflow.python.ops.rnn_cell_impl) is deprecated and will be removed in a future version.
Instructions for updating:
This class is equivalent as tf.keras.layers.LSTMCell, and will be replaced by that in Tensorflow 2.0.
WARNING:tensorflow:<tensorflow.python.ops.rnn_cell_impl.LSTMCell object at 0x7efb681784e0>: Note that this cell is not optimized for performance. Please use tf.contrib.cudnn_rnn.CudnnLSTM for better performance on GPU.
W0515 15:31:46.026841 139620787164992 rnn_cell_impl.py:904] <tensorflow.python.ops.rnn_cell_impl.LSTMCell object at 0x7efb681784e0>: Note that this cell is not optimized for performance. Please use tf.contrib.cudnn_rnn.CudnnLSTM for better performance on GPU.
WARNING:tensorflow:From /home/carson/Documents/code/other/rnnt-speech-recognition/.env/lib/python3.6/site-packages/tensorflow_core/python/ops/rnn_cell_impl.py:958: Layer.add_variable (from tensorflow.python.keras.engine.base_layer) is deprecated and will be removed in a future version.
Instructions for updating:
Please use `layer.add_weight` method instead.
W0515 15:31:46.046895 139620787164992 deprecation.py:323] From /home/carson/Documents/code/other/rnnt-speech-recognition/.env/lib/python3.6/site-packages/tensorflow_core/python/ops/rnn_cell_impl.py:958: Layer.add_variable (from tensorflow.python.keras.engine.base_layer) is deprecated and will be removed in a future version.
Instructions for updating:
Please use `layer.add_weight` method instead.
WARNING:tensorflow:<tensorflow.python.ops.rnn_cell_impl.LSTMCell object at 0x7efb68178a58>: Note that this cell is not optimized for performance. Please use tf.contrib.cudnn_rnn.CudnnLSTM for better performance on GPU.
W0515 15:31:52.296550 139620787164992 rnn_cell_impl.py:904] <tensorflow.python.ops.rnn_cell_impl.LSTMCell object at 0x7efb68178a58>: Note that this cell is not optimized for performance. Please use tf.contrib.cudnn_rnn.CudnnLSTM for better performance on GPU.
WARNING:tensorflow:<tensorflow.python.ops.rnn_cell_impl.LSTMCell object at 0x7efb680d55c0>: Note that this cell is not optimized for performance. Please use tf.contrib.cudnn_rnn.CudnnLSTM for better performance on GPU.
W0515 15:31:52.446605 139620787164992 rnn_cell_impl.py:904] <tensorflow.python.ops.rnn_cell_impl.LSTMCell object at 0x7efb680d55c0>: Note that this cell is not optimized for performance. Please use tf.contrib.cudnn_rnn.CudnnLSTM for better performance on GPU.
WARNING:tensorflow:<tensorflow.python.ops.rnn_cell_impl.LSTMCell object at 0x7efb605a5be0>: Note that this cell is not optimized for performance. Please use tf.contrib.cudnn_rnn.CudnnLSTM for better performance on GPU.
W0515 15:31:52.491398 139620787164992 rnn_cell_impl.py:904] <tensorflow.python.ops.rnn_cell_impl.LSTMCell object at 0x7efb605a5be0>: Note that this cell is not optimized for performance. Please use tf.contrib.cudnn_rnn.CudnnLSTM for better performance on GPU.
WARNING:tensorflow:<tensorflow.python.ops.rnn_cell_impl.LSTMCell object at 0x7efb605af5f8>: Note that this cell is not optimized for performance. Please use tf.contrib.cudnn_rnn.CudnnLSTM for better performance on GPU.
W0515 15:31:52.582461 139620787164992 rnn_cell_impl.py:904] <tensorflow.python.ops.rnn_cell_impl.LSTMCell object at 0x7efb605af5f8>: Note that this cell is not optimized for performance. Please use tf.contrib.cudnn_rnn.CudnnLSTM for better performance on GPU.
WARNING:tensorflow:<tensorflow.python.ops.rnn_cell_impl.LSTMCell object at 0x7efb605d99b0>: Note that this cell is not optimized for performance. Please use tf.contrib.cudnn_rnn.CudnnLSTM for better performance on GPU.
W0515 15:31:52.624663 139620787164992 rnn_cell_impl.py:904] <tensorflow.python.ops.rnn_cell_impl.LSTMCell object at 0x7efb605d99b0>: Note that this cell is not optimized for performance. Please use tf.contrib.cudnn_rnn.CudnnLSTM for better performance on GPU.
WARNING:tensorflow:<tensorflow.python.ops.rnn_cell_impl.LSTMCell object at 0x7efb604eb278>: Note that this cell is not optimized for performance. Please use tf.contrib.cudnn_rnn.CudnnLSTM for better performance on GPU.
W0515 15:31:52.666759 139620787164992 rnn_cell_impl.py:904] <tensorflow.python.ops.rnn_cell_impl.LSTMCell object at 0x7efb604eb278>: Note that this cell is not optimized for performance. Please use tf.contrib.cudnn_rnn.CudnnLSTM for better performance on GPU.
WARNING:tensorflow:<tensorflow.python.ops.rnn_cell_impl.LSTMCell object at 0x7efb60453828>: Note that this cell is not optimized for performance. Please use tf.contrib.cudnn_rnn.CudnnLSTM for better performance on GPU.
W0515 15:31:52.709095 139620787164992 rnn_cell_impl.py:904] <tensorflow.python.ops.rnn_cell_impl.LSTMCell object at 0x7efb60453828>: Note that this cell is not optimized for performance. Please use tf.contrib.cudnn_rnn.CudnnLSTM for better performance on GPU.
WARNING:tensorflow:<tensorflow.python.ops.rnn_cell_impl.LSTMCell object at 0x7efb603ab8d0>: Note that this cell is not optimized for performance. Please use tf.contrib.cudnn_rnn.CudnnLSTM for better performance on GPU.
W0515 15:31:53.044921 139620787164992 rnn_cell_impl.py:904] <tensorflow.python.ops.rnn_cell_impl.LSTMCell object at 0x7efb603ab8d0>: Note that this cell is not optimized for performance. Please use tf.contrib.cudnn_rnn.CudnnLSTM for better performance on GPU.
WARNING:tensorflow:<tensorflow.python.ops.rnn_cell_impl.LSTMCell object at 0x7efb60047320>: Note that this cell is not optimized for performance. Please use tf.contrib.cudnn_rnn.CudnnLSTM for better performance on GPU.
W0515 15:31:53.096271 139620787164992 rnn_cell_impl.py:904] <tensorflow.python.ops.rnn_cell_impl.LSTMCell object at 0x7efb60047320>: Note that this cell is not optimized for performance. Please use tf.contrib.cudnn_rnn.CudnnLSTM for better performance on GPU.
I0515 15:31:53.420779 139620787164992 run_common_voice.py:440] Using word-piece encoder with vocab size: 4088
Model: "encoder"
_________________________________________________________________
Layer (type)                 Output Shape              Param #
=================================================================
input_1 (InputLayer)         [(None, None, 80)]        0
_________________________________________________________________
rnn (RNN)                    (None, None, 640)         7217152
_________________________________________________________________
layer_normalization (LayerNo (None, None, 640)         1280
_________________________________________________________________
rnn_1 (RNN)                  (None, None, 640)         11804672
_________________________________________________________________
layer_normalization_1 (Layer (None, None, 640)         1280
_________________________________________________________________
time_reduction (TimeReductio (None, None, 1280)        0
_________________________________________________________________
rnn_2 (RNN)                  (None, None, 640)         17047552
_________________________________________________________________
layer_normalization_2 (Layer (None, None, 640)         1280
_________________________________________________________________
rnn_3 (RNN)                  (None, None, 640)         11804672
_________________________________________________________________
layer_normalization_3 (Layer (None, None, 640)         1280
_________________________________________________________________
rnn_4 (RNN)                  (None, None, 640)         11804672
_________________________________________________________________
layer_normalization_4 (Layer (None, None, 640)         1280
_________________________________________________________________
rnn_5 (RNN)                  (None, None, 640)         11804672
_________________________________________________________________
layer_normalization_5 (Layer (None, None, 640)         1280
_________________________________________________________________
rnn_6 (RNN)                  (None, None, 640)         11804672
_________________________________________________________________
layer_normalization_6 (Layer (None, None, 640)         1280
_________________________________________________________________
rnn_7 (RNN)                  (None, None, 640)         11804672
_________________________________________________________________
layer_normalization_7 (Layer (None, None, 640)         1280
=================================================================
Total params: 95,102,976
Trainable params: 95,102,976
Non-trainable params: 0
_________________________________________________________________
Model: "prediction_network"
_________________________________________________________________
Layer (type)                 Output Shape              Param #
=================================================================
input_2 (InputLayer)         [(None, None)]            0
_________________________________________________________________
embedding (Embedding)        (None, None, 384)         1569792
_________________________________________________________________
rnn_8 (RNN)                  (None, None, 640)         9707520
_________________________________________________________________
layer_normalization_8 (Layer (None, None, 640)         1280
_________________________________________________________________
rnn_9 (RNN)                  (None, None, 640)         11804672
_________________________________________________________________
layer_normalization_9 (Layer (None, None, 640)         1280
=================================================================
Total params: 23,084,544
Trainable params: 23,084,544
Non-trainable params: 0
_________________________________________________________________
Model: "transducer"
__________________________________________________________________________________________________
Layer (type)                    Output Shape         Param #     Connected to
==================================================================================================
mel_specs (InputLayer)          [(None, None, 80)]   0
__________________________________________________________________________________________________
pred_inp (InputLayer)           [(None, None)]       0
__________________________________________________________________________________________________
encoder (Model)                 (None, None, 640)    95102976    mel_specs[0][0]
__________________________________________________________________________________________________
prediction_network (Model)      (None, None, 640)    23084544    pred_inp[0][0]
__________________________________________________________________________________________________
tf_op_layer_Shape (TensorFlowOp [(3,)]               0           prediction_network[1][0]
__________________________________________________________________________________________________
tf_op_layer_Shape_1 (TensorFlow [(3,)]               0           encoder[1][0]
__________________________________________________________________________________________________
tf_op_layer_strided_slice (Tens [()]                 0           tf_op_layer_Shape[0][0]
__________________________________________________________________________________________________
tf_op_layer_strided_slice_1 (Te [()]                 0           tf_op_layer_Shape_1[0][0]
__________________________________________________________________________________________________
tf_op_layer_ExpandDims (TensorF [(None, None, 1, 640 0           encoder[1][0]
__________________________________________________________________________________________________
tf_op_layer_stack_inp_enc (Tens [(4,)]               0           tf_op_layer_strided_slice[0][0]
__________________________________________________________________________________________________
tf_op_layer_ExpandDims_1 (Tenso [(None, 1, None, 640 0           prediction_network[1][0]
__________________________________________________________________________________________________
tf_op_layer_stack_pred_out (Ten [(4,)]               0           tf_op_layer_strided_slice_1[0][0]
__________________________________________________________________________________________________
tf_op_layer_Tile (TensorFlowOpL [(None, None, None,  0           tf_op_layer_ExpandDims[0][0]
                                                                 tf_op_layer_stack_inp_enc[0][0]
__________________________________________________________________________________________________
tf_op_layer_Tile_1 (TensorFlowO [(None, None, None,  0           tf_op_layer_ExpandDims_1[0][0]
                                                                 tf_op_layer_stack_pred_out[0][0]
__________________________________________________________________________________________________
concatenate (Concatenate)       (None, None, None, 1 0           tf_op_layer_Tile[0][0]
                                                                 tf_op_layer_Tile_1[0][0]
__________________________________________________________________________________________________
dense (Dense)                   (None, None, None, 6 819840      concatenate[0][0]
__________________________________________________________________________________________________
dense_1 (Dense)                 (None, None, None, 4 2620408     dense[0][0]
==================================================================================================
Total params: 121,627,768
Trainable params: 121,627,768
Non-trainable params: 0
__________________________________________________________________________________________________
Starting training.
2020-05-15 15:32:02.296027: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcublas.so.10
2020-05-15 15:32:02.637153: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcudnn.so.7
2020-05-15 15:32:04.440403: I tensorflow/stream_executor/cuda/cuda_driver.cc:801] failed to allocate 1.79G (1918828544 bytes) from device: CUDA_ERROR_OUT_OF_MEMORY: out of memory
2020-05-15 15:32:04.441010: I tensorflow/stream_executor/cuda/cuda_driver.cc:801] failed to allocate 1.61G (1726945792 bytes) from device: CUDA_ERROR_OUT_OF_MEMORY: out of memory
WARNING:tensorflow:Using MirroredStrategy eagerly has significant overhead currently. We will be working on improving this in the future, but for now please wrap `call_for_each_replica` or `experimental_run` or `experimental_run_v2` inside a tf.function to get the best performance.
W0515 15:32:05.760291 139620787164992 mirrored_strategy.py:692] Using MirroredStrategy eagerly has significant overhead currently. We will be working on improving this in the future, but for now please wrap `call_for_each_replica` or `experimental_run` or `experimental_run_v2` inside a tf.function to get the best performance.
INFO:tensorflow:Reduce to /job:localhost/replica:0/task:0/device:CPU:0 then broadcast to ('/job:localhost/replica:0/task:0/device:CPU:0',).
I0515 15:32:05.765215 139620787164992 cross_device_ops.py:439] Reduce to /job:localhost/replica:0/task:0/device:CPU:0 then broadcast to ('/job:localhost/replica:0/task:0/device:CPU:0',).
INFO:tensorflow:Reduce to /job:localhost/replica:0/task:0/device:CPU:0 then broadcast to ('/job:localhost/replica:0/task:0/device:CPU:0',).
I0515 15:32:05.766103 139620787164992 cross_device_ops.py:439] Reduce to /job:localhost/replica:0/task:0/device:CPU:0 then broadcast to ('/job:localhost/replica:0/task:0/device:CPU:0',).
INFO:tensorflow:Reduce to /job:localhost/replica:0/task:0/device:CPU:0 then broadcast to ('/job:localhost/replica:0/task:0/device:CPU:0',).
I0515 15:32:05.782319 139620787164992 cross_device_ops.py:439] Reduce to /job:localhost/replica:0/task:0/device:CPU:0 then broadcast to ('/job:localhost/replica:0/task:0/device:CPU:0',).
INFO:tensorflow:Reduce to /job:localhost/replica:0/task:0/device:CPU:0 then broadcast to ('/job:localhost/replica:0/task:0/device:CPU:0',).
I0515 15:32:05.783069 139620787164992 cross_device_ops.py:439] Reduce to /job:localhost/replica:0/task:0/device:CPU:0 then broadcast to ('/job:localhost/replica:0/task:0/device:CPU:0',).
Epoch: 0, Batch: 0, Global Step: 0, Step Time: 11.8903, Loss: 879.1518
INFO:tensorflow:Reduce to /job:localhost/replica:0/task:0/device:CPU:0 then broadcast to ('/job:localhost/replica:0/task:0/device:CPU:0',).
I0515 15:32:05.785687 139620787164992 cross_device_ops.py:439] Reduce to /job:localhost/replica:0/task:0/device:CPU:0 then broadcast to ('/job:localhost/replica:0/task:0/device:CPU:0',).
INFO:tensorflow:Reduce to /job:localhost/replica:0/task:0/device:CPU:0 then broadcast to ('/job:localhost/replica:0/task:0/device:CPU:0',).
I0515 15:32:05.786294 139620787164992 cross_device_ops.py:439] Reduce to /job:localhost/replica:0/task:0/device:CPU:0 then broadcast to ('/job:localhost/replica:0/task:0/device:CPU:0',).
WARNING:tensorflow:Using MirroredStrategy eagerly has significant overhead currently. We will be working on improving this in the future, but for now please wrap `call_for_each_replica` or `experimental_run` or `experimental_run_v2` inside a tf.function to get the best performance.
W0515 15:32:12.543942 139620787164992 mirrored_strategy.py:692] Using MirroredStrategy eagerly has significant overhead currently. We will be working on improving this in the future, but for now please wrap `call_for_each_replica` or `experimental_run` or `experimental_run_v2` inside a tf.function to get the best performance.
INFO:tensorflow:Reduce to /job:localhost/replica:0/task:0/device:CPU:0 then broadcast to ('/job:localhost/replica:0/task:0/device:CPU:0',).
I0515 15:32:12.546604 139620787164992 cross_device_ops.py:439] Reduce to /job:localhost/replica:0/task:0/device:CPU:0 then broadcast to ('/job:localhost/replica:0/task:0/device:CPU:0',).
INFO:tensorflow:Reduce to /job:localhost/replica:0/task:0/device:CPU:0 then broadcast to ('/job:localhost/replica:0/task:0/device:CPU:0',).
I0515 15:32:12.547358 139620787164992 cross_device_ops.py:439] Reduce to /job:localhost/replica:0/task:0/device:CPU:0 then broadcast to ('/job:localhost/replica:0/task:0/device:CPU:0',).
INFO:tensorflow:Reduce to /job:localhost/replica:0/task:0/device:CPU:0 then broadcast to ('/job:localhost/replica:0/task:0/device:CPU:0',).
I0515 15:32:12.551557 139620787164992 cross_device_ops.py:439] Reduce to /job:localhost/replica:0/task:0/device:CPU:0 then broadcast to ('/job:localhost/replica:0/task:0/device:CPU:0',).
INFO:tensorflow:Reduce to /job:localhost/replica:0/task:0/device:CPU:0 then broadcast to ('/job:localhost/replica:0/task:0/device:CPU:0',).
I0515 15:32:12.552463 139620787164992 cross_device_ops.py:439] Reduce to /job:localhost/replica:0/task:0/device:CPU:0 then broadcast to ('/job:localhost/replica:0/task:0/device:CPU:0',).
Epoch: 0, Batch: 1, Global Step: 1, Step Time: 6.7504, Loss: 706.5315
WARNING:tensorflow:Using MirroredStrategy eagerly has significant overhead currently. We will be working on improving this in the future, but for now please wrap `call_for_each_replica` or `experimental_run` or `experimental_run_v2` inside a tf.function to get the best performance.
W0515 15:32:19.707470 139620787164992 mirrored_strategy.py:692] Using MirroredStrategy eagerly has significant overhead currently. We will be working on improving this in the future, but for now please wrap `call_for_each_replica` or `experimental_run` or `experimental_run_v2` inside a tf.function to get the best performance.
Epoch: 0, Batch: 2, Global Step: 2, Step Time: 7.1468, Loss: 525.4813
WARNING:tensorflow:Using MirroredStrategy eagerly has significant overhead currently. We will be working on improving this in the future, but for now please wrap `call_for_each_replica` or `experimental_run` or `experimental_run_v2` inside a tf.function to get the best performance.
W0515 15:32:21.486869 139620787164992 mirrored_strategy.py:692] Using MirroredStrategy eagerly has significant overhead currently. We will be working on improving this in the future, but for now please wrap `call_for_each_replica` or `experimental_run` or `experimental_run_v2` inside a tf.function to get the best performance.
Epoch: 0, Batch: 3, Global Step: 3, Step Time: 1.7605, Loss: 447.7578
WARNING:tensorflow:Using MirroredStrategy eagerly has significant overhead currently. We will be working on improving this in the future, but for now please wrap `call_for_each_replica` or `experimental_run` or `experimental_run_v2` inside a tf.function to get the best performance.
W0515 15:32:23.308502 139620787164992 mirrored_strategy.py:692] Using MirroredStrategy eagerly has significant overhead currently. We will be working on improving this in the future, but for now please wrap `call_for_each_replica` or `experimental_run` or `experimental_run_v2` inside a tf.function to get the best performance.
Epoch: 0, Batch: 4, Global Step: 4, Step Time: 1.8101, Loss: 411.6488
Epoch: 0, Batch: 5, Global Step: 5, Step Time: 1.9582, Loss: 377.7440
Epoch: 0, Batch: 6, Global Step: 6, Step Time: 2.1310, Loss: 356.5850
Epoch: 0, Batch: 7, Global Step: 7, Step Time: 1.6160, Loss: 327.3580
Epoch: 0, Batch: 8, Global Step: 8, Step Time: 1.6567, Loss: 309.9092
Epoch: 0, Batch: 9, Global Step: 9, Step Time: 1.5911, Loss: 296.6859
Epoch: 0, Batch: 10, Global Step: 10, Step Time: 1.7814, Loss: 287.7508
Epoch: 0, Batch: 11, Global Step: 11, Step Time: 1.2450, Loss: 277.5056
Epoch: 0, Batch: 12, Global Step: 12, Step Time: 1.9164, Loss: 270.2157
Epoch: 0, Batch: 13, Global Step: 13, Step Time: 1.7441, Loss: 263.7001
Epoch: 0, Batch: 14, Global Step: 14, Step Time: 1.0338, Loss: 253.4502
Epoch: 0, Batch: 15, Global Step: 15, Step Time: 1.6319, Loss: 246.9501
Epoch: 0, Batch: 16, Global Step: 16, Step Time: 1.2650, Loss: 239.8391
Epoch: 0, Batch: 17, Global Step: 17, Step Time: 1.1824, Loss: 234.5128
Epoch: 0, Batch: 18, Global Step: 18, Step Time: 1.0047, Loss: 228.2357

*TRUNCATED*
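
For anyone triaging a long log like this one, here is a minimal sketch for finding the first step where the loss stops being sane. It assumes the per-step line format shown above (`Global Step: N, Step Time: T, Loss: X`). It treats a negative loss as a failure too, since the transducer loss is a negative log-likelihood and should not drop below zero. The names `STEP_RE` and `first_bad_step` are made up for illustration and are not part of this repository.

```python
import math
import re

# Hypothetical helper, not part of the repo: matches the per-step lines
# printed by the training loop in the log above.
STEP_RE = re.compile(r"Global Step: (\d+), Step Time: [\d.]+, Loss: (\S+)")


def first_bad_step(log_text):
    """Return (step, loss) for the first step whose loss is NaN,
    infinite, or negative; return None if every parsed loss looks sane."""
    for line in log_text.splitlines():
        match = STEP_RE.search(line)
        if match is None:
            continue  # skip TensorFlow INFO/WARNING lines
        step = int(match.group(1))
        loss = float(match.group(2))  # float() also parses "nan"
        if not math.isfinite(loss) or loss < 0:
            return step, loss
    return None


if __name__ == "__main__":
    sample = (
        "Epoch: 0, Batch: 17, Global Step: 17, Step Time: 1.1824, Loss: 234.5128\n"
        "Epoch: 0, Batch: 18, Global Step: 18, Step Time: 1.0047, Loss: 228.2357\n"
    )
    print(first_bad_step(sample))
```

Running it over the full training output (for example, the saved stdout of `run_common_voice.py`) would point at the step where the loss first goes negative, which in the original report happens well before the first `nan`.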