kyzhouhzau / BERT-NER

Use Google's BERT for named entity recognition (CoNLL-2003 as the dataset).
MIT License

'Error recorded from outfeed: Session is closed.' Error with TPU #71

Open FallakAsad opened 5 years ago

FallakAsad commented 5 years ago

I am trying to fine-tune BERT on my custom dataset. Training works fine on a GPU, but I am unable to run it on Google Colab with a TPU. I see the following error when running with use_tpu=true. I run the script with the following parameters:

!python BERT_NER.py \
  --task_name="NER" \
  --do_lower_case=False \
  --crf=True \
  --do_train=True \
  --do_eval=False \
  --do_predict=True \
  --data_dir=\
  --do_lower_case=False \
  --vocab_file=gs:///vocab.txt \
  --bert_config_file=gs:///bert_config.json \
  --init_checkpoint=gs:///bert_model.ckpt \
  --max_seq_length=512 \
  --train_batch_size=16 \
  --learning_rate=2e-5 \
  --num_train_epochs=5 \
  --output_dir=gs:///bert-ner-output \
  --use_tpu=true \
  --tpu_name={TPU_WORKER} \
  --num_hosts=1 \
  --num_core_per_host=8
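As a quick sanity check on these flags, the sketch below reproduces two numbers that appear in the log that follows: "Num steps = 2762" and the per-feature shape (2, 512). It is only an illustration. The step formula and the even split of the global batch across TPU cores are assumptions about what the script and the TPU estimator do, and the helper names `total_train_steps` and `per_core_batch_size` are hypothetical, not taken from BERT_NER.py.

```python
# Values copied from the command and the log in this report.
num_examples = 8839      # "Writing example 0 of 8839"
train_batch_size = 16    # --train_batch_size
num_train_epochs = 5     # --num_train_epochs
num_tpu_cores = 8        # --num_hosts=1 with --num_core_per_host=8


def total_train_steps(examples, batch_size, epochs):
    """Assumed formula: steps over all epochs, truncated toward zero."""
    return int(examples / batch_size * epochs)


def per_core_batch_size(global_batch_size, num_cores):
    """Assumed behaviour: the global batch is split evenly across cores."""
    if global_batch_size % num_cores != 0:
        raise ValueError("global batch size must be divisible by the core count")
    return global_batch_size // num_cores


steps = total_train_steps(num_examples, train_batch_size, num_train_epochs)
per_core = per_core_batch_size(train_batch_size, num_tpu_cores)

print("train steps:", steps)            # matches "Num steps = 2762" in the log
print("per-core batch size:", per_core) # matches shape = (2, 512) in the log
```

With a per-core batch of 2 and max_seq_length=512, each core processes very long sequences in a very small batch, which may be worth keeping in mind when comparing the TPU run with the GPU run.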



I0925 16:08:41.992299 140526175889280 BERT_NER.py:345] Writing example 0 of 8839
I0925 16:08:41.993903 140526175889280 BERT_NER.py:322] *** Example ***
I0925 16:08:41.994004 140526175889280 BERT_NER.py:323] guid: train-0
I0925 16:08:41.994082 140526175889280 BERT_NER.py:325] <EXAMPLES HERE WERE PRINTED>
I0925 16:08:49.373982 140526175889280 BERT_NER.py:345] Writing example 5000 of 8839
I0925 16:08:54.882352 140526175889280 BERT_NER.py:652] ***** Running training *****
I0925 16:08:54.882594 140526175889280 BERT_NER.py:653]   Num examples = 8839
I0925 16:08:54.882702 140526175889280 BERT_NER.py:654]   Batch size = 16
I0925 16:08:54.882768 140526175889280 BERT_NER.py:655]   Num steps = 2762
WARNING:tensorflow:From BERT_NER.py:369: The name tf.FixedLenFeature is deprecated. Please use tf.io.FixedLenFeature instead.

W0925 16:08:54.882921 140526175889280 deprecation_wrapper.py:119] From BERT_NER.py:369: The name tf.FixedLenFeature is deprecated. Please use tf.io.FixedLenFeature instead.

INFO:tensorflow:Querying Tensorflow master (grpc://10.123.58.138:8470) for TPU system metadata.
I0925 16:08:54.981006 140526175889280 tpu_system_metadata.py:78] Querying Tensorflow master (grpc://10.123.58.138:8470) for TPU system metadata.
2019-09-25 16:08:54.982253: W tensorflow/core/distributed_runtime/rpc/grpc_session.cc:356] GrpcSession::ListDevices will initialize the session with an empty graph and other defaults because the session has not yet been created.
INFO:tensorflow:Found TPU system:
I0925 16:08:54.994621 140526175889280 tpu_system_metadata.py:148] Found TPU system:
INFO:tensorflow:*** Num TPU Cores: 8
I0925 16:08:54.994820 140526175889280 tpu_system_metadata.py:149] *** Num TPU Cores: 8
INFO:tensorflow:*** Num TPU Workers: 1
I0925 16:08:54.994925 140526175889280 tpu_system_metadata.py:150] *** Num TPU Workers: 1
INFO:tensorflow:*** Num TPU Cores Per Worker: 8
I0925 16:08:54.995013 140526175889280 tpu_system_metadata.py:152] *** Num TPU Cores Per Worker: 8
INFO:tensorflow:*** Available Device: _DeviceAttributes(/job:worker/replica:0/task:0/device:CPU:0, CPU, -1, 6259965955160652075)
I0925 16:08:54.995089 140526175889280 tpu_system_metadata.py:154] *** Available Device: _DeviceAttributes(/job:worker/replica:0/task:0/device:CPU:0, CPU, -1, 6259965955160652075)
INFO:tensorflow:*** Available Device: _DeviceAttributes(/job:worker/replica:0/task:0/device:TPU:0, TPU, 17179869184, 3867316546800303253)
I0925 16:08:54.995739 140526175889280 tpu_system_metadata.py:154] *** Available Device: _DeviceAttributes(/job:worker/replica:0/task:0/device:TPU:0, TPU, 17179869184, 3867316546800303253)
INFO:tensorflow:*** Available Device: _DeviceAttributes(/job:worker/replica:0/task:0/device:TPU:1, TPU, 17179869184, 7415059077902910229)
I0925 16:08:54.995835 140526175889280 tpu_system_metadata.py:154] *** Available Device: _DeviceAttributes(/job:worker/replica:0/task:0/device:TPU:1, TPU, 17179869184, 7415059077902910229)
INFO:tensorflow:*** Available Device: _DeviceAttributes(/job:worker/replica:0/task:0/device:TPU:2, TPU, 17179869184, 1255798912883217047)
I0925 16:08:54.995918 140526175889280 tpu_system_metadata.py:154] *** Available Device: _DeviceAttributes(/job:worker/replica:0/task:0/device:TPU:2, TPU, 17179869184, 1255798912883217047)
INFO:tensorflow:*** Available Device: _DeviceAttributes(/job:worker/replica:0/task:0/device:TPU:3, TPU, 17179869184, 4232959688429871737)
I0925 16:08:54.995996 140526175889280 tpu_system_metadata.py:154] *** Available Device: _DeviceAttributes(/job:worker/replica:0/task:0/device:TPU:3, TPU, 17179869184, 4232959688429871737)
INFO:tensorflow:*** Available Device: _DeviceAttributes(/job:worker/replica:0/task:0/device:TPU:4, TPU, 17179869184, 148906076188128027)
I0925 16:08:54.996069 140526175889280 tpu_system_metadata.py:154] *** Available Device: _DeviceAttributes(/job:worker/replica:0/task:0/device:TPU:4, TPU, 17179869184, 148906076188128027)
INFO:tensorflow:*** Available Device: _DeviceAttributes(/job:worker/replica:0/task:0/device:TPU:5, TPU, 17179869184, 16880759925504833073)
I0925 16:08:54.996144 140526175889280 tpu_system_metadata.py:154] *** Available Device: _DeviceAttributes(/job:worker/replica:0/task:0/device:TPU:5, TPU, 17179869184, 16880759925504833073)
INFO:tensorflow:*** Available Device: _DeviceAttributes(/job:worker/replica:0/task:0/device:TPU:6, TPU, 17179869184, 17262323872149031664)
I0925 16:08:54.996220 140526175889280 tpu_system_metadata.py:154] *** Available Device: _DeviceAttributes(/job:worker/replica:0/task:0/device:TPU:6, TPU, 17179869184, 17262323872149031664)
INFO:tensorflow:*** Available Device: _DeviceAttributes(/job:worker/replica:0/task:0/device:TPU:7, TPU, 17179869184, 9740007696567887286)
I0925 16:08:54.996294 140526175889280 tpu_system_metadata.py:154] *** Available Device: _DeviceAttributes(/job:worker/replica:0/task:0/device:TPU:7, TPU, 17179869184, 9740007696567887286)
INFO:tensorflow:*** Available Device: _DeviceAttributes(/job:worker/replica:0/task:0/device:TPU_SYSTEM:0, TPU_SYSTEM, 8589934592, 6514551530700308569)
I0925 16:08:54.996368 140526175889280 tpu_system_metadata.py:154] *** Available Device: _DeviceAttributes(/job:worker/replica:0/task:0/device:TPU_SYSTEM:0, TPU_SYSTEM, 8589934592, 6514551530700308569)
INFO:tensorflow:*** Available Device: _DeviceAttributes(/job:worker/replica:0/task:0/device:XLA_CPU:0, XLA_CPU, 17179869184, 987196198784000216)
I0925 16:08:54.996443 140526175889280 tpu_system_metadata.py:154] *** Available Device: _DeviceAttributes(/job:worker/replica:0/task:0/device:XLA_CPU:0, XLA_CPU, 17179869184, 987196198784000216)
WARNING:tensorflow:From /usr/local/lib/python3.6/dist-packages/tensorflow/python/training/training_util.py:236: Variable.initialized_value (from tensorflow.python.ops.variables) is deprecated and will be removed in a future version.
Instructions for updating:
Use Variable.read_value. Variables in 2.X are initialized automatically both in eager and graph (inside tf.defun) contexts.
W0925 16:08:55.022911 140526175889280 deprecation.py:323] From /usr/local/lib/python3.6/dist-packages/tensorflow/python/training/training_util.py:236: Variable.initialized_value (from tensorflow.python.ops.variables) is deprecated and will be removed in a future version.
Instructions for updating:
Use Variable.read_value. Variables in 2.X are initialized automatically both in eager and graph (inside tf.defun) contexts.
INFO:tensorflow:Calling model_fn.
I0925 16:08:55.034002 140526175889280 estimator.py:1145] Calling model_fn.
WARNING:tensorflow:From BERT_NER.py:396: map_and_batch (from tensorflow.python.data.experimental.ops.batching) is deprecated and will be removed in a future version.
Instructions for updating:
Use `tf.data.Dataset.map(map_func, num_parallel_calls)` followed by `tf.data.Dataset.batch(batch_size, drop_remainder)`. Static tf.data optimizations will take care of using the fused implementation.
W0925 16:08:55.051713 140526175889280 deprecation.py:323] From BERT_NER.py:396: map_and_batch (from tensorflow.python.data.experimental.ops.batching) is deprecated and will be removed in a future version.
Instructions for updating:
Use `tf.data.Dataset.map(map_func, num_parallel_calls)` followed by `tf.data.Dataset.batch(batch_size, drop_remainder)`. Static tf.data optimizations will take care of using the fused implementation.
WARNING:tensorflow:From BERT_NER.py:379: The name tf.parse_single_example is deprecated. Please use tf.io.parse_single_example instead.

W0925 16:08:55.053081 140526175889280 deprecation_wrapper.py:119] From BERT_NER.py:379: The name tf.parse_single_example is deprecated. Please use tf.io.parse_single_example instead.

WARNING:tensorflow:From BERT_NER.py:383: to_int32 (from tensorflow.python.ops.math_ops) is deprecated and will be removed in a future version.
Instructions for updating:
Use `tf.cast` instead.
W0925 16:08:55.056324 140526175889280 deprecation.py:323] From BERT_NER.py:383: to_int32 (from tensorflow.python.ops.math_ops) is deprecated and will be removed in a future version.
Instructions for updating:
Use `tf.cast` instead.
I0925 16:08:55.139652 140526175889280 BERT_NER.py:481] *** Features ***
I0925 16:08:55.139906 140526175889280 BERT_NER.py:483]   name = input_ids, shape = (2, 512)
I0925 16:08:55.140040 140526175889280 BERT_NER.py:483]   name = label_ids, shape = (2, 512)
I0925 16:08:55.140140 140526175889280 BERT_NER.py:483]   name = mask, shape = (2, 512)
I0925 16:08:55.140238 140526175889280 BERT_NER.py:483]   name = segment_ids, shape = (2, 512)
WARNING:tensorflow:From /content/BERT-NER/BERT-NER/bert/modeling.py:171: The name tf.variable_scope is deprecated. Please use tf.compat.v1.variable_scope instead.

W0925 16:08:55.140578 140526175889280 deprecation_wrapper.py:119] From /content/BERT-NER/BERT-NER/bert/modeling.py:171: The name tf.variable_scope is deprecated. Please use tf.compat.v1.variable_scope instead.

WARNING:tensorflow:From /content/BERT-NER/BERT-NER/bert/modeling.py:410: The name tf.get_variable is deprecated. Please use tf.compat.v1.get_variable instead.

W0925 16:08:55.143127 140526175889280 deprecation_wrapper.py:119] From /content/BERT-NER/BERT-NER/bert/modeling.py:410: The name tf.get_variable is deprecated. Please use tf.compat.v1.get_variable instead.

WARNING:tensorflow:From /content/BERT-NER/BERT-NER/bert/modeling.py:491: The name tf.assert_less_equal is deprecated. Please use tf.compat.v1.assert_less_equal instead.

W0925 16:08:55.295414 140526175889280 deprecation_wrapper.py:119] From /content/BERT-NER/BERT-NER/bert/modeling.py:491: The name tf.assert_less_equal is deprecated. Please use tf.compat.v1.assert_less_equal instead.

WARNING:tensorflow:From /content/BERT-NER/BERT-NER/bert/modeling.py:359: calling dropout (from tensorflow.python.ops.nn_ops) with keep_prob is deprecated and will be removed in a future version.
Instructions for updating:
Please use `rate` instead of `keep_prob`. Rate should be set to `rate = 1 - keep_prob`.
W0925 16:08:55.335629 140526175889280 deprecation.py:506] From /content/BERT-NER/BERT-NER/bert/modeling.py:359: calling dropout (from tensorflow.python.ops.nn_ops) with keep_prob is deprecated and will be removed in a future version.
Instructions for updating:
Please use `rate` instead of `keep_prob`. Rate should be set to `rate = 1 - keep_prob`.
WARNING:tensorflow:From /content/BERT-NER/BERT-NER/bert/modeling.py:672: dense (from tensorflow.python.layers.core) is deprecated and will be removed in a future version.
Instructions for updating:
Use keras.layers.dense instead.
W0925 16:08:55.356273 140526175889280 deprecation.py:323] From /content/BERT-NER/BERT-NER/bert/modeling.py:672: dense (from tensorflow.python.layers.core) is deprecated and will be removed in a future version.
Instructions for updating:
Use keras.layers.dense instead.
WARNING:tensorflow:From /usr/local/lib/python3.6/dist-packages/tensorflow/python/ops/init_ops.py:1251: calling VarianceScaling.__init__ (from tensorflow.python.ops.init_ops) with dtype is deprecated and will be removed in a future version.
Instructions for updating:
Call initializer instance with the dtype argument instead of passing it to the constructor
W0925 16:08:58.509919 140526175889280 deprecation.py:506] From /usr/local/lib/python3.6/dist-packages/tensorflow/python/ops/init_ops.py:1251: calling VarianceScaling.__init__ (from tensorflow.python.ops.init_ops) with dtype is deprecated and will be removed in a future version.
Instructions for updating:
Call initializer instance with the dtype argument instead of passing it to the constructor
WARNING:tensorflow:From /usr/local/lib/python3.6/dist-packages/tensorflow/contrib/crf/python/ops/crf.py:213: dynamic_rnn (from tensorflow.python.ops.rnn) is deprecated and will be removed in a future version.
Instructions for updating:
Please use `keras.layers.RNN(cell)`, which is equivalent to this API
W0925 16:08:58.630024 140526175889280 deprecation.py:323] From /usr/local/lib/python3.6/dist-packages/tensorflow/contrib/crf/python/ops/crf.py:213: dynamic_rnn (from tensorflow.python.ops.rnn) is deprecated and will be removed in a future version.
Instructions for updating:
Please use `keras.layers.RNN(cell)`, which is equivalent to this API
WARNING:tensorflow:From /usr/local/lib/python3.6/dist-packages/tensorflow/python/ops/math_ops.py:2403: add_dispatch_support.<locals>.wrapper (from tensorflow.python.ops.array_ops) is deprecated and will be removed in a future version.
Instructions for updating:
Use tf.where in 2.0, which has the same broadcast rule as np.where
W0925 16:08:58.696721 140526175889280 deprecation.py:323] From /usr/local/lib/python3.6/dist-packages/tensorflow/python/ops/math_ops.py:2403: add_dispatch_support.<locals>.wrapper (from tensorflow.python.ops.array_ops) is deprecated and will be removed in a future version.
Instructions for updating:
Use tf.where in 2.0, which has the same broadcast rule as np.where
WARNING:tensorflow:From BERT_NER.py:503: The name tf.train.init_from_checkpoint is deprecated. Please use tf.compat.v1.train.init_from_checkpoint instead.

W0925 16:08:59.315157 140526175889280 deprecation_wrapper.py:119] From BERT_NER.py:503: The name tf.train.init_from_checkpoint is deprecated. Please use tf.compat.v1.train.init_from_checkpoint instead.

I0925 16:09:00.364924 140526175889280 BERT_NER.py:512] **** Trainable Variables ****
I0925 16:09:00.365132 140526175889280 BERT_NER.py:519]   name = bert/embeddings/word_embeddings:0, shape = (119547, 768), *INIT_FROM_CKPT*
I0925 16:09:00.365248 140526175889280 BERT_NER.py:519]   name = bert/embeddings/token_type_embeddings:0, shape = (2, 768), *INIT_FROM_CKPT*
I0925 16:09:00.365333 140526175889280 BERT_NER.py:519]   name = bert/embeddings/position_embeddings:0, shape = (512, 768), *INIT_FROM_CKPT*
I0925 16:09:00.365411 140526175889280 BERT_NER.py:519]   name = bert/embeddings/LayerNorm/beta:0, shape = (768,), *INIT_FROM_CKPT*
I0925 16:09:00.365506 140526175889280 BERT_NER.py:519]   name = bert/embeddings/LayerNorm/gamma:0, shape = (768,), *INIT_FROM_CKPT*
I0925 16:09:00.365588 140526175889280 BERT_NER.py:519]   name = bert/encoder/layer_0/attention/self/query/kernel:0, shape = (768, 768), *INIT_FROM_CKPT*
I0925 16:09:00.365669 140526175889280 BERT_NER.py:519]   name = bert/encoder/layer_0/attention/self/query/bias:0, shape = (768,), *INIT_FROM_CKPT*
I0925 16:09:00.365746 140526175889280 BERT_NER.py:519]   name = bert/encoder/layer_0/attention/self/key/kernel:0, shape = (768, 768), *INIT_FROM_CKPT*
I0925 16:09:00.365823 140526175889280 BERT_NER.py:519]   name = bert/encoder/layer_0/attention/self/key/bias:0, shape = (768,), *INIT_FROM_CKPT*
I0925 16:09:00.365898 140526175889280 BERT_NER.py:519]   name = bert/encoder/layer_0/attention/self/value/kernel:0, shape = (768, 768), *INIT_FROM_CKPT*
I0925 16:09:00.365974 140526175889280 BERT_NER.py:519]   name = bert/encoder/layer_0/attention/self/value/bias:0, shape = (768,), *INIT_FROM_CKPT*
I0925 16:09:00.366052 140526175889280 BERT_NER.py:519]   name = bert/encoder/layer_0/attention/output/dense/kernel:0, shape = (768, 768), *INIT_FROM_CKPT*
I0925 16:09:00.366128 140526175889280 BERT_NER.py:519]   name = bert/encoder/layer_0/attention/output/dense/bias:0, shape = (768,), *INIT_FROM_CKPT*
I0925 16:09:00.366203 140526175889280 BERT_NER.py:519]   name = bert/encoder/layer_0/attention/output/LayerNorm/beta:0, shape = (768,), *INIT_FROM_CKPT*
I0925 16:09:00.366277 140526175889280 BERT_NER.py:519]   name = bert/encoder/layer_0/attention/output/LayerNorm/gamma:0, shape = (768,), *INIT_FROM_CKPT*
I0925 16:09:00.366351 140526175889280 BERT_NER.py:519]   name = bert/encoder/layer_0/intermediate/dense/kernel:0, shape = (768, 3072), *INIT_FROM_CKPT*
I0925 16:09:00.366427 140526175889280 BERT_NER.py:519]   name = bert/encoder/layer_0/intermediate/dense/bias:0, shape = (3072,), *INIT_FROM_CKPT*
I0925 16:09:00.366511 140526175889280 BERT_NER.py:519]   name = bert/encoder/layer_0/output/dense/kernel:0, shape = (3072, 768), *INIT_FROM_CKPT*
I0925 16:09:00.366588 140526175889280 BERT_NER.py:519]   name = bert/encoder/layer_0/output/dense/bias:0, shape = (768,), *INIT_FROM_CKPT*
I0925 16:09:00.366663 140526175889280 BERT_NER.py:519]   name = bert/encoder/layer_0/output/LayerNorm/beta:0, shape = (768,), *INIT_FROM_CKPT*
I0925 16:09:00.366736 140526175889280 BERT_NER.py:519]   name = bert/encoder/layer_0/output/LayerNorm/gamma:0, shape = (768,), *INIT_FROM_CKPT*
I0925 16:09:00.366808 140526175889280 BERT_NER.py:519]   name = bert/encoder/layer_1/attention/self/query/kernel:0, shape = (768, 768), *INIT_FROM_CKPT*
I0925 16:09:00.366881 140526175889280 BERT_NER.py:519]   name = bert/encoder/layer_1/attention/self/query/bias:0, shape = (768,), *INIT_FROM_CKPT*
I0925 16:09:00.366950 140526175889280 BERT_NER.py:519]   name = bert/encoder/layer_1/attention/self/key/kernel:0, shape = (768, 768), *INIT_FROM_CKPT*
I0925 16:09:00.367029 140526175889280 BERT_NER.py:519]   name = bert/encoder/layer_1/attention/self/key/bias:0, shape = (768,), *INIT_FROM_CKPT*
I0925 16:09:00.367101 140526175889280 BERT_NER.py:519]   name = bert/encoder/layer_1/attention/self/value/kernel:0, shape = (768, 768), *INIT_FROM_CKPT*
I0925 16:09:00.367176 140526175889280 BERT_NER.py:519]   name = bert/encoder/layer_1/attention/self/value/bias:0, shape = (768,), *INIT_FROM_CKPT*
I0925 16:09:00.367246 140526175889280 BERT_NER.py:519]   name = bert/encoder/layer_1/attention/output/dense/kernel:0, shape = (768, 768), *INIT_FROM_CKPT*
I0925 16:09:00.367319 140526175889280 BERT_NER.py:519]   name = bert/encoder/layer_1/attention/output/dense/bias:0, shape = (768,), *INIT_FROM_CKPT*
I0925 16:09:00.367389 140526175889280 BERT_NER.py:519]   name = bert/encoder/layer_1/attention/output/LayerNorm/beta:0, shape = (768,), *INIT_FROM_CKPT*
I0925 16:09:00.367469 140526175889280 BERT_NER.py:519]   name = bert/encoder/layer_1/attention/output/LayerNorm/gamma:0, shape = (768,), *INIT_FROM_CKPT*
I0925 16:09:00.367541 140526175889280 BERT_NER.py:519]   name = bert/encoder/layer_1/intermediate/dense/kernel:0, shape = (768, 3072), *INIT_FROM_CKPT*
I0925 16:09:00.367618 140526175889280 BERT_NER.py:519]   name = bert/encoder/layer_1/intermediate/dense/bias:0, shape = (3072,), *INIT_FROM_CKPT*
I0925 16:09:00.367692 140526175889280 BERT_NER.py:519]   name = bert/encoder/layer_1/output/dense/kernel:0, shape = (3072, 768), *INIT_FROM_CKPT*
I0925 16:09:00.367766 140526175889280 BERT_NER.py:519]   name = bert/encoder/layer_1/output/dense/bias:0, shape = (768,), *INIT_FROM_CKPT*
I0925 16:09:00.367839 140526175889280 BERT_NER.py:519]   name = bert/encoder/layer_1/output/LayerNorm/beta:0, shape = (768,), *INIT_FROM_CKPT*
I0925 16:09:00.367913 140526175889280 BERT_NER.py:519]   name = bert/encoder/layer_1/output/LayerNorm/gamma:0, shape = (768,), *INIT_FROM_CKPT*
I0925 16:09:00.367982 140526175889280 BERT_NER.py:519]   name = bert/encoder/layer_2/attention/self/query/kernel:0, shape = (768, 768), *INIT_FROM_CKPT*
I0925 16:09:00.368060 140526175889280 BERT_NER.py:519]   name = bert/encoder/layer_2/attention/self/query/bias:0, shape = (768,), *INIT_FROM_CKPT*
I0925 16:09:00.368131 140526175889280 BERT_NER.py:519]   name = bert/encoder/layer_2/attention/self/key/kernel:0, shape = (768, 768), *INIT_FROM_CKPT*
I0925 16:09:00.368203 140526175889280 BERT_NER.py:519]   name = bert/encoder/layer_2/attention/self/key/bias:0, shape = (768,), *INIT_FROM_CKPT*
I0925 16:09:00.368272 140526175889280 BERT_NER.py:519]   name = bert/encoder/layer_2/attention/self/value/kernel:0, shape = (768, 768), *INIT_FROM_CKPT*
I0925 16:09:00.368343 140526175889280 BERT_NER.py:519]   name = bert/encoder/layer_2/attention/self/value/bias:0, shape = (768,), *INIT_FROM_CKPT*
I0925 16:09:00.368414 140526175889280 BERT_NER.py:519]   name = bert/encoder/layer_2/attention/output/dense/kernel:0, shape = (768, 768), *INIT_FROM_CKPT*
I0925 16:09:00.368498 140526175889280 BERT_NER.py:519]   name = bert/encoder/layer_2/attention/output/dense/bias:0, shape = (768,), *INIT_FROM_CKPT*
I0925 16:09:00.368574 140526175889280 BERT_NER.py:519]   name = bert/encoder/layer_2/attention/output/LayerNorm/beta:0, shape = (768,), *INIT_FROM_CKPT*
I0925 16:09:00.368650 140526175889280 BERT_NER.py:519]   name = bert/encoder/layer_2/attention/output/LayerNorm/gamma:0, shape = (768,), *INIT_FROM_CKPT*
I0925 16:09:00.368724 140526175889280 BERT_NER.py:519]   name = bert/encoder/layer_2/intermediate/dense/kernel:0, shape = (768, 3072), *INIT_FROM_CKPT*
I0925 16:09:00.368798 140526175889280 BERT_NER.py:519]   name = bert/encoder/layer_2/intermediate/dense/bias:0, shape = (3072,), *INIT_FROM_CKPT*
I0925 16:09:00.368872 140526175889280 BERT_NER.py:519]   name = bert/encoder/layer_2/output/dense/kernel:0, shape = (3072, 768), *INIT_FROM_CKPT*
I0925 16:09:00.368946 140526175889280 BERT_NER.py:519]   name = bert/encoder/layer_2/output/dense/bias:0, shape = (768,), *INIT_FROM_CKPT*
I0925 16:09:00.369021 140526175889280 BERT_NER.py:519]   name = bert/encoder/layer_2/output/LayerNorm/beta:0, shape = (768,), *INIT_FROM_CKPT*
I0925 16:09:00.369090 140526175889280 BERT_NER.py:519]   name = bert/encoder/layer_2/output/LayerNorm/gamma:0, shape = (768,), *INIT_FROM_CKPT*
I0925 16:09:00.369159 140526175889280 BERT_NER.py:519]   name = bert/encoder/layer_3/attention/self/query/kernel:0, shape = (768, 768), *INIT_FROM_CKPT*
I0925 16:09:00.369231 140526175889280 BERT_NER.py:519]   name = bert/encoder/layer_3/attention/self/query/bias:0, shape = (768,), *INIT_FROM_CKPT*
I0925 16:09:00.369301 140526175889280 BERT_NER.py:519]   name = bert/encoder/layer_3/attention/self/key/kernel:0, shape = (768, 768), *INIT_FROM_CKPT*
I0925 16:09:00.369375 140526175889280 BERT_NER.py:519]   name = bert/encoder/layer_3/attention/self/key/bias:0, shape = (768,), *INIT_FROM_CKPT*
I0925 16:09:00.369444 140526175889280 BERT_NER.py:519]   name = bert/encoder/layer_3/attention/self/value/kernel:0, shape = (768, 768), *INIT_FROM_CKPT*
I0925 16:09:00.369528 140526175889280 BERT_NER.py:519]   name = bert/encoder/layer_3/attention/self/value/bias:0, shape = (768,), *INIT_FROM_CKPT*
I0925 16:09:00.369600 140526175889280 BERT_NER.py:519]   name = bert/encoder/layer_3/attention/output/dense/kernel:0, shape = (768, 768), *INIT_FROM_CKPT*
I0925 16:09:00.369672 140526175889280 BERT_NER.py:519]   name = bert/encoder/layer_3/attention/output/dense/bias:0, shape = (768,), *INIT_FROM_CKPT*
I0925 16:09:00.369742 140526175889280 BERT_NER.py:519]   name = bert/encoder/layer_3/attention/output/LayerNorm/beta:0, shape = (768,), *INIT_FROM_CKPT*
I0925 16:09:00.369810 140526175889280 BERT_NER.py:519]   name = bert/encoder/layer_3/attention/output/LayerNorm/gamma:0, shape = (768,), *INIT_FROM_CKPT*
I0925 16:09:00.369879 140526175889280 BERT_NER.py:519]   name = bert/encoder/layer_3/intermediate/dense/kernel:0, shape = (768, 3072), *INIT_FROM_CKPT*
I0925 16:09:00.369952 140526175889280 BERT_NER.py:519]   name = bert/encoder/layer_3/intermediate/dense/bias:0, shape = (3072,), *INIT_FROM_CKPT*
I0925 16:09:00.370044 140526175889280 BERT_NER.py:519]   name = bert/encoder/layer_3/output/dense/kernel:0, shape = (3072, 768), *INIT_FROM_CKPT*
I0925 16:09:00.370115 140526175889280 BERT_NER.py:519]   name = bert/encoder/layer_3/output/dense/bias:0, shape = (768,), *INIT_FROM_CKPT*
I0925 16:09:00.370182 140526175889280 BERT_NER.py:519]   name = bert/encoder/layer_3/output/LayerNorm/beta:0, shape = (768,), *INIT_FROM_CKPT*
I0925 16:09:00.370250 140526175889280 BERT_NER.py:519]   name = bert/encoder/layer_3/output/LayerNorm/gamma:0, shape = (768,), *INIT_FROM_CKPT*
I0925 16:09:00.370316 140526175889280 BERT_NER.py:519]   name = bert/encoder/layer_4/attention/self/query/kernel:0, shape = (768, 768), *INIT_FROM_CKPT*
I0925 16:09:00.370386 140526175889280 BERT_NER.py:519]   name = bert/encoder/layer_4/attention/self/query/bias:0, shape = (768,), *INIT_FROM_CKPT*
I0925 16:09:00.370452 140526175889280 BERT_NER.py:519]   name = bert/encoder/layer_4/attention/self/key/kernel:0, shape = (768, 768), *INIT_FROM_CKPT*
I0925 16:09:00.370534 140526175889280 BERT_NER.py:519]   name = bert/encoder/layer_4/attention/self/key/bias:0, shape = (768,), *INIT_FROM_CKPT*
I0925 16:09:00.370636 140526175889280 BERT_NER.py:519]   name = bert/encoder/layer_4/attention/self/value/kernel:0, shape = (768, 768), *INIT_FROM_CKPT*
I0925 16:09:00.370710 140526175889280 BERT_NER.py:519]   name = bert/encoder/layer_4/attention/self/value/bias:0, shape = (768,), *INIT_FROM_CKPT*
I0925 16:09:00.370784 140526175889280 BERT_NER.py:519]   name = bert/encoder/layer_4/attention/output/dense/kernel:0, shape = (768, 768), *INIT_FROM_CKPT*
I0925 16:09:00.370858 140526175889280 BERT_NER.py:519]   name = bert/encoder/layer_4/attention/output/dense/bias:0, shape = (768,), *INIT_FROM_CKPT*
I0925 16:09:00.370933 140526175889280 BERT_NER.py:519]   name = bert/encoder/layer_4/attention/output/LayerNorm/beta:0, shape = (768,), *INIT_FROM_CKPT*
I0925 16:09:00.371005 140526175889280 BERT_NER.py:519]   name = bert/encoder/layer_4/attention/output/LayerNorm/gamma:0, shape = (768,), *INIT_FROM_CKPT*
I0925 16:09:00.371081 140526175889280 BERT_NER.py:519]   name = bert/encoder/layer_4/intermediate/dense/kernel:0, shape = (768, 3072), *INIT_FROM_CKPT*
I0925 16:09:00.371155 140526175889280 BERT_NER.py:519]   name = bert/encoder/layer_4/intermediate/dense/bias:0, shape = (3072,), *INIT_FROM_CKPT*
I0925 16:09:00.371229 140526175889280 BERT_NER.py:519]   name = bert/encoder/layer_4/output/dense/kernel:0, shape = (3072, 768), *INIT_FROM_CKPT*
I0925 16:09:00.371303 140526175889280 BERT_NER.py:519]   name = bert/encoder/layer_4/output/dense/bias:0, shape = (768,), *INIT_FROM_CKPT*
I0925 16:09:00.371374 140526175889280 BERT_NER.py:519]   name = bert/encoder/layer_4/output/LayerNorm/beta:0, shape = (768,), *INIT_FROM_CKPT*
I0925 16:09:00.371443 140526175889280 BERT_NER.py:519]   name = bert/encoder/layer_4/output/LayerNorm/gamma:0, shape = (768,), *INIT_FROM_CKPT*
I0925 16:09:00.371526 140526175889280 BERT_NER.py:519]   name = bert/encoder/layer_5/attention/self/query/kernel:0, shape = (768, 768), *INIT_FROM_CKPT*
I0925 16:09:00.371606 140526175889280 BERT_NER.py:519]   name = bert/encoder/layer_5/attention/self/query/bias:0, shape = (768,), *INIT_FROM_CKPT*
I0925 16:09:00.371680 140526175889280 BERT_NER.py:519]   name = bert/encoder/layer_5/attention/self/key/kernel:0, shape = (768, 768), *INIT_FROM_CKPT*
I0925 16:09:00.371755 140526175889280 BERT_NER.py:519]   name = bert/encoder/layer_5/attention/self/key/bias:0, shape = (768,), *INIT_FROM_CKPT*
I0925 16:09:00.371825 140526175889280 BERT_NER.py:519]   name = bert/encoder/layer_5/attention/self/value/kernel:0, shape = (768, 768), *INIT_FROM_CKPT*
I0925 16:09:00.371898 140526175889280 BERT_NER.py:519]   name = bert/encoder/layer_5/attention/self/value/bias:0, shape = (768,), *INIT_FROM_CKPT*
I0925 16:09:00.371970 140526175889280 BERT_NER.py:519]   name = bert/encoder/layer_5/attention/output/dense/kernel:0, shape = (768, 768), *INIT_FROM_CKPT*
I0925 16:09:00.372047 140526175889280 BERT_NER.py:519]   name = bert/encoder/layer_5/attention/output/dense/bias:0, shape = (768,), *INIT_FROM_CKPT*
I0925 16:09:00.372123 140526175889280 BERT_NER.py:519]   name = bert/encoder/layer_5/attention/output/LayerNorm/beta:0, shape = (768,), *INIT_FROM_CKPT*
I0925 16:09:00.372192 140526175889280 BERT_NER.py:519]   name = bert/encoder/layer_5/attention/output/LayerNorm/gamma:0, shape = (768,), *INIT_FROM_CKPT*
I0925 16:09:00.372261 140526175889280 BERT_NER.py:519]   name = bert/encoder/layer_5/intermediate/dense/kernel:0, shape = (768, 3072), *INIT_FROM_CKPT*
I0925 16:09:00.372335 140526175889280 BERT_NER.py:519]   name = bert/encoder/layer_5/intermediate/dense/bias:0, shape = (3072,), *INIT_FROM_CKPT*
I0925 16:09:00.372406 140526175889280 BERT_NER.py:519]   name = bert/encoder/layer_5/output/dense/kernel:0, shape = (3072, 768), *INIT_FROM_CKPT*
I0925 16:09:00.372489 140526175889280 BERT_NER.py:519]   name = bert/encoder/layer_5/output/dense/bias:0, shape = (768,), *INIT_FROM_CKPT*
I0925 16:09:00.372563 140526175889280 BERT_NER.py:519]   name = bert/encoder/layer_5/output/LayerNorm/beta:0, shape = (768,), *INIT_FROM_CKPT*
I0925 16:09:00.372632 140526175889280 BERT_NER.py:519]   name = bert/encoder/layer_5/output/LayerNorm/gamma:0, shape = (768,), *INIT_FROM_CKPT*
I0925 16:09:00.372700 140526175889280 BERT_NER.py:519]   name = bert/encoder/layer_6/attention/self/query/kernel:0, shape = (768, 768), *INIT_FROM_CKPT*
I0925 16:09:00.372776 140526175889280 BERT_NER.py:519]   name = bert/encoder/layer_6/attention/self/query/bias:0, shape = (768,), *INIT_FROM_CKPT*
I0925 16:09:00.372844 140526175889280 BERT_NER.py:519]   name = bert/encoder/layer_6/attention/self/key/kernel:0, shape = (768, 768), *INIT_FROM_CKPT*
I0925 16:09:00.372918 140526175889280 BERT_NER.py:519]   name = bert/encoder/layer_6/attention/self/key/bias:0, shape = (768,), *INIT_FROM_CKPT*
I0925 16:09:00.372988 140526175889280 BERT_NER.py:519]   name = bert/encoder/layer_6/attention/self/value/kernel:0, shape = (768, 768), *INIT_FROM_CKPT*
I0925 16:09:00.373065 140526175889280 BERT_NER.py:519]   name = bert/encoder/layer_6/attention/self/value/bias:0, shape = (768,), *INIT_FROM_CKPT*
I0925 16:09:00.373141 140526175889280 BERT_NER.py:519]   name = bert/encoder/layer_6/attention/output/dense/kernel:0, shape = (768, 768), *INIT_FROM_CKPT*
I0925 16:09:00.373215 140526175889280 BERT_NER.py:519]   name = bert/encoder/layer_6/attention/output/dense/bias:0, shape = (768,), *INIT_FROM_CKPT*
I0925 16:09:00.373284 140526175889280 BERT_NER.py:519]   name = bert/encoder/layer_6/attention/output/LayerNorm/beta:0, shape = (768,), *INIT_FROM_CKPT*
I0925 16:09:00.373353 140526175889280 BERT_NER.py:519]   name = bert/encoder/layer_6/attention/output/LayerNorm/gamma:0, shape = (768,), *INIT_FROM_CKPT*
I0925 16:09:00.373422 140526175889280 BERT_NER.py:519]   name = bert/encoder/layer_6/intermediate/dense/kernel:0, shape = (768, 3072), *INIT_FROM_CKPT*
I0925 16:09:00.373504 140526175889280 BERT_NER.py:519]   name = bert/encoder/layer_6/intermediate/dense/bias:0, shape = (3072,), *INIT_FROM_CKPT*
I0925 16:09:00.373580 140526175889280 BERT_NER.py:519]   name = bert/encoder/layer_6/output/dense/kernel:0, shape = (3072, 768), *INIT_FROM_CKPT*
I0925 16:09:00.373656 140526175889280 BERT_NER.py:519]   name = bert/encoder/layer_6/output/dense/bias:0, shape = (768,), *INIT_FROM_CKPT*
I0925 16:09:00.373730 140526175889280 BERT_NER.py:519]   name = bert/encoder/layer_6/output/LayerNorm/beta:0, shape = (768,), *INIT_FROM_CKPT*
I0925 16:09:00.373804 140526175889280 BERT_NER.py:519]   name = bert/encoder/layer_6/output/LayerNorm/gamma:0, shape = (768,), *INIT_FROM_CKPT*
I0925 16:09:00.373872 140526175889280 BERT_NER.py:519]   name = bert/encoder/layer_7/attention/self/query/kernel:0, shape = (768, 768), *INIT_FROM_CKPT*
I0925 16:09:00.373944 140526175889280 BERT_NER.py:519]   name = bert/encoder/layer_7/attention/self/query/bias:0, shape = (768,), *INIT_FROM_CKPT*
I0925 16:09:00.374035 140526175889280 BERT_NER.py:519]   name = bert/encoder/layer_7/attention/self/key/kernel:0, shape = (768, 768), *INIT_FROM_CKPT*
I0925 16:09:00.391075 140526175889280 BERT_NER.py:519]   name = bert/encoder/layer_7/attention/self/key/bias:0, shape = (768,), *INIT_FROM_CKPT*
I0925 16:09:00.391277 140526175889280 BERT_NER.py:519]   name = bert/encoder/layer_7/attention/self/value/kernel:0, shape = (768, 768), *INIT_FROM_CKPT*
I0925 16:09:00.391394 140526175889280 BERT_NER.py:519]   name = bert/encoder/layer_7/attention/self/value/bias:0, shape = (768,), *INIT_FROM_CKPT*
I0925 16:09:00.391558 140526175889280 BERT_NER.py:519]   name = bert/encoder/layer_7/attention/output/dense/kernel:0, shape = (768, 768), *INIT_FROM_CKPT*
I0925 16:09:00.391663 140526175889280 BERT_NER.py:519]   name = bert/encoder/layer_7/attention/output/dense/bias:0, shape = (768,), *INIT_FROM_CKPT*
I0925 16:09:00.391752 140526175889280 BERT_NER.py:519]   name = bert/encoder/layer_7/attention/output/LayerNorm/beta:0, shape = (768,), *INIT_FROM_CKPT*
I0925 16:09:00.391836 140526175889280 BERT_NER.py:519]   name = bert/encoder/layer_7/attention/output/LayerNorm/gamma:0, shape = (768,), *INIT_FROM_CKPT*
I0925 16:09:00.391922 140526175889280 BERT_NER.py:519]   name = bert/encoder/layer_7/intermediate/dense/kernel:0, shape = (768, 3072), *INIT_FROM_CKPT*
I0925 16:09:00.392026 140526175889280 BERT_NER.py:519]   name = bert/encoder/layer_7/intermediate/dense/bias:0, shape = (3072,), *INIT_FROM_CKPT*
I0925 16:09:00.392117 140526175889280 BERT_NER.py:519]   name = bert/encoder/layer_7/output/dense/kernel:0, shape = (3072, 768), *INIT_FROM_CKPT*
I0925 16:09:00.392211 140526175889280 BERT_NER.py:519]   name = bert/encoder/layer_7/output/dense/bias:0, shape = (768,), *INIT_FROM_CKPT*
I0925 16:09:00.392302 140526175889280 BERT_NER.py:519]   name = bert/encoder/layer_7/output/LayerNorm/beta:0, shape = (768,), *INIT_FROM_CKPT*
I0925 16:09:00.392399 140526175889280 BERT_NER.py:519]   name = bert/encoder/layer_7/output/LayerNorm/gamma:0, shape = (768,), *INIT_FROM_CKPT*
I0925 16:09:00.392516 140526175889280 BERT_NER.py:519]   name = bert/encoder/layer_8/attention/self/query/kernel:0, shape = (768, 768), *INIT_FROM_CKPT*
I0925 16:09:00.392611 140526175889280 BERT_NER.py:519]   name = bert/encoder/layer_8/attention/self/query/bias:0, shape = (768,), *INIT_FROM_CKPT*
I0925 16:09:00.392695 140526175889280 BERT_NER.py:519]   name = bert/encoder/layer_8/attention/self/key/kernel:0, shape = (768, 768), *INIT_FROM_CKPT*
I0925 16:09:00.392786 140526175889280 BERT_NER.py:519]   name = bert/encoder/layer_8/attention/self/key/bias:0, shape = (768,), *INIT_FROM_CKPT*
I0925 16:09:00.392873 140526175889280 BERT_NER.py:519]   name = bert/encoder/layer_8/attention/self/value/kernel:0, shape = (768, 768), *INIT_FROM_CKPT*
I0925 16:09:00.392988 140526175889280 BERT_NER.py:519]   name = bert/encoder/layer_8/attention/self/value/bias:0, shape = (768,), *INIT_FROM_CKPT*
I0925 16:09:00.393089 140526175889280 BERT_NER.py:519]   name = bert/encoder/layer_8/attention/output/dense/kernel:0, shape = (768, 768), *INIT_FROM_CKPT*
I0925 16:09:00.393186 140526175889280 BERT_NER.py:519]   name = bert/encoder/layer_8/attention/output/dense/bias:0, shape = (768,), *INIT_FROM_CKPT*
I0925 16:09:00.393276 140526175889280 BERT_NER.py:519]   name = bert/encoder/layer_8/attention/output/LayerNorm/beta:0, shape = (768,), *INIT_FROM_CKPT*
I0925 16:09:00.393377 140526175889280 BERT_NER.py:519]   name = bert/encoder/layer_8/attention/output/LayerNorm/gamma:0, shape = (768,), *INIT_FROM_CKPT*
I0925 16:09:00.393479 140526175889280 BERT_NER.py:519]   name = bert/encoder/layer_8/intermediate/dense/kernel:0, shape = (768, 3072), *INIT_FROM_CKPT*
I0925 16:09:00.393579 140526175889280 BERT_NER.py:519]   name = bert/encoder/layer_8/intermediate/dense/bias:0, shape = (3072,), *INIT_FROM_CKPT*
I0925 16:09:00.393669 140526175889280 BERT_NER.py:519]   name = bert/encoder/layer_8/output/dense/kernel:0, shape = (3072, 768), *INIT_FROM_CKPT*
I0925 16:09:00.393779 140526175889280 BERT_NER.py:519]   name = bert/encoder/layer_8/output/dense/bias:0, shape = (768,), *INIT_FROM_CKPT*
I0925 16:09:00.393870 140526175889280 BERT_NER.py:519]   name = bert/encoder/layer_8/output/LayerNorm/beta:0, shape = (768,), *INIT_FROM_CKPT*
I0925 16:09:00.393961 140526175889280 BERT_NER.py:519]   name = bert/encoder/layer_8/output/LayerNorm/gamma:0, shape = (768,), *INIT_FROM_CKPT*
I0925 16:09:00.394059 140526175889280 BERT_NER.py:519]   name = bert/encoder/layer_9/attention/self/query/kernel:0, shape = (768, 768), *INIT_FROM_CKPT*
I0925 16:09:00.394162 140526175889280 BERT_NER.py:519]   name = bert/encoder/layer_9/attention/self/query/bias:0, shape = (768,), *INIT_FROM_CKPT*
I0925 16:09:00.394260 140526175889280 BERT_NER.py:519]   name = bert/encoder/layer_9/attention/self/key/kernel:0, shape = (768, 768), *INIT_FROM_CKPT*
I0925 16:09:00.394358 140526175889280 BERT_NER.py:519]   name = bert/encoder/layer_9/attention/self/key/bias:0, shape = (768,), *INIT_FROM_CKPT*
I0925 16:09:00.394447 140526175889280 BERT_NER.py:519]   name = bert/encoder/layer_9/attention/self/value/kernel:0, shape = (768, 768), *INIT_FROM_CKPT*
I0925 16:09:00.394562 140526175889280 BERT_NER.py:519]   name = bert/encoder/layer_9/attention/self/value/bias:0, shape = (768,), *INIT_FROM_CKPT*
I0925 16:09:00.394654 140526175889280 BERT_NER.py:519]   name = bert/encoder/layer_9/attention/output/dense/kernel:0, shape = (768, 768), *INIT_FROM_CKPT*
I0925 16:09:00.394748 140526175889280 BERT_NER.py:519]   name = bert/encoder/layer_9/attention/output/dense/bias:0, shape = (768,), *INIT_FROM_CKPT*
I0925 16:09:00.394838 140526175889280 BERT_NER.py:519]   name = bert/encoder/layer_9/attention/output/LayerNorm/beta:0, shape = (768,), *INIT_FROM_CKPT*
I0925 16:09:00.394928 140526175889280 BERT_NER.py:519]   name = bert/encoder/layer_9/attention/output/LayerNorm/gamma:0, shape = (768,), *INIT_FROM_CKPT*
I0925 16:09:00.395027 140526175889280 BERT_NER.py:519]   name = bert/encoder/layer_9/intermediate/dense/kernel:0, shape = (768, 3072), *INIT_FROM_CKPT*
I0925 16:09:00.395125 140526175889280 BERT_NER.py:519]   name = bert/encoder/layer_9/intermediate/dense/bias:0, shape = (3072,), *INIT_FROM_CKPT*
I0925 16:09:00.395215 140526175889280 BERT_NER.py:519]   name = bert/encoder/layer_9/output/dense/kernel:0, shape = (3072, 768), *INIT_FROM_CKPT*
I0925 16:09:00.395309 140526175889280 BERT_NER.py:519]   name = bert/encoder/layer_9/output/dense/bias:0, shape = (768,), *INIT_FROM_CKPT*
I0925 16:09:00.395398 140526175889280 BERT_NER.py:519]   name = bert/encoder/layer_9/output/LayerNorm/beta:0, shape = (768,), *INIT_FROM_CKPT*
I0925 16:09:00.395501 140526175889280 BERT_NER.py:519]   name = bert/encoder/layer_9/output/LayerNorm/gamma:0, shape = (768,), *INIT_FROM_CKPT*
I0925 16:09:00.395595 140526175889280 BERT_NER.py:519]   name = bert/encoder/layer_10/attention/self/query/kernel:0, shape = (768, 768), *INIT_FROM_CKPT*
I0925 16:09:00.395691 140526175889280 BERT_NER.py:519]   name = bert/encoder/layer_10/attention/self/query/bias:0, shape = (768,), *INIT_FROM_CKPT*
I0925 16:09:00.395781 140526175889280 BERT_NER.py:519]   name = bert/encoder/layer_10/attention/self/key/kernel:0, shape = (768, 768), *INIT_FROM_CKPT*
I0925 16:09:00.395876 140526175889280 BERT_NER.py:519]   name = bert/encoder/layer_10/attention/self/key/bias:0, shape = (768,), *INIT_FROM_CKPT*
I0925 16:09:00.395966 140526175889280 BERT_NER.py:519]   name = bert/encoder/layer_10/attention/self/value/kernel:0, shape = (768, 768), *INIT_FROM_CKPT*
I0925 16:09:00.396072 140526175889280 BERT_NER.py:519]   name = bert/encoder/layer_10/attention/self/value/bias:0, shape = (768,), *INIT_FROM_CKPT*
I0925 16:09:00.396164 140526175889280 BERT_NER.py:519]   name = bert/encoder/layer_10/attention/output/dense/kernel:0, shape = (768, 768), *INIT_FROM_CKPT*
I0925 16:09:00.396258 140526175889280 BERT_NER.py:519]   name = bert/encoder/layer_10/attention/output/dense/bias:0, shape = (768,), *INIT_FROM_CKPT*
I0925 16:09:00.396349 140526175889280 BERT_NER.py:519]   name = bert/encoder/layer_10/attention/output/LayerNorm/beta:0, shape = (768,), *INIT_FROM_CKPT*
I0925 16:09:00.396440 140526175889280 BERT_NER.py:519]   name = bert/encoder/layer_10/attention/output/LayerNorm/gamma:0, shape = (768,), *INIT_FROM_CKPT*
I0925 16:09:00.396547 140526175889280 BERT_NER.py:519]   name = bert/encoder/layer_10/intermediate/dense/kernel:0, shape = (768, 3072), *INIT_FROM_CKPT*
I0925 16:09:00.396646 140526175889280 BERT_NER.py:519]   name = bert/encoder/layer_10/intermediate/dense/bias:0, shape = (3072,), *INIT_FROM_CKPT*
I0925 16:09:00.396739 140526175889280 BERT_NER.py:519]   name = bert/encoder/layer_10/output/dense/kernel:0, shape = (3072, 768), *INIT_FROM_CKPT*
I0925 16:09:00.396833 140526175889280 BERT_NER.py:519]   name = bert/encoder/layer_10/output/dense/bias:0, shape = (768,), *INIT_FROM_CKPT*
I0925 16:09:00.396923 140526175889280 BERT_NER.py:519]   name = bert/encoder/layer_10/output/LayerNorm/beta:0, shape = (768,), *INIT_FROM_CKPT*
I0925 16:09:00.397031 140526175889280 BERT_NER.py:519]   name = bert/encoder/layer_10/output/LayerNorm/gamma:0, shape = (768,), *INIT_FROM_CKPT*
I0925 16:09:00.397120 140526175889280 BERT_NER.py:519]   name = bert/encoder/layer_11/attention/self/query/kernel:0, shape = (768, 768), *INIT_FROM_CKPT*
I0925 16:09:00.397211 140526175889280 BERT_NER.py:519]   name = bert/encoder/layer_11/attention/self/query/bias:0, shape = (768,), *INIT_FROM_CKPT*
I0925 16:09:00.397298 140526175889280 BERT_NER.py:519]   name = bert/encoder/layer_11/attention/self/key/kernel:0, shape = (768, 768), *INIT_FROM_CKPT*
I0925 16:09:00.397391 140526175889280 BERT_NER.py:519]   name = bert/encoder/layer_11/attention/self/key/bias:0, shape = (768,), *INIT_FROM_CKPT*
I0925 16:09:00.397509 140526175889280 BERT_NER.py:519]   name = bert/encoder/layer_11/attention/self/value/kernel:0, shape = (768, 768), *INIT_FROM_CKPT*
I0925 16:09:00.397610 140526175889280 BERT_NER.py:519]   name = bert/encoder/layer_11/attention/self/value/bias:0, shape = (768,), *INIT_FROM_CKPT*
I0925 16:09:00.397701 140526175889280 BERT_NER.py:519]   name = bert/encoder/layer_11/attention/output/dense/kernel:0, shape = (768, 768), *INIT_FROM_CKPT*
I0925 16:09:00.397793 140526175889280 BERT_NER.py:519]   name = bert/encoder/layer_11/attention/output/dense/bias:0, shape = (768,), *INIT_FROM_CKPT*
I0925 16:09:00.397891 140526175889280 BERT_NER.py:519]   name = bert/encoder/layer_11/attention/output/LayerNorm/beta:0, shape = (768,), *INIT_FROM_CKPT*
I0925 16:09:00.397973 140526175889280 BERT_NER.py:519]   name = bert/encoder/layer_11/attention/output/LayerNorm/gamma:0, shape = (768,), *INIT_FROM_CKPT*
I0925 16:09:00.398064 140526175889280 BERT_NER.py:519]   name = bert/encoder/layer_11/intermediate/dense/kernel:0, shape = (768, 3072), *INIT_FROM_CKPT*
I0925 16:09:00.398151 140526175889280 BERT_NER.py:519]   name = bert/encoder/layer_11/intermediate/dense/bias:0, shape = (3072,), *INIT_FROM_CKPT*
I0925 16:09:00.398233 140526175889280 BERT_NER.py:519]   name = bert/encoder/layer_11/output/dense/kernel:0, shape = (3072, 768), *INIT_FROM_CKPT*
I0925 16:09:00.398319 140526175889280 BERT_NER.py:519]   name = bert/encoder/layer_11/output/dense/bias:0, shape = (768,), *INIT_FROM_CKPT*
I0925 16:09:00.398401 140526175889280 BERT_NER.py:519]   name = bert/encoder/layer_11/output/LayerNorm/beta:0, shape = (768,), *INIT_FROM_CKPT*
I0925 16:09:00.398521 140526175889280 BERT_NER.py:519]   name = bert/encoder/layer_11/output/LayerNorm/gamma:0, shape = (768,), *INIT_FROM_CKPT*
I0925 16:09:00.398616 140526175889280 BERT_NER.py:519]   name = bert/pooler/dense/kernel:0, shape = (768, 768), *INIT_FROM_CKPT*
I0925 16:09:00.398703 140526175889280 BERT_NER.py:519]   name = bert/pooler/dense/bias:0, shape = (768,), *INIT_FROM_CKPT*
I0925 16:09:00.398785 140526175889280 BERT_NER.py:519]   name = dense_37/kernel:0, shape = (768, 25)
I0925 16:09:00.398870 140526175889280 BERT_NER.py:519]   name = dense_37/bias:0, shape = (25,)
I0925 16:09:00.398950 140526175889280 BERT_NER.py:519]   name = crf_loss/transition:0, shape = (25, 25)
WARNING:tensorflow:From BERT_NER.py:524: The name tf.train.get_global_step is deprecated. Please use tf.compat.v1.train.get_global_step instead.

W0925 16:09:00.399126 140526175889280 deprecation_wrapper.py:119] From BERT_NER.py:524: The name tf.train.get_global_step is deprecated. Please use tf.compat.v1.train.get_global_step instead.

WARNING:tensorflow:From BERT_NER.py:525: The name tf.train.LoggingTensorHook is deprecated. Please use tf.estimator.LoggingTensorHook instead.

W0925 16:09:00.399387 140526175889280 deprecation_wrapper.py:119] From BERT_NER.py:525: The name tf.train.LoggingTensorHook is deprecated. Please use tf.estimator.LoggingTensorHook instead.

WARNING:tensorflow:From /content/BERT-NER/BERT-NER/bert/optimization.py:27: The name tf.train.get_or_create_global_step is deprecated. Please use tf.compat.v1.train.get_or_create_global_step instead.

W0925 16:09:00.399641 140526175889280 deprecation_wrapper.py:119] From /content/BERT-NER/BERT-NER/bert/optimization.py:27: The name tf.train.get_or_create_global_step is deprecated. Please use tf.compat.v1.train.get_or_create_global_step instead.

WARNING:tensorflow:From /usr/local/lib/python3.6/dist-packages/tensorflow/python/keras/optimizer_v2/learning_rate_schedule.py:409: div (from tensorflow.python.ops.math_ops) is deprecated and will be removed in a future version.
Instructions for updating:
Deprecated in favor of operator or tf.math.divide.
W0925 16:09:00.407652 140526175889280 deprecation.py:323] From /usr/local/lib/python3.6/dist-packages/tensorflow/python/keras/optimizer_v2/learning_rate_schedule.py:409: div (from tensorflow.python.ops.math_ops) is deprecated and will be removed in a future version.
Instructions for updating:
Deprecated in favor of operator or tf.math.divide.
INFO:tensorflow:Create CheckpointSaverHook.
I0925 16:09:13.430385 140526175889280 basic_session_run_hooks.py:541] Create CheckpointSaverHook.
INFO:tensorflow:Done calling model_fn.
I0925 16:09:13.750263 140526175889280 estimator.py:1147] Done calling model_fn.
INFO:tensorflow:TPU job name worker
I0925 16:09:17.222844 140526175889280 tpu_estimator.py:499] TPU job name worker
INFO:tensorflow:Graph was finalized.
I0925 16:09:18.917126 140526175889280 monitored_session.py:240] Graph was finalized.
INFO:tensorflow:Running local_init_op.
I0925 16:09:31.122221 140526175889280 session_manager.py:500] Running local_init_op.
INFO:tensorflow:Done running local_init_op.
I0925 16:09:31.839573 140526175889280 session_manager.py:502] Done running local_init_op.
INFO:tensorflow:Saving checkpoints for 0 into gs://bert-data-1a/bert-ner-output/model.ckpt.
I0925 16:09:41.540014 140526175889280 basic_session_run_hooks.py:606] Saving checkpoints for 0 into gs://bert-data-1a/bert-ner-output/model.ckpt.
WARNING:tensorflow:From /usr/local/lib/python3.6/dist-packages/tensorflow_estimator/python/estimator/tpu/tpu_estimator.py:741: Variable.load (from tensorflow.python.ops.variables) is deprecated and will be removed in a future version.
Instructions for updating:
Prefer Variable.assign which has equivalent behavior in 2.X.
W0925 16:10:18.328432 140526175889280 deprecation.py:323] From /usr/local/lib/python3.6/dist-packages/tensorflow_estimator/python/estimator/tpu/tpu_estimator.py:741: Variable.load (from tensorflow.python.ops.variables) is deprecated and will be removed in a future version.
Instructions for updating:
Prefer Variable.assign which has equivalent behavior in 2.X.
INFO:tensorflow:Initialized dataset iterators in 0 seconds
I0925 16:10:19.941890 140526175889280 util.py:98] Initialized dataset iterators in 0 seconds
INFO:tensorflow:Installing graceful shutdown hook.
I0925 16:10:19.942312 140526175889280 session_support.py:332] Installing graceful shutdown hook.
2019-09-25 16:10:19.942710: W tensorflow/core/distributed_runtime/rpc/grpc_session.cc:356] GrpcSession::ListDevices will initialize the session with an empty graph and other defaults because the session has not yet been created.
INFO:tensorflow:Creating heartbeat manager for ['/job:worker/replica:0/task:0/device:CPU:0']
I0925 16:10:19.947024 140526175889280 session_support.py:82] Creating heartbeat manager for ['/job:worker/replica:0/task:0/device:CPU:0']
INFO:tensorflow:Configuring worker heartbeat: shutdown_mode: WAIT_FOR_COORDINATOR

I0925 16:10:19.948786 140526175889280 session_support.py:105] Configuring worker heartbeat: shutdown_mode: WAIT_FOR_COORDINATOR

INFO:tensorflow:Init TPU system
I0925 16:10:19.952322 140526175889280 tpu_estimator.py:557] Init TPU system
INFO:tensorflow:Initialized TPU in 7 seconds
I0925 16:10:27.413597 140526175889280 tpu_estimator.py:566] Initialized TPU in 7 seconds
INFO:tensorflow:Starting infeed thread controller.
I0925 16:10:27.414346 140524983363328 tpu_estimator.py:514] Starting infeed thread controller.
INFO:tensorflow:Starting outfeed thread controller.
I0925 16:10:27.414748 140524953216768 tpu_estimator.py:533] Starting outfeed thread controller.
INFO:tensorflow:Enqueue next (1000) batch(es) of data to infeed.
I0925 16:10:28.176502 140526175889280 tpu_estimator.py:590] Enqueue next (1000) batch(es) of data to infeed.
INFO:tensorflow:Dequeue next (1000) batch(es) of data from outfeed.
I0925 16:10:28.177400 140526175889280 tpu_estimator.py:594] Dequeue next (1000) batch(es) of data from outfeed.
ERROR:tensorflow:Error recorded from outfeed: Session is closed.
E0925 16:10:29.093775 140524953216768 error_handling.py:70] Error recorded from outfeed: Session is closed.
ERROR:tensorflow:Error recorded from training_loop: Operation 'Mean' has been marked as not fetchable.
E0925 16:10:29.094005 140526175889280 error_handling.py:70] Error recorded from training_loop: Operation 'Mean' has been marked as not fetchable.
ERROR:tensorflow:Error recorded from infeed: Step was cancelled by an explicit call to `Session::Close()`.
E0925 16:10:29.094139 140524983363328 error_handling.py:70] Error recorded from infeed: Step was cancelled by an explicit call to `Session::Close()`.
INFO:tensorflow:training_loop marked as finished
I0925 16:10:29.094764 140526175889280 error_handling.py:96] training_loop marked as finished
WARNING:tensorflow:Reraising captured error
W0925 16:10:29.095284 140526175889280 error_handling.py:130] Reraising captured error
Traceback (most recent call last):
  File "/usr/local/lib/python3.6/dist-packages/tensorflow/python/client/session.py", line 1356, in _do_call
    return fn(*args)
  File "/usr/local/lib/python3.6/dist-packages/tensorflow/python/client/session.py", line 1341, in _run_fn
    options, feed_dict, fetch_list, target_list, run_metadata)
  File "/usr/local/lib/python3.6/dist-packages/tensorflow/python/client/session.py", line 1429, in _call_tf_sessionrun
    run_metadata)
tensorflow.python.framework.errors_impl.FailedPreconditionError: Session is closed.

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "BERT_NER.py", line 735, in <module>
    tf.app.run()
  File "/usr/local/lib/python3.6/dist-packages/tensorflow/python/platform/app.py", line 40, in run
    _run(main=main, argv=argv, flags_parser=_parse_flags_tolerate_undef)
  File "/usr/local/lib/python3.6/dist-packages/absl/app.py", line 299, in run
    _run_main(main, args)
  File "/usr/local/lib/python3.6/dist-packages/absl/app.py", line 250, in _run_main
    sys.exit(main(argv))
  File "BERT_NER.py", line 663, in main
    estimator.train(input_fn=train_input_fn, max_steps=num_train_steps)
  File "/usr/local/lib/python3.6/dist-packages/tensorflow_estimator/python/estimator/tpu/tpu_estimator.py", line 2876, in train
    rendezvous.raise_errors()
  File "/usr/local/lib/python3.6/dist-packages/tensorflow_estimator/python/estimator/tpu/error_handling.py", line 131, in raise_errors
    six.reraise(typ, value, traceback)
  File "/usr/local/lib/python3.6/dist-packages/six.py", line 693, in reraise
    raise value
  File "/usr/local/lib/python3.6/dist-packages/tensorflow_estimator/python/estimator/tpu/error_handling.py", line 104, in catch_errors
    yield
  File "/usr/local/lib/python3.6/dist-packages/tensorflow_estimator/python/estimator/tpu/tpu_estimator.py", line 539, in _run_outfeed
    session.run(self._dequeue_ops)
  File "/usr/local/lib/python3.6/dist-packages/tensorflow/python/client/session.py", line 950, in run
    run_metadata_ptr)
  File "/usr/local/lib/python3.6/dist-packages/tensorflow/python/client/session.py", line 1173, in _run
    feed_dict_tensor, options, run_metadata)
  File "/usr/local/lib/python3.6/dist-packages/tensorflow/python/client/session.py", line 1350, in _do_run
    run_metadata)
  File "/usr/local/lib/python3.6/dist-packages/tensorflow/python/client/session.py", line 1370, in _do_call
    raise type(e)(node_def, op, message)
tensorflow.python.framework.errors_impl.FailedPreconditionError: Session is closed.
```

Do you have any idea why this error is occurring?

Thanks in advance.