BrikerMan / Kashgari

Kashgari is a production-level NLP transfer learning framework built on top of tf.keras for text labeling and text classification; it includes Word2Vec, BERT, and GPT-2 language embeddings.
http://kashgari.readthedocs.io/
Apache License 2.0
2.4k stars, 441 forks

[BUG] build_tpu_model is not supported for bi-lstm-crf #175

Open PoetCoderJun opened 5 years ago

PoetCoderJun commented 5 years ago

build_tpu_model does not work with BiLSTM_CRF_Model.

My code

import os
import re

import numpy as np
import pandas as pd
import tensorflow as tf
from sklearn.model_selection import train_test_split
from tensorflow.python import keras

import kashgari
from kashgari.callbacks import EvalCallBack
from kashgari.embeddings import BERTEmbedding
from kashgari.tasks.labeling import BiLSTM_CRF_Model

import precess_data  # local preprocessing module (not included in this report)

if __name__ == '__main__':
    # Load the spreadsheet; column 0 of `result` holds the inputs, column 1 the labels.
    data = pd.read_excel("./data/2019data.xlsx")
    result = precess_data.get_ners_postion(data)
    result = np.array(result)

    # 70/30 train/test split with a fixed seed.
    train_x, test_x, train_y, test_y = train_test_split(
        result[:, 0], result[:, 1], test_size=0.3, random_state=0)
    train_x = list(train_x)
    train_y = list(train_y)
    test_x = list(test_x)
    test_y = list(test_y)

    # BERT embedding feeding a BiLSTM + CRF sequence-labeling model.
    embedding = BERTEmbedding('chinese_L-12_H-768_A-12',
                              task=kashgari.LABELING,
                              sequence_length=110)
    model = BiLSTM_CRF_Model(embedding)

    # Connect to the Colab TPU (TensorFlow 1.x contrib API).
    tpu_address = 'grpc://' + os.environ['COLAB_TPU_ADDR']
    strategy = tf.contrib.tpu.TPUDistributionStrategy(
        tf.contrib.cluster_resolver.TPUClusterResolver(tpu=tpu_address)
    )

    # This call raises the FailedPreconditionError shown below.
    model.build_tpu_model(strategy, train_x, train_y, test_x, test_y)
    model.compile_model()
    model.fit(train_x, train_y, test_x, test_y, batch_size=20, epochs=4)
    model.evaluate(test_x, test_y)

The output is:

WARNING: Logging before flag parsing goes to stderr.
W0724 02:31:49.620104 139957599934336 lazy_loader.py:50] The TensorFlow contrib module will not be included in TensorFlow 2.0. For more information, please see:

FailedPreconditionError: From /job:worker/replica:0/task:0:
Error while reading resource variable layer_crf/transitions from Container: worker. This could mean that the variable was uninitialized. Not found: Resource worker/layer_crf/transitions/N10tensorflow3VarE does not exist.
    [[{{node layer_crf/transitions/Read/ReadVariableOp}}]]

During handling of the above exception, another exception occurred:

FailedPreconditionError                   Traceback (most recent call last)
/usr/local/lib/python3.6/dist-packages/tensorflow/python/client/session.py in _do_call(self, fn, *args)
   1368           pass
   1369       message = error_interpolation.interpolate(message, self._graph)
-> 1370       raise type(e)(node_def, op, message)
   1371
   1372   def _extend_graph(self):

FailedPreconditionError: From /job:worker/replica:0/task:0:
Error while reading resource variable layer_crf/transitions from Container: worker. This could mean that the variable was uninitialized. Not found: Resource worker/layer_crf/transitions/N10tensorflow3VarE does not exist.
    [[node layer_crf/transitions/Read/ReadVariableOp (defined at /usr/local/lib/python3.6/dist-packages/kashgari/layers/crf.py:80) ]]

Original stack trace for 'layer_crf/transitions/Read/ReadVariableOp':
  File "/usr/lib/python3.6/runpy.py", line 193, in _run_module_as_main
    "__main__", mod_spec)
  File "/usr/lib/python3.6/runpy.py", line 85, in _run_code
    exec(code, run_globals)
  File "/usr/local/lib/python3.6/dist-packages/ipykernel_launcher.py", line 16, in <module>
    app.launch_new_instance()
  File "/usr/local/lib/python3.6/dist-packages/traitlets/config/application.py", line 658, in launch_instance
    app.start()
  File "/usr/local/lib/python3.6/dist-packages/ipykernel/kernelapp.py", line 477, in start
    ioloop.IOLoop.instance().start()
  File "/usr/local/lib/python3.6/dist-packages/tornado/ioloop.py", line 888, in start
    handler_func(fd_obj, events)
  File "/usr/local/lib/python3.6/dist-packages/tornado/stack_context.py", line 277, in null_wrapper
    return fn(*args, **kwargs)
  File "/usr/local/lib/python3.6/dist-packages/zmq/eventloop/zmqstream.py", line 450, in _handle_events
    self._handle_recv()
  File "/usr/local/lib/python3.6/dist-packages/zmq/eventloop/zmqstream.py", line 480, in _handle_recv
    self._run_callback(callback, msg)
  File "/usr/local/lib/python3.6/dist-packages/zmq/eventloop/zmqstream.py", line 432, in _run_callback
    callback(*args, **kwargs)
  File "/usr/local/lib/python3.6/dist-packages/tornado/stack_context.py", line 277, in null_wrapper
    return fn(*args, **kwargs)
  File "/usr/local/lib/python3.6/dist-packages/ipykernel/kernelbase.py", line 283, in dispatcher
    return self.dispatch_shell(stream, msg)
  File "/usr/local/lib/python3.6/dist-packages/ipykernel/kernelbase.py", line 235, in dispatch_shell
    handler(stream, idents, msg)
  File "/usr/local/lib/python3.6/dist-packages/ipykernel/kernelbase.py", line 399, in execute_request
    user_expressions, allow_stdin)
  File "/usr/local/lib/python3.6/dist-packages/ipykernel/ipkernel.py", line 196, in do_execute
    res = shell.run_cell(code, store_history=store_history, silent=silent)
  File "/usr/local/lib/python3.6/dist-packages/ipykernel/zmqshell.py", line 533, in run_cell
    return super(ZMQInteractiveShell, self).run_cell(*args, **kwargs)
  File "/usr/local/lib/python3.6/dist-packages/IPython/core/interactiveshell.py", line 2718, in run_cell
    interactivity=interactivity, compiler=compiler, result=result)
  File "/usr/local/lib/python3.6/dist-packages/IPython/core/interactiveshell.py", line 2822, in run_ast_nodes
    if self.run_code(code, result):
  File "/usr/local/lib/python3.6/dist-packages/IPython/core/interactiveshell.py", line 2882, in run_code
    exec(code_obj, self.user_global_ns, self.user_ns)
  File "", line 2, in <module>
    model.build_tpu_model(strategy, train_x, train_y, test_x, test_y)
  File "/usr/local/lib/python3.6/dist-packages/kashgari/tasks/base_model.py", line 195, in build_tpu_model
    self.build_model_arc()
  File "/usr/local/lib/python3.6/dist-packages/kashgari/tasks/labeling/models.py", line 114, in build_model_arc
    output_tensor = layer_crf(tensor)
  File "/usr/local/lib/python3.6/dist-packages/tensorflow/python/keras/engine/base_layer.py", line 591, in __call__
    self._maybe_build(inputs)
  File "/usr/local/lib/python3.6/dist-packages/tensorflow/python/keras/engine/base_layer.py", line 1881, in _maybe_build
    self.build(input_shapes)
  File "/usr/local/lib/python3.6/dist-packages/kashgari/layers/crf.py", line 80, in build
    trainable=True)
  File "/usr/local/lib/python3.6/dist-packages/tensorflow/python/keras/engine/base_layer.py", line 384, in add_weight
    aggregation=aggregation)
  File "/usr/local/lib/python3.6/dist-packages/tensorflow/python/training/tracking/base.py", line 663, in _add_variable_with_custom_getter
    **kwargs_for_getter)
  File "/usr/local/lib/python3.6/dist-packages/tensorflow/python/keras/engine/base_layer_utils.py", line 155, in make_variable
    shape=variable_shape if variable_shape.rank else None)
  File "/usr/local/lib/python3.6/dist-packages/tensorflow/python/ops/variables.py", line 259, in __call__
    return cls._variable_v1_call(*args, **kwargs)
  File "/usr/local/lib/python3.6/dist-packages/tensorflow/python/ops/variables.py", line 220, in _variable_v1_call
    shape=shape)
  File "/usr/local/lib/python3.6/dist-packages/tensorflow/python/ops/variables.py", line 198, in <lambda>
    previous_getter = lambda **kwargs: default_variable_creator(None, **kwargs)
  File "/usr/local/lib/python3.6/dist-packages/tensorflow/python/ops/variable_scope.py", line 2495, in default_variable_creator
    shape=shape)
  File "/usr/local/lib/python3.6/dist-packages/tensorflow/python/ops/variables.py", line 263, in __call__
    return super(VariableMetaclass, cls).__call__(*args, **kwargs)
  File "/usr/local/lib/python3.6/dist-packages/tensorflow/python/ops/resource_variable_ops.py", line 460, in __init__
    shape=shape)
  File "/usr/local/lib/python3.6/dist-packages/tensorflow/python/ops/resource_variable_ops.py", line 649, in _init_from_args
    value = self._read_variable_op()
  File "/usr/local/lib/python3.6/dist-packages/tensorflow/python/ops/resource_variable_ops.py", line 935, in _read_variable_op
    self._dtype)
  File "/usr/local/lib/python3.6/dist-packages/tensorflow/python/ops/gen_resource_variable_ops.py", line 587, in read_variable_op
    "ReadVariableOp", resource=resource, dtype=dtype, name=name)
  File "/usr/local/lib/python3.6/dist-packages/tensorflow/python/framework/op_def_library.py", line 788, in _apply_op_helper
    op_def=op_def)
  File "/usr/local/lib/python3.6/dist-packages/tensorflow/python/util/deprecation.py", line 507, in new_func
    return func(*args, **kwargs)
  File "/usr/local/lib/python3.6/dist-packages/tensorflow/python/framework/ops.py", line 3616, in create_op
    op_def=op_def)
  File "/usr/local/lib/python3.6/dist-packages/tensorflow/python/framework/ops.py", line 2005, in __init__
    self._traceback = tf_stack.extract_stack()

BrikerMan commented 5 years ago

The CRF layer does not support multi-GPU (#174) or TPU training yet. For now, you can use the BiLSTM model (without the CRF layer) to train on TPU. We will try to make the CRF layer compatible with both multi-GPU and TPU.
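A minimal sketch of that workaround, assuming the same Colab TPU setup and TensorFlow 1.x contrib API as in the report. The only substantive change is BiLSTM_Model in place of BiLSTM_CRF_Model. The helper names uses_crf and train_bilstm_on_tpu are hypothetical, not part of Kashgari, and the sketch has not been run on a TPU.

```python
import os


def uses_crf(model_class_name):
    """Hypothetical guard: flag model classes whose name indicates a CRF layer."""
    return "CRF" in model_class_name.upper()


def train_bilstm_on_tpu(bert_path, train_x, train_y, test_x, test_y,
                        sequence_length=110):
    """Hypothetical helper mirroring the reported script, minus the CRF layer."""
    # Deferred imports: tf.contrib only exists in TensorFlow 1.x, so this
    # file can still be loaded on machines without that environment.
    import tensorflow as tf
    import kashgari
    from kashgari.embeddings import BERTEmbedding
    from kashgari.tasks.labeling import BiLSTM_Model

    embedding = BERTEmbedding(bert_path,
                              task=kashgari.LABELING,
                              sequence_length=sequence_length)
    model = BiLSTM_Model(embedding)  # no CRF layer

    # Same TPU connection code as in the original report.
    tpu_address = 'grpc://' + os.environ['COLAB_TPU_ADDR']
    strategy = tf.contrib.tpu.TPUDistributionStrategy(
        tf.contrib.cluster_resolver.TPUClusterResolver(tpu=tpu_address)
    )

    model.build_tpu_model(strategy, train_x, train_y, test_x, test_y)
    model.compile_model()
    model.fit(train_x, train_y, test_x, test_y, batch_size=20, epochs=4)
    model.evaluate(test_x, test_y)
    return model
```

With the data prepared as in the report, the call would be train_bilstm_on_tpu('chinese_L-12_H-768_A-12', train_x, train_y, test_x, test_y). Dropping the CRF layer removes the learned tag-transition scores, so results may differ from what BiLSTM_CRF_Model would give.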

stale[bot] commented 5 years ago

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.