crownpku / Information-Extraction-Chinese

Chinese Named Entity Recognition with IDCNN/biLSTM+CRF, and Relation Extraction with biGRU+2ATT 中文实体识别与关系提取
2.22k stars 814 forks

Rewriting this to provide an API interface #46

Open lovenodejs opened 6 years ago

lovenodejs commented 6 years ago

Hi, I want to rewrite this as an API using Flask to provide a web service, but the results that come out are very poor. Here is the code:

    def predict_line():
        config = load_config(FLAGS.config_file)
        logger = get_logger(FLAGS.log_file)
        # limit GPU memory
        tf_config = tf.ConfigProto()
        tf_config.gpu_options.allow_growth = True
        with open(FLAGS.map_file, "rb") as f:
            char_to_id, id_to_char, tag_to_id, id_to_tag = pickle.load(f)
        with tf.Session(config=tf_config) as sess:
            model = create_model(sess, Model, FLAGS.ckpt_path, load_word2vec, config, id_to_char, logger)
            #result = model.evaluate_line(sess, input_from_line(line, char_to_id), id_to_tag)
            #return result
            return model

I return the model first and then call it from Flask, because the original code would otherwise create_model on every request.

    import tensorflow as tf
    import numpy as np
    from model import Model
    from loader import load_sentences, update_tag_scheme
    from loader import char_mapping, tag_mapping
    from loader import augment_with_pretrained, prepare_dataset
    from utils import get_logger, make_path, clean, create_model, save_model
    from utils import print_config, save_config, load_config, test_ner
    from data_utils import load_word2vec, create_input, input_from_line, BatchManager

    flags = tf.app.flags

    flags.DEFINE_string("map_file", "maps.pkl", "file for maps")

    app = Flask(__name__)

    predictmodel = predict_line()
    print('model is loaded')

    @app.route('/getNameModel', methods=['POST'])
    def getNameModel():
        title = request.json['title']
        print(title)
        tf_config = tf.ConfigProto()
        tf_config.gpu_options.allow_growth = True
        with open("maps.pkl", "rb") as f:
            char_to_id, id_to_char, tag_to_id, id_to_tag = pickle.load(f)
        result = ''
        with tf.Session(config=tf_config) as sess:
            sess.run(tf.global_variables_initializer())
            result = predictmodel.evaluate_line(sess, input_from_line(title, char_to_id), id_to_tag)
            print(result)
            return json.dumps(result, ensure_ascii=False)

But the result that comes back is:

    {"entities": [{"end": 2, "start": 1, "type": "PER", "word": "想"}, {"end": 3, "start": 2, "type": "PER", "word": "集"}, {"end": 7, "start": 6, "type": "LOC", "word": "部"}, {"end": 8, "start": 7, "type": "ORG", "word": "位"}, {"end": 14, "start": 13, "type": "ORG", "word": "席"}, {"end": 15, "start": 12, "type": "LOC", "word": "联团的总于北京,首执"}, {"end": 16, "start": 15, "type": "PER", "word": "行"}], "string": "联想集团的总部位于北京,首席执行官是杨元庆先生"}

What could be causing this? Thanks.

crownpku commented 6 years ago

Great idea! I haven't looked closely at your code yet, but let me point out two things to check first:

  1. Make sure the model has actually been loaded
  2. Make sure the character encoding Flask passes in is correct

You are also welcome to submit a pull request adding this feature.

lovenodejs commented 6 years ago

Take a look at my code above. The problem boils down to this part:

    with tf.Session(config=tf_config) as sess:
        sess.run(tf.global_variables_initializer())
        result = predictmodel.evaluate_line(sess, input_from_line(title, char_to_id), id_to_tag)
        print(result)
        return json.dumps(result, ensure_ascii=False)

I added the sess.run(tf.global_variables_initializer()) line, because without it I get the following error:

    Traceback (most recent call last):
      File "/root/anaconda2/envs/tensorflow36/lib/python3.6/site-packages/flask/app.py", line 1982, in wsgi_app
        response = self.full_dispatch_request()
      File "/root/anaconda2/envs/tensorflow36/lib/python3.6/site-packages/flask/app.py", line 1614, in full_dispatch_request
        rv = self.handle_user_exception(e)
      File "/root/anaconda2/envs/tensorflow36/lib/python3.6/site-packages/flask/app.py", line 1517, in handle_user_exception
        reraise(exc_type, exc_value, tb)
      File "/root/anaconda2/envs/tensorflow36/lib/python3.6/site-packages/flask/_compat.py", line 33, in reraise
        raise value
      File "/root/anaconda2/envs/tensorflow36/lib/python3.6/site-packages/flask/app.py", line 1612, in full_dispatch_request
        rv = self.dispatch_request()
      File "/root/anaconda2/envs/tensorflow36/lib/python3.6/site-packages/flask/app.py", line 1598, in dispatch_request
        return self.view_functions[rule.endpoint](**req.view_args)
      File "app.py", line 41, in getNameModel
        result = predictmodel.evaluate_line(sess, input_from_line(title, char_to_id), id_to_tag)
      File "/home/wzxy/laishaohui/ner/NER_IDCNN_CRF/model.py", line 395, in evaluate_line
        trans = self.trans.eval()
      File "/root/anaconda2/envs/tensorflow36/lib/python3.6/site-packages/tensorflow/python/ops/variables.py", line 463, in eval
        return self._variable.eval(session=session)
      File "/root/anaconda2/envs/tensorflow36/lib/python3.6/site-packages/tensorflow/python/framework/ops.py", line 606, in eval
        return _eval_using_default_session(self, feed_dict, self.graph, session)
      File "/root/anaconda2/envs/tensorflow36/lib/python3.6/site-packages/tensorflow/python/framework/ops.py", line 3928, in _eval_using_default_session
        return session.run(tensors, feed_dict)
      File "/root/anaconda2/envs/tensorflow36/lib/python3.6/site-packages/tensorflow/python/client/session.py", line 789, in run
        run_metadata_ptr)
      File "/root/anaconda2/envs/tensorflow36/lib/python3.6/site-packages/tensorflow/python/client/session.py", line 997, in _run
        feed_dict_string, options, run_metadata)
      File "/root/anaconda2/envs/tensorflow36/lib/python3.6/site-packages/tensorflow/python/client/session.py", line 1132, in _do_run
        target_list, options, run_metadata)
      File "/root/anaconda2/envs/tensorflow36/lib/python3.6/site-packages/tensorflow/python/client/session.py", line 1152, in _do_call
        raise type(e)(node_def, op, message)
    tensorflow.python.framework.errors_impl.FailedPreconditionError: Attempting to use uninitialized value crf_loss/transitions
      [[Node: _retval_crf_loss/transitions_0_0 = _Retval[T=DT_FLOAT, index=0, _device="/job:localhost/replica:0/task:0/cpu:0"]]]

With sess.run(tf.global_variables_initializer()) added there is no error, but the model's output is very poor:

    {"entities": [{"end": 2, "start": 1, "type": "PER", "word": "想"}, {"end": 3, "start": 2, "type": "PER", "word": "集"}, {"end": 7, "start": 6, "type": "LOC", "word": "部"}, {"end": 8, "start": 7, "type": "ORG", "word": "位"}, {"end": 14, "start": 13, "type": "ORG", "word": "席"}, {"end": 15, "start": 12, "type": "LOC", "word": "联团的总于北京,首执"}, {"end": 16, "start": 15, "type": "PER", "word": "行"}], "string": "联想集团的总部位于北京,首席执行官是杨元庆先生"}

This is the one problem I cannot solve. If I can get it solved I would be very happy to submit a pull request.

lovenodejs commented 6 years ago

The encoding is correct, and the model really is loaded:

    /root/anaconda2/envs/tensorflow36/lib/python3.6/site-packages/tensorflow/python/ops/gradients_impl.py:93: UserWarning: Converting sparse IndexedSlices to a dense Tensor of unknown shape. This may consume a large amount of memory.
      "Converting sparse IndexedSlices to a dense Tensor of unknown shape. "
    2018-04-12 11:00:14,732 - train.log - INFO - Reading model parameters from ckpt_IDCNN/ner.ckpt
    model is loaded

crownpku commented 6 years ago

That's the bug: the sess you load the model in and the sess you run prediction in are not the same session, and calling sess.run(tf.global_variables_initializer()) effectively randomizes all the parameters, so the loaded model is never used. You need to move the whole Flask part into the original code, inside the same scope as

    with tf.Session(config=tf_config) as sess:
        model = create_model(sess, Model, FLAGS.ckpt_path, load_word2vec, config, id_to_char, logger)

so that everything runs against that one session.
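crownpku's diagnosis can be illustrated without TensorFlow at all: variable values live inside a session, so a model restored into one session is invisible to another, and running an initializer overwrites restored values with fresh random ones. A toy sketch of that reasoning (every name below is invented for illustration, not the repo's API):

```python
import random

class ToySession:
    """Stand-in for a TF session: variable values live per-session."""
    def __init__(self):
        self.values = {}

def restore_checkpoint(sess):
    # analogous to create_model() restoring trained weights into `sess`
    sess.values["crf_loss/transitions"] = [0.5, -1.2, 3.0]

def run_initializer(sess):
    # analogous to sess.run(tf.global_variables_initializer())
    sess.values["crf_loss/transitions"] = [random.random() for _ in range(3)]

load_sess = ToySession()
restore_checkpoint(load_sess)        # model loaded here...

serve_sess = ToySession()            # ...but prediction uses a new session
# serve_sess holds no values at all -> "Attempting to use uninitialized value"
assert "crf_loss/transitions" not in serve_sess.values

run_initializer(serve_sess)          # the initializer silences the error...
trained = load_sess.values["crf_loss/transitions"]
# ...but the weights are random, not the trained ones -> garbage predictions
assert serve_sess.values["crf_loss/transitions"] != trained
```

The fix is therefore not to initialize again but to do prediction with the very session the checkpoint was restored into.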

lovenodejs commented 6 years ago

    @app.route('/getNameModel', methods=['POST'])
    def predict_line():
        config = load_config(FLAGS.config_file)
        logger = get_logger(FLAGS.log_file)
        # limit GPU memory
        tf_config = tf.ConfigProto()
        tf_config.gpu_options.allow_growth = True
        with open(FLAGS.map_file, "rb") as f:
            char_to_id, id_to_char, tag_to_id, id_to_tag = pickle.load(f)
        result = ''
        with tf.Session(config=tf_config) as sess:
            sess.run(tf.global_variables_initializer())
            model = create_model(sess, Model, FLAGS.ckpt_path, load_word2vec, config, id_to_char, logger)
            title = request.json['title']
            print(title)
            result = model.evaluate_line(sess, input_from_line(title, char_to_id), id_to_tag)
        return json.dumps(result, ensure_ascii=False)

But this way the first call works fine and returns the correct result; the second call raises an error.

lovenodejs commented 6 years ago

And create_model runs on every incoming request.

crownpku commented 6 years ago

    sess = tf.Session(config=tf_config)
    sess.run(tf.global_variables_initializer())
    model = create_model(sess, Model, FLAGS.ckpt_path, load_word2vec, config, id_to_char, logger)

Then try passing both the sess and model variables into your Flask function.
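What is being suggested here is the standard load-once pattern: pay the create_model cost a single time at startup, then let every request reuse the same sess/model pair. A framework-free sketch of that pattern (expensive_load is a hypothetical stand-in for create_model, not the repo's API):

```python
class ModelHolder:
    """Load an expensive resource once; every request reuses it."""
    def __init__(self, loader):
        self._loader = loader
        self._resource = None

    def get(self):
        if self._resource is None:   # only the first call pays the cost
            self._resource = self._loader()
        return self._resource

load_count = 0

def expensive_load():                # stands in for create_model(sess, ...)
    global load_count
    load_count += 1
    return "model+sess"

holder = ModelHolder(expensive_load)
for _ in range(3):                   # simulate three incoming requests
    assert holder.get() == "model+sess"
assert load_count == 1               # the model was loaded exactly once
```

Note this sketch is single-threaded; a multi-threaded Flask deployment would also need the handler to use the same graph and session it was loaded with, which is exactly the issue being debugged in this thread.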

lovenodejs commented 6 years ago

Like this:

    def predict_line():
        config = load_config(FLAGS.config_file)
        logger = get_logger(FLAGS.log_file)
        # limit GPU memory
        tf_config = tf.ConfigProto()
        tf_config.gpu_options.allow_growth = True
        with open(FLAGS.map_file, "rb") as f:
            char_to_id, id_to_char, tag_to_id, id_to_tag = pickle.load(f)
        with tf.Session(config=tf_config) as sess:
            model = create_model(sess, Model, FLAGS.ckpt_path, load_word2vec, config, id_to_char, logger)
            return model, sess

This returns both model and sess.

Then in Flask I receive them:

    predictmodel, sess = predict_line()

    print('model is loaded')

    @app.route('/getNameModel', methods=['POST'])
    def getNameModel():
        title = request.json['title']
        print(title)
        tf_config = tf.ConfigProto()
        tf_config.gpu_options.allow_growth = True
        with open("maps.pkl", "rb") as f:
            char_to_id, id_to_char, tag_to_id, id_to_tag = pickle.load(f)
        result = predictmodel.evaluate_line(sess, input_from_line(title, char_to_id), id_to_tag)
        print(result)
        return json.dumps(result, ensure_ascii=False)

But it raises:

    [2018-04-12 16:15:17,870] ERROR in app: Exception on /getNameModel [POST]
    Traceback (most recent call last):
      File "/root/anaconda2/envs/tensorflow36/lib/python3.6/site-packages/flask/app.py", line 1982, in wsgi_app
        response = self.full_dispatch_request()
      File "/root/anaconda2/envs/tensorflow36/lib/python3.6/site-packages/flask/app.py", line 1614, in full_dispatch_request
        rv = self.handle_user_exception(e)
      File "/root/anaconda2/envs/tensorflow36/lib/python3.6/site-packages/flask/app.py", line 1517, in handle_user_exception
        reraise(exc_type, exc_value, tb)
      File "/root/anaconda2/envs/tensorflow36/lib/python3.6/site-packages/flask/_compat.py", line 33, in reraise
        raise value
      File "/root/anaconda2/envs/tensorflow36/lib/python3.6/site-packages/flask/app.py", line 1612, in full_dispatch_request
        rv = self.dispatch_request()
      File "/root/anaconda2/envs/tensorflow36/lib/python3.6/site-packages/flask/app.py", line 1598, in dispatch_request
        return self.view_functions[rule.endpoint](**req.view_args)
      File "app.py", line 38, in getNameModel
        result = predictmodel.evaluate_line(sess, input_from_line(title, char_to_id), id_to_tag)
      File "/home/wzxy/laishaohui/ner/NER_IDCNN_CRF/model.py", line 395, in evaluate_line
        trans = self.trans.eval()
      File "/root/anaconda2/envs/tensorflow36/lib/python3.6/site-packages/tensorflow/python/ops/variables.py", line 463, in eval
        return self._variable.eval(session=session)
      File "/root/anaconda2/envs/tensorflow36/lib/python3.6/site-packages/tensorflow/python/framework/ops.py", line 606, in eval
        return _eval_using_default_session(self, feed_dict, self.graph, session)
      File "/root/anaconda2/envs/tensorflow36/lib/python3.6/site-packages/tensorflow/python/framework/ops.py", line 3914, in _eval_using_default_session
        raise ValueError("Cannot evaluate tensor using eval(): No default "
    ValueError: Cannot evaluate tensor using eval(): No default session is registered. Use with sess.as_default() or pass an explicit session to eval(session=sess)

xtuyaowu commented 6 years ago

Has this problem been solved yet?

lovenodejs commented 6 years ago

Not yet. It's strange, there is always some problem.

xtuyaowu commented 6 years ago

I tried it too and couldn't solve it.

tjuGaoxp commented 6 years ago

This seems to work:

1. Create a new .py file:

    from flask import Flask, request, json
    from main import *

    app = Flask(__name__)

    config = load_config(FLAGS.config_file)
    logger = get_logger(FLAGS.log_file)
    # limit GPU memory
    tf_config = tf.ConfigProto()
    tf_config.gpu_options.allow_growth = True
    f = open(FLAGS.map_file, "rb")
    tf_sess = tf.Session(config=tf_config)
    char_to_id, id_to_char, tag_to_id, id_to_tag = pickle.load(f)
    model = create_model(tf_sess, Model, FLAGS.ckpt_path, load_word2vec, config, id_to_char, logger)

    @app.route('/predict_ner', methods=['POST'])
    def predict_ner():
        sentence = request.form.to_dict().get('sentence')
        print(sentence)
        result = model.evaluate_line(tf_sess, input_from_line(sentence, char_to_id), id_to_tag)
        print(result)
        return json.dumps(result, ensure_ascii=False)

    if __name__ == '__main__':
        app.run(debug='true')

2. Change line 395 of model.py to:

    trans = self.trans.eval(sess)
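For context on why the model.py change helps: calling eval() with no argument makes the variable look up a "default session", which only exists inside a with tf.Session(...) as sess: or sess.as_default() block, and a Flask handler runs outside any such block, hence the ValueError earlier in this thread. Passing the session explicitly sidesteps that lookup. A toy analogue of the mechanism (not TF's actual implementation):

```python
import contextlib

_default_session = None  # toy analogue of TF's default-session stack

@contextlib.contextmanager
def as_default(sess):
    """Install `sess` as the default for the duration of the block."""
    global _default_session
    prev, _default_session = _default_session, sess
    try:
        yield
    finally:
        _default_session = prev

class ToyVariable:
    """eval() uses an explicit session first, then the default one."""
    def eval(self, session=None):
        s = session if session is not None else _default_session
        if s is None:
            raise ValueError("No default session is registered.")
        return "value from " + s

v = ToyVariable()
try:
    v.eval()                       # no block, no explicit session -> error
    outside_ok = True
except ValueError:
    outside_ok = False

with as_default("sess"):
    inside = v.eval()              # default session found inside the block

explicit = v.eval(session="sess")  # an explicit session works anywhere

assert outside_ok is False
assert inside == "value from sess"
assert explicit == "value from sess"
```

This is why passing the long-lived tf_sess into evaluate_line (or wrapping the handler body in tf_sess.as_default()) resolves the error without modifying how the model was loaded.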