PaddlePaddle / PaddleNLP

👑 Easy-to-use and powerful NLP and LLM library with 🤗 Awesome model zoo, supporting wide-range of NLP tasks from research to industrial applications, including 🗂Text Classification, 🔍 Neural Search, ❓ Question Answering, ℹ️ Information Extraction, 📄 Document Intelligence, 💌 Sentiment Analysis etc.
https://paddlenlp.readthedocs.io
Apache License 2.0

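When using ernie + crf for a sequence labeling task, the model cannot be saved with paddle.jit.save. The CRF code bundled with paddlenlp may be written in a way that does not support dynamic-to-static conversion. Will the CRF code be adjusted later, or is the CRF from the static-graph version the recommended choice? #419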

Closed: Wakp closed this issue 3 years ago

joey12300 commented 3 years ago

@Wakp Hello, which version of paddlenlp are you using? Could you share your code?

Wakp commented 3 years ago

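Hello, I am using paddlenlp 2.0.0rc18c. I have put together a demo. After debugging, I found that the likely cause is that some constructs in the CRF code cannot be recognized and converted during dynamic-to-static conversion. For example, line 389 of crf.py, "scores, last_ids = alpha.max(1), alpha.argmax(1).numpy().tolist()", is mis-parsed during conversion as "convert_call(alpha.argmax)(1).tolist)()". In addition, common Python built-ins such as "reversed()" also raise errors during dynamic-to-static conversion.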

———————————————————demo———————————————————

import paddle
import paddle.nn as nn
from paddlenlp.transformers import BertModel, BertTokenizer, ErnieModel, ErnieTokenizer
from paddlenlp.layers import crf
import numpy as np

class MutilErnieCrf(nn.Layer):
    def __init__(self, out_num=3, model_name_or_path='bert-wwm-chinese', dropout=None):
        super(MutilErnieCrf, self).__init__()
        self.ernie = BertModel.from_pretrained(model_name_or_path)
        self.linear_to_label = nn.Linear(768, out_num)
        self.crf = crf.LinearChainCrf(out_num, crf_lr=0.1, with_start_stop_tag=False)
        self.crf_loss = crf.LinearChainCrfLoss(self.crf)
        self.viterbi = crf.ViterbiDecoder(self.crf.transitions, with_start_stop_tag=False)
        self.relu = nn.ReLU6()
        self.dropout = nn.Dropout(dropout if dropout is not None else 0.1)

    def forward(self, X, lengths):
        # Encoder output -> dropout -> per-token label scores -> Viterbi decoding
        embedding = self.ernie(X)[0]
        seq_len = embedding.shape[1]
        feature_drop = self.dropout(embedding)
        feature_linear = self.linear_to_label(feature_drop)
        res_feature = self.relu(feature_linear)
        scores, path = self.viterbi(res_feature, lengths)
        # Note: numpy() is not supported by dynamic-to-static conversion
        batch_path = list(path.numpy()) + [0] * (seq_len - len(path))
        return batch_path

    def compute_loss(self, inputs, lengths, labels, old_version_labels=None):
        loss = self.crf_loss(inputs, lengths, labels, old_version_labels=None)
        return loss

# Build a single dummy sample and run it once in dynamic-graph mode
inputs = np.asarray([[1, 2, 3, 4, 5, 6, 7, 8]], dtype="int32")
lengths = np.asarray([8], dtype="int64")
labels = np.asarray([1, 2, 3, 1, 2, 3, 1, 2], dtype="int64")
inputs_tensor = paddle.to_tensor(inputs, dtype="int64")
lengths_tensor = paddle.to_tensor(lengths, dtype="int64")
labels_tensor = paddle.to_tensor(labels, dtype="int64")
model = MutilErnieCrf()
Y = model(inputs_tensor, lengths_tensor)
print(Y)

# Exporting the model is the step that fails
path = "example.dy_model/linear"
paddle.jit.save(
    layer=model,
    path=path,
    input_spec=[inputs_tensor, lengths_tensor])
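For context, the line quoted from crf.py appears to be the max/argmax step of Viterbi decoding, and `reversed()` is typically used when backtracking through the stored pointers. The following is a minimal pure-Python sketch of that algorithm for a single sequence, assuming no start/stop tags. It is not PaddleNLP's implementation, and the name `viterbi_decode` and its arguments are hypothetical, chosen only for illustration.

```python
from typing import List, Tuple


def viterbi_decode(emissions: List[List[float]],
                   transitions: List[List[float]]) -> Tuple[float, List[int]]:
    """Return (best_score, best_path) for one non-empty sequence.

    emissions[t][j]   : score of tag j at time step t
    transitions[i][j] : score of moving from tag i to tag j
    """
    num_tags = len(transitions)
    # alpha[j] = best score of any partial path that ends in tag j
    alpha = list(emissions[0])
    backpointers: List[List[int]] = []

    for emit in emissions[1:]:
        next_alpha, pointers = [], []
        for j in range(num_tags):
            candidates = [alpha[i] + transitions[i][j] for i in range(num_tags)]
            # argmax over the previous tag, then the matching max
            best_prev = max(range(num_tags), key=candidates.__getitem__)
            pointers.append(best_prev)
            next_alpha.append(candidates[best_prev] + emit[j])
        alpha = next_alpha
        backpointers.append(pointers)

    # Final max/argmax over alpha picks the best last tag
    last_tag = max(range(num_tags), key=alpha.__getitem__)

    # Backtrack from the last step to the first using reversed()
    path = [last_tag]
    for pointers in reversed(backpointers):
        path.append(pointers[path[-1]])
    path.reverse()
    return alpha[last_tag], path
```

The data-dependent Python control flow here (list indexing by computed values, `reversed()` over a Python list) is the kind of code that is easy to write in dynamic-graph mode but has to be expressed with tensor operations before a model can be exported.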
joey12300 commented 3 years ago

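@Wakp Hello, CRF now supports dynamic-to-static conversion in the latest version (rc25). Your line batch_path = list(path.numpy()) + [0] * (seq_len - len(path)) uses numpy(), which dynamic-to-static conversion does not support; you can use the returned path directly as batch_path. Try pip install --upgrade paddlenlp -i https://pypi.org/simple to update paddlenlp. You can follow the lac model export example to complete the conversion; pay attention to how the InputSpec arguments are written. https://github.com/PaddlePaddle/PaddleNLP/blob/develop/examples/lexical_analysis/export_model.py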
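If padded paths are still needed, one option is to do the padding on the host after the model call, so that no numpy() call remains inside forward. The sketch below is pure Python and assumes the decoded paths have already been converted to nested lists outside the model; the helper name `pad_paths` and the `pad_id` parameter are hypothetical.

```python
from typing import List, Sequence


def pad_paths(paths: Sequence[Sequence[int]],
              lengths: Sequence[int],
              seq_len: int,
              pad_id: int = 0) -> List[List[int]]:
    """Keep the first `length` tags of each path and pad the rest to seq_len."""
    padded = []
    for path, length in zip(paths, lengths):
        # Drop positions beyond the true sequence length (capped at seq_len)
        valid = list(path[:min(length, seq_len)])
        # Fill the remaining positions with the padding id
        padded.append(valid + [pad_id] * (seq_len - len(valid)))
    return padded


# Example with two decoded paths of true lengths 2 and 3, padded to 5 positions
print(pad_paths([[1, 2, 0], [2, 1, 1]], [2, 3], 5))
```

Keeping this step outside the model means the exported graph only contains tensor operations, while list handling stays in ordinary Python.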

Wakp commented 3 years ago

OK, thank you.

joey12300 commented 3 years ago

@Wakp I just wrote a version based on your code, for reference:

import paddle
import paddle.nn as nn
from paddlenlp.transformers import BertModel, BertTokenizer, ErnieModel, ErnieTokenizer
from paddlenlp.layers import crf
from paddle.static import InputSpec
import numpy as np

class MutilErnieCrf(nn.Layer):
    def __init__(self, out_num=3, model_name_or_path='bert-wwm-chinese', dropout=None):
        super(MutilErnieCrf, self).__init__()
        self.ernie = BertModel.from_pretrained(model_name_or_path)
        self.linear_to_label = nn.Linear(768, out_num)
        self.crf = crf.LinearChainCrf(out_num, crf_lr=0.1, with_start_stop_tag=False)
        self.crf_loss = crf.LinearChainCrfLoss(self.crf)
        self.viterbi = crf.ViterbiDecoder(self.crf.transitions, with_start_stop_tag=False)
        self.relu = nn.ReLU6()
        self.dropout = nn.Dropout(dropout if dropout is not None else 0.1)

    def forward(self, X, lengths):
        embedding = self.ernie(X)[0]
        seq_len = embedding.shape[1]
        feature_drop = self.dropout(embedding)
        feature_linear = self.linear_to_label(feature_drop)
        res_feature = self.relu(feature_linear)
        scores, path = self.viterbi(res_feature, lengths)
        return path

    def compute_loss(self, inputs, lengths, labels, old_version_labels=None):
        loss = self.crf_loss(inputs, lengths, labels, old_version_labels=None)
        return loss

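model = MutilErnieCrf()
# Switch to inference mode so that dropout is disabled in the exported model
model.eval()
# Declare the input signature; None marks dimensions that may vary (batch size, sequence length)
model = paddle.jit.to_static(model, input_spec=[
    InputSpec(shape=[None, None], dtype="int64", name='token_ids'),
    InputSpec(shape=[None], dtype="int64", name='length')])

# Save the converted model under the path prefix "test"
paddle.jit.save(model, "test")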