thu-coai / CrossWOZ

A Large-Scale Chinese Cross-Domain Task-Oriented Dialogue Dataset
Apache License 2.0

RuntimeError: Error(s) in loading state_dict for JointBERT #22

Closed lerry-lee closed 3 years ago

lerry-lee commented 3 years ago

When I execute convlab2/policy/mle/crosswoz/evaluate.py, or use ConvLab-2 to build a system for testing, an error occurs after the model is loaded:

Load from model_file param
Load from .../CrossWOZ/convlab2/nlu/jointBERT/crosswoz/output/all_context/pytorch_model.bin
hfl/chinese-bert-wwm-ext
RuntimeError: Error(s) in loading state_dict for JointBERT:
    Missing key(s) in state_dict: "bert.embeddings.position_ids". 

I have downloaded the NLU model, but how can I solve this problem?

zqwerty commented 3 years ago

It may be caused by a mismatch between the checkpoint and the installed transformers package. Which version of transformers are you using? Please see https://github.com/thu-coai/ConvLab-2/issues/109
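
To illustrate what the error message means: in strict mode, PyTorch's `load_state_dict` raises when the model expects parameter or buffer names that the checkpoint file does not contain. The sketch below reproduces that comparison with plain Python sets. The function name `diff_state_dict_keys` and every key except `bert.embeddings.position_ids` are made up for illustration; they are not taken from the JointBERT code.

```python
def diff_state_dict_keys(model_keys, checkpoint_keys):
    """Return (missing, unexpected) key lists, mirroring a strict load check.

    missing    = names the model expects but the checkpoint lacks
    unexpected = names the checkpoint has but the model does not know
    """
    model_keys = set(model_keys)
    checkpoint_keys = set(checkpoint_keys)
    missing = sorted(model_keys - checkpoint_keys)
    unexpected = sorted(checkpoint_keys - model_keys)
    return missing, unexpected


# Illustrative key names only: a model built with a newer library version
# that registers an extra buffer, and a checkpoint saved before it existed.
model_keys = [
    "bert.embeddings.position_ids",
    "bert.embeddings.word_embeddings.weight",
    "intent_classifier.weight",
]
checkpoint_keys = [
    "bert.embeddings.word_embeddings.weight",
    "intent_classifier.weight",
]

missing, unexpected = diff_state_dict_keys(model_keys, checkpoint_keys)
print("Missing key(s):", missing)
print("Unexpected key(s):", unexpected)
```

`load_state_dict` also accepts `strict=False`, which skips missing keys instead of raising. Whether that is safe for this checkpoint has not been verified in this thread; matching the transformers version to the checkpoint is the suggestion given here.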

lerry-lee commented 3 years ago

Thanks. My transformers version is 4.3.3; I will downgrade and try again.

lerry-lee commented 3 years ago

Following the ConvLab-2 example, I assembled a dialogue system from BERTNLU, RuleDST, RulePolicy, and TemplateNLG, but the result is "Error in processing". The code is as follows:

# common import: convlab2.$module.$model.$dataset
from convlab2.nlu.jointBERT.crosswoz import BERTNLU
from convlab2.dst.rule.crosswoz.dst import RuleDST
from convlab2.policy.rule.crosswoz import Simulator
from convlab2.nlg.template.crosswoz import TemplateNLG
from convlab2.dialog_agent import PipelineAgent, BiSession
from pprint import pprint
import random
import numpy as np
import torch

sys_nlu = BERTNLU()
# simple rule DST
sys_dst = RuleDST()
# rule policy
sys_policy = Simulator()
# template NLG
sys_nlg = TemplateNLG(is_user=False)
# assemble
sys_agent = PipelineAgent(sys_nlu, sys_dst, sys_policy, sys_nlg, name='sys')

sys_agent.response("你好,可以帮我推荐一家评分5分以上的景点吗?")

The result is:

sys Request+酒店+名称 Inform+酒店+评分
sys Request+酒店+名称 Inform+酒店+评分

Error in processing:
[['General', 'greet', 'none', 'none'],
 ['Inform', '酒店', '价格', '200-300元'],
 ['Inform', '酒店', '评分', '4.5分以上'],
 ['Request', '酒店', '名称', '']]
''

When I try another user utterance, sys_agent.response("你好,帮我找一个免费的景点"), I get the same error:

sys Request+景点+名称 Inform+景点+门票+免费
sys Request+景点+名称 Inform+景点+门票+免费

Error in processing:
[['General', 'greet', 'none', 'none'],
 ['Inform', '景点', '游玩时间', '3小时 - 4小时'],
 ['Inform', '景点', '门票', '免费'],
 ['Request', '景点', '名称', '']]
''

lerry-lee commented 3 years ago

When I tested TemplateNLG alone,

# convlab2/nlg/template/crosswoz/nlg.py
if __name__ == '__main__':
    nlg = TemplateNLG(is_user=False)
    print(nlg.generate([['Inform', '地铁', '目的地', '云峰山'], ['Request', '地铁', '出发地', '']]))

I also got an error:

sys Request+地铁+出发地 Inform+地铁+目的地
sys Request+地铁+出发地 Inform+地铁+目的地

Error in processing:
[['Inform', '地铁', '目的地', '云峰山'], ['Request', '地铁', '出发地', '']]

Can anyone help me? I want to reproduce the dialogue example in the paper, but I am worried that my approach is wrong.

zqwerty commented 3 years ago

Sorry for the late reply. You installed ConvLab-2 from that repo (https://github.com/thu-coai/Convlab-2), right? I will check in a few days.

lerry-lee commented 3 years ago

Yes, I believe it is installed correctly, because the combination BERTNLU+RuleDST+MLE+TemplateNLG runs normally:

# common import: convlab2.$module.$model.$dataset
from convlab2.nlu.jointBERT.crosswoz import BERTNLU
from convlab2.dst.rule.crosswoz.dst import RuleDST
from convlab2.dst.trade.crosswoz.trade import CrossWOZTRADE
from convlab2.policy.rule.crosswoz import Simulator
from convlab2.policy.mle.crosswoz import MLE
from convlab2.nlg.template.crosswoz import TemplateNLG
from convlab2.nlg.sclstm.crosswoz import SCLSTM
from convlab2.dialog_agent import PipelineAgent, BiSession

sys_nlu = BERTNLU()
sys_dst = RuleDST()
# sys_policy = Simulator()
sys_policy = MLE()
sys_nlg = TemplateNLG(is_user=False)
sys_agent = PipelineAgent(sys_nlu, sys_dst, sys_policy, sys_nlg, name='sys')

sys_agent.response("你好,可以帮我推荐一家评分5分以上的景点吗?")
# output
#'您可以带她去红砖美术馆哟,很不错呢。'

zqwerty commented 3 years ago

I see. You should not use Simulator() as the system policy; Simulator is the user policy.

lerry-lee commented 3 years ago

Thanks for your reply. So there is no Rule Policy in the source code, right? I also have another problem: when I use the combination TRADE+MLE Policy+TemplateNLG, it reports an error.

from convlab2.nlu.jointBERT.crosswoz import BERTNLU
from convlab2.dst.rule.crosswoz.dst import RuleDST
from convlab2.dst.trade.crosswoz.trade import CrossWOZTRADE
from convlab2.policy.mle.crosswoz import MLE
from convlab2.nlg.template.crosswoz import TemplateNLG
from convlab2.nlg.sclstm.crosswoz import SCLSTM
from convlab2.dialog_agent import PipelineAgent, BiSession

sys_dst = CrossWOZTRADE()
sys_policy = MLE()
sys_nlg = TemplateNLG(is_user=False)
# sys_nlg = SCLSTM(is_user=False)
sys_agent = PipelineAgent(None, sys_dst, sys_policy, sys_nlg, name='sys')

sys_agent.response("你好,可以帮我推荐一家评分5分以上的景点吗?")

The error message is:

---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
<ipython-input-6-7d64c0a52c85> in <module>()
----> 1 sys_agent.response("你好,可以帮我推荐一家评分5分以上的景点吗?")

~/code/python/crosswoz&convlab2_gitee/ConvLab-2/convlab2/dialog_agent/agent.py in response(self, observation)
    174         # the policy generates the output action from the previous state and the current input_action
    175         # ------------------------------------
--> 176         self.output_action = self.policy.predict(state)
    177         self.output_action = deepcopy(self.output_action)
    178 

~/code/python/crosswoz&convlab2_gitee/ConvLab-2/convlab2/policy/mle/mle.py in predict(self, state)
     24             action : System act, with the form of (act_type, {slot_name_1: value_1, slot_name_2, value_2, ...})
     25         """
---> 26         s_vec = torch.Tensor(self.vector.state_vectorize(state))
     27         # sys_action_encode_vec
     28         a = self.policy.select_action(s_vec.to(device=DEVICE), False).cpu()

~/code/python/crosswoz&convlab2_gitee/ConvLab-2/convlab2/policy/vector/vector_crosswoz.py in state_vectorize(self, state)
     55         # each action is encoded as 1, all others as 0
     56         da = state['user_action']
---> 57         da = delexicalize_da(da)
     58         usr_act_vec = np.zeros(self.usr_da_dim)
     59         for a in da:

~/code/python/crosswoz&convlab2_gitee/ConvLab-2/convlab2/util/crosswoz/lexicalize.py in delexicalize_da(da)
     11     delexicalized_da = []
     12     counter = {}
---> 13     for intent, domain, slot, value in da:
     14         if intent in ['Inform', 'Recommend']:
     15             key = '+'.join([intent, domain, slot])

ValueError: not enough values to unpack (expected 4, got 1)

zqwerty commented 3 years ago

So there is no Rule Policy in the source code, right?

Yes.

We have not used TRADE for end-to-end evaluation, since it does not produce the user action, which is important to the system policy.
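
For context on the ValueError in the traceback, one plausible reading (an assumption, not confirmed in this thread) is that with `None` passed as the NLU, the raw utterance string reaches `delexicalize_da` as `state['user_action']`. Iterating over a string yields single characters, and unpacking one character into four names fails with exactly that message. A minimal sketch, where `delexicalize_da_sketch` is a hypothetical stand-in for the loop shown in the traceback and not the ConvLab-2 function:

```python
def delexicalize_da_sketch(da):
    """Hypothetical stand-in: build 'intent+domain+slot' keys from dialogue acts."""
    keys = []
    # Each element of `da` must be a 4-item sequence for this unpacking to work.
    for intent, domain, slot, value in da:
        keys.append('+'.join([intent, domain, slot]))
    return keys


# Structured user action, as an NLU module would produce: this works.
structured = [['Inform', '景点', '门票', '免费'], ['Request', '景点', '名称', '']]
keys = delexicalize_da_sketch(structured)
print(keys)

# Raw utterance string instead of dialogue acts: iterating a str yields
# single characters, so the 4-way unpacking fails on the first character.
raw = "你好,帮我找一个免费的景点"
try:
    delexicalize_da_sketch(raw)
except ValueError as err:
    message = str(err)
    print("ValueError:", message)
```

Under this reading, the MLE policy needs a structured user action in the state, which is what the TRADE tracker does not supply.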

lerry-lee commented 3 years ago

Thanks! So there is no way to use TRADE to build a dialogue system, right?

zqwerty commented 3 years ago

ConvLab-2 supports using TRADE to build a system for MultiWOZ, relying on the rule policy to deduce the user action from the change in belief state.
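
As a rough illustration of the idea of deducing the user action from a belief-state change, the sketch below compares two turns of a simplified belief state and emits an Inform act for every slot whose value is new or changed. The flat `domain -> slot -> value` layout and the function name `deduce_user_inform_acts` are assumptions for illustration; this is not the ConvLab-2 rule policy code, and it covers only Inform acts.

```python
def deduce_user_inform_acts(prev_belief, curr_belief):
    """Return [intent, domain, slot, value] acts for slots filled or changed this turn."""
    acts = []
    for domain, slots in curr_belief.items():
        prev_slots = prev_belief.get(domain, {})
        for slot, value in slots.items():
            # A non-empty value that differs from the previous turn is treated
            # as something the user just informed.
            if value and prev_slots.get(slot) != value:
                acts.append(['Inform', domain, slot, value])
    return acts


# Simplified, made-up belief states for two consecutive turns.
prev_belief = {'Attraction': {'area': '', 'type': ''}}
curr_belief = {'Attraction': {'area': 'centre', 'type': ''}}

acts = deduce_user_inform_acts(prev_belief, curr_belief)
print(acts)
```

A policy that takes the belief state as input can recover this kind of information on its own, whereas a policy that vectorizes `state['user_action']` directly cannot.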

lerry-lee commented 3 years ago

Thank you very much for answering my questions.