thu-coai / CrossWOZ

A Large-Scale Chinese Cross-Domain Task-Oriented Dialogue Dataset
Apache License 2.0
645 stars, 114 forks

The data_load code does not match the data format #10

Closed (renmada closed this issue 4 years ago)

renmada commented 4 years ago

        for d in self.data[data_key]:
            max_sen_len = max(max_sen_len, len(d[0]))
            sen_len.append(len(d[0]))

d = (tokens, tags, intents, da2triples(turn["dialog_act"]), context(list of str))

        if cut_sen_len > 0:
            d[0] = d[0][:cut_sen_len]
            d[1] = d[1][:cut_sen_len]
            d[4] = [' '.join(s.split()[:cut_sen_len]) for s in d[4]]

The data is stored as key-value pairs, so isn't d a key here rather than one of these tuples?

zqwerty commented 4 years ago

You need to run CrossWOZ/convlab2/nlu/jointBERT/crosswoz/preprocess.py first. The dataloader works on the preprocessed data, not on the raw data files.
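To illustrate the mismatch, here is a minimal sketch, not the actual ConvLab-2 code. It assumes the raw data is a dict keyed by dialogue ID and that preprocessing yields a list of per-turn entries laid out as in the snippet quoted above. The names `raw`, `raw_first_item` and `sentence_lengths`, and all sample values, are hypothetical.

```python
# Hypothetical raw shape: a dict keyed by dialogue ID.
raw = {"dialog_1": {"messages": ["..."]}}

# Iterating over a dict yields its keys, so d would be a string here,
# and d[0] would be a single character rather than a token list.
raw_first_item = next(iter(raw))

# Hypothetical preprocessed shape: one entry per turn, laid out as
# [tokens, tags, intents, dialog-act triples, context (list of str)].
processed = [
    [["tok1", "tok2", "tok3"], ["O", "O", "O"], ["intent_a"], [], ["earlier turn text"]],
]

def sentence_lengths(examples, cut_sen_len=0):
    """Mirror the quoted loop: optionally truncate, then collect token counts."""
    lengths = []
    for d in examples:
        if cut_sen_len > 0:
            d[0] = d[0][:cut_sen_len]  # tokens
            d[1] = d[1][:cut_sen_len]  # tags
            # Truncate each context sentence to cut_sen_len whitespace-separated words.
            d[4] = [' '.join(s.split()[:cut_sen_len]) for s in d[4]]
        lengths.append(len(d[0]))
    return lengths

print(type(raw_first_item).__name__)
print(sentence_lengths(processed))
```

With the preprocessed shape, `d[0]` is the token list, so `len(d[0])` is the sentence length the loader expects. Note that the entries must be mutable lists for the in-place truncation to work.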