qichaotang closed this issue 5 years ago
(1/2) To generate the word list in "preprocess/candi_keyword.txt", we first converted all verbs, nouns, and adjectives to their base forms using WordNet, and then removed the words occurring fewer than 11 times. We also removed some words that are unsuitable as targets (e.g. have/haha).
(3) Our model has not been tested on any Chinese dataset yet. In my opinion, conversation quality and keyword selection are more important than the language.
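For anyone trying to reproduce a similar list, the steps in (1/2) can be roughly sketched as below. This is not the repository's actual preprocessing script, just a minimal illustration. The function name `build_candidates`, the input format (utterances as lists of `(word, pos)` pairs with WordNet-style tags `v`/`n`/`a`), and the toy lemmatizer are all assumptions made for the example. With NLTK installed, `WordNetLemmatizer().lemmatize` could probably be passed in as `lemmatize` instead of the toy stand-in.

```python
from collections import Counter

# Hypothetical stand-in for a WordNet lemmatizer so the sketch runs
# without external data; it only knows the few words listed here.
TOY_LEMMAS = {"dogs": "dog", "running": "run", "ran": "run"}

def toy_lemmatize(word, pos):
    """Return the base form of `word` if known, else the word itself."""
    return TOY_LEMMAS.get(word, word)

# WordNet-style tags for verbs, nouns, and adjectives.
CONTENT_POS = {"v", "n", "a"}

def build_candidates(tagged_utterances, lemmatize=toy_lemmatize,
                     min_count=11, blacklist=frozenset({"have", "haha"})):
    """Count base forms of verbs/nouns/adjectives and keep frequent ones.

    Words seen fewer than `min_count` times are dropped, as are words in
    `blacklist` (only the two examples from the thread are listed; the
    full set of manually removed words is not given there).
    """
    counts = Counter()
    for utterance in tagged_utterances:
        for word, pos in utterance:
            if pos in CONTENT_POS:
                counts[lemmatize(word.lower(), pos)] += 1
    return sorted(w for w, c in counts.items()
                  if c >= min_count and w not in blacklist)

if __name__ == "__main__":
    corpus = [[("dogs", "n"), ("ran", "v")]] * 11 + [[("have", "v")]] * 20
    print(build_candidates(corpus))
```

The frequency threshold of 11 and the two blacklisted words come from the reply above; everything else is illustrative.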
Got it, thanks.
Why do all_none_original_no_cands.txt and candi_keyword.txt match up so well? How were these keyword candidates generated? Also, have you tried the model on a Chinese dataset?