lxing532 / Dialogue-Topic-Segmenter

Improving Unsupervised Dialogue Topic Segmentation with Utterance-Pair Coherence Scoring
https://aclanthology.org/2021.sigdial-1.18.pdf
58 stars, 10 forks

Could you please provide Chinese training data samples? #7

Open nihao517 opened 1 year ago

nihao517 commented 1 year ago

Hello, thanks for sharing the code! I am trying to train a Chinese model and have run into a problem. The code shows the English training data files (text.txt, topic.txt, act.txt), but the Chinese dataset (NaturalConv) is not in the same format as the English dataset (DailyDialog), so I don't know how to generate the corresponding Chinese training files (NaturalConv_text.txt, NaturalConv_topic.txt, NaturalConv_act.txt). Could you please provide some Chinese training data samples? Thanks in advance for your reply!