Something's wrong with weibo's preprocessing

RowitZou / CG-nAR

EMNLP-2021 paper: Thinking Clearly, Talking Fast: Concept-Guided Non-Autoregressive Generation for Open-Domain Dialogue Systems.

MIT License


After preprocessing the raw data into JSON, I got some .json files with the following contents:

The problem is that "source_entity", "target_entity", "triples" and "context_keywords_list" are all empty. I wonder why that happened, because persona's preprocessing works fine.
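To quantify the symptom, a small check along these lines could count how many records have those four fields missing or empty. This is only a sketch: it assumes each output file is a JSON list of dict records, which the thread does not confirm, and the helper names `count_empty_fields` and `count_empty_fields_in_file` are hypothetical.

```python
import json

# Field names taken from the issue description above.
FIELDS = ("source_entity", "target_entity", "triples", "context_keywords_list")


def count_empty_fields(records, fields=FIELDS):
    """Return {field: number of records where the field is missing or empty}."""
    counts = {name: 0 for name in fields}
    for record in records:
        for name in fields:
            # Missing keys, None, [] and "" all count as empty here.
            if not record.get(name):
                counts[name] += 1
    return counts


def count_empty_fields_in_file(path):
    """Assumption: the file holds a JSON list of dict records."""
    with open(path, encoding="utf-8") as f:
        return count_empty_fields(json.load(f))
```

If every count equals the number of records, the graph lookup probably never ran on real data, as opposed to a few utterances simply having no matching entities.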

The reason may be that the graph-related data in "graph_data/weibo/" is missing or incorrect. Three files, "vertex.txt", "adj_matrix.txt" and "weibo_graph_embedding.npy", must all be in place before the raw data is preprocessed into JSON. Try running the commands in "weibo_preprocess.sh" one step at a time to make sure all the graph-related data is properly stored in this directory.
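A quick way to confirm the files are in place before rerunning the JSON step is a check like the one below. It is a minimal sketch, not part of the repository: the helper name `missing_graph_files` is hypothetical, and treating a zero-byte file as missing is an assumption about what "properly stored" means.

```python
from pathlib import Path

# The three files named above as required in graph_data/weibo/.
REQUIRED_GRAPH_FILES = ("vertex.txt", "adj_matrix.txt", "weibo_graph_embedding.npy")


def missing_graph_files(graph_dir="graph_data/weibo"):
    """Return the required file names that are absent or empty in graph_dir."""
    root = Path(graph_dir)
    missing = []
    for name in REQUIRED_GRAPH_FILES:
        path = root / name
        # Assumption: a zero-byte file is as bad as no file at all.
        if not path.is_file() or path.stat().st_size == 0:
            missing.append(name)
    return missing


if __name__ == "__main__":
    missing = missing_graph_files()
    if missing:
        print("Missing or empty graph files:", ", ".join(missing))
    else:
        print("All graph files are present.")
```

Run it from the repository root. If any file is reported, rerun the matching step of "weibo_preprocess.sh" before regenerating the JSON.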


Something's wrong with weibo's preprocessing #9