Which dataset did you use for personachat in your paper?

hengyicai / Adaptive_Multi-curricula_Learning_for_Dialog

The codebase for "Learning from Easy to Complex: Adaptive Multi-curricula Learning for Neural Dialogue Generation" (Cai et al., AAAI 2020)

MIT License


Which dataset did you use for personachat in your paper? #1

Closed: LuckyVicky001 closed this issue 3 years ago

LuckyVicky001 commented 3 years ago

Hi, I downloaded your data and found four datasets: "data_filtering_data" (DailyDialog), "OpenSubtitiles2018_history3_sparse_small" (OpenSubtitles), "personachat_history3" and "personachat_history3_sparse". Two of them are named "personachat", so which one did you use in your paper? Btw, what does "sparse" mean here? Thx

hengyicai commented 3 years ago

Hi, we used personachat_history3 for the personachat dataset. "sparse" here refers to a data processing option. More specifically, we slide a window over the original multi-turn dialogues to create training samples with at most max-history-length history turns, and in the "sparse" version the history turns of different samples do not overlap.
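To make the difference concrete, here is a minimal sketch of the two windowing modes. It is not the authors' preprocessing code: the function name `build_samples`, the default `max_history=3` (guessed from the `history3` suffix), and the exact way the sparse window advances are all assumptions made for illustration.

```python
from typing import List, Tuple

Sample = Tuple[List[str], str]  # (history turns, response)


def build_samples(turns: List[str], max_history: int = 3,
                  sparse: bool = False) -> List[Sample]:
    """Hypothetical helper: cut one dialogue into (history, response) samples."""
    samples: List[Sample] = []
    if sparse:
        # Assumed sparse mode: jump the window so that the history turns
        # of consecutive samples never overlap.
        start = 0
        while start < len(turns) - 1:
            end = min(start + max_history, len(turns) - 1)
            samples.append((turns[start:end], turns[end]))
            start = end
    else:
        # Dense mode: every turn after the first is a response, and its
        # history is the preceding max_history turns (overlapping windows).
        for i in range(1, len(turns)):
            samples.append((turns[max(0, i - max_history):i], turns[i]))
    return samples
```

Under these assumptions, a six-turn dialogue yields five dense samples but only two sparse ones, which would explain why a "sparse" file is smaller than its dense counterpart.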

LuckyVicky001 commented 3 years ago

Thx for your reply! I have two more questions about the data:

1. How did you calculate "Continuity"? Your train.txt looks like "context \t response", and it does not contain the "subsequent utterance u" described in your paper. Does that mean the next utterance in the original dataset (I mean, before you use max-history-length to cut a conversation session)? Maybe it would be better to share the script you used to calculate these scores.
2. In the dataset, each line starts with a "1". What does it mean? I also found that you did not discard it from the context. Thx again!

hengyicai commented 3 years ago

===>Q1 Yes, we use the next utterance in the original dataset to compute "Continuity". The scripts for building the curricula are now uploaded (see projects/adaptive_learning/{scripts, shell/build_data.sh} for details).
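As a rough illustration of what "the next utterance in the original dataset" means, the sketch below attaches the following turn to each (history, response) sample before any windowing information is lost. The helper name `samples_with_next` and the `max_history=3` default are hypothetical, and no scoring is shown; the actual "Continuity" computation lives in the scripts referenced above.

```python
from typing import List, Optional, Tuple

Row = Tuple[List[str], str, Optional[str]]  # (history, response, next utterance)


def samples_with_next(turns: List[str], max_history: int = 3) -> List[Row]:
    """Hypothetical helper: pair each response with its subsequent utterance."""
    rows: List[Row] = []
    for i in range(1, len(turns)):
        history = turns[max(0, i - max_history):i]
        # The last turn of a dialogue has no subsequent utterance.
        next_utt = turns[i + 1] if i + 1 < len(turns) else None
        rows.append((history, turns[i], next_utt))
    return rows
```

Samples whose response is the final turn of a session get `None` here; how such cases are treated when scoring is not stated in this thread.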

===>Q2 This is a data format requirement of FbDialogTeacher in ParlAI. Please refer to the class FbDialogTeacher in parlai/core/teachers.py for details.
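For readers unfamiliar with that format, here is a small sketch of how such a line can be read. In the FB Dialog format each line begins with an index followed by a space, the remaining fields are tab-separated (text first, then labels), and an index of 1 marks the start of a new episode. The parser below is an illustrative stand-alone function, not ParlAI's own code, and `parse_fbdialog_line` is a hypothetical name.

```python
from typing import List, Tuple


def parse_fbdialog_line(line: str) -> Tuple[int, str, List[str]]:
    """Hypothetical parser: split one FB Dialog line into (index, text, labels)."""
    line = line.rstrip("\n")
    # The leading number is the position of the line within its episode.
    index_str, _, rest = line.partition(" ")
    fields = rest.split("\t")
    text = fields[0]
    # Multiple labels, if any, are assumed to be separated by "|".
    labels = fields[1].split("|") if len(fields) > 1 and fields[1] else []
    return int(index_str), text, labels
```

Read this way, a file in which every line starts with "1" is a sequence of single-turn episodes, so the "1" is an index and not part of the context text.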