LuckyVicky001 closed 3 years ago
Hi, it's personachat_history3 for the dataset personachat. "sparse" here corresponds to the data processing options. More specifically, we use sliding windows over the original multi-turn dialogues to create training samples with max-history-length, and samples of the "sparse" version share no overlap regarding the history turns.
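To illustrate the difference, here is a minimal sketch of how dense and "sparse" sliding windows might be built. It is not the repository's actual script: the function name `build_samples`, the default history length of 3, and the stride used for the sparse case are all assumptions made for this example.

```python
from typing import List, Tuple


def build_samples(
    dialogue: List[str], max_history: int = 3, sparse: bool = False
) -> List[Tuple[List[str], str]]:
    """Return (history, response) pairs cut from one multi-turn dialogue."""
    # Assumption: the dense version slides one turn at a time, while the
    # sparse version jumps by max_history turns so no history turn is reused.
    step = max_history if sparse else 1
    samples = []
    for start in range(0, len(dialogue) - max_history, step):
        history = dialogue[start:start + max_history]
        response = dialogue[start + max_history]
        samples.append((history, response))
    return samples


turns = [f"u{i}" for i in range(7)]
dense = build_samples(turns, max_history=3, sparse=False)
sparse = build_samples(turns, max_history=3, sparse=True)
print(len(dense), len(sparse))
```

Under these assumptions, the dense version reuses most history turns across neighbouring samples, while the sparse version yields fewer samples whose histories are disjoint.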
Thx for your reply! I have two new questions about the data:
===>Q1
Yes, we use the next utterance in the original dataset to compute "Continuity". Scripts for building curricula are uploaded now (see projects/adaptive_learning/{scripts, shell/build_data.sh} for details).
===>Q2
This is the data format requirement for FbDialogTeacher in ParlAI; please refer to the class FbDialogTeacher in parlai/core/teachers.py for details.
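For readers unfamiliar with that format, the sketch below writes one line the way I understand the FbDialog text format from ParlAI's documentation: a turn index that restarts at 1 for each episode, a space, the text, then optional tab-separated label, reward, and `|`-separated candidates. The helper `to_fbdialog_line` is hypothetical and not part of ParlAI; check parlai/core/teachers.py for the authoritative rules.

```python
from typing import List, Optional


def to_fbdialog_line(
    index: int,
    text: str,
    label: Optional[str] = None,
    reward: str = "",
    candidates: Optional[List[str]] = None,
) -> str:
    """Format one dialogue turn as a tab-separated FbDialog-style line."""
    # The index and the text are separated by a single space.
    fields = [f"{index} {text}"]
    if label is not None or candidates:
        fields.append(label or "")
    if candidates:
        # The reward column must be present (possibly empty) before candidates.
        fields.append(reward)
        fields.append("|".join(candidates))
    return "\t".join(fields)


print(to_fbdialog_line(1, "hello , how are you ?", "fine , thanks ."))
```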
Hi, I downloaded your dataset and found that it contains four datasets: "data_filtering_data" (DailyDialog), "OpenSubtitiles2018_history3_sparse_small" (OpenSubtitles), "personachat_history3" and "personachat_history3_sparse". Two of them are named "personachat", so which one did you use in your paper? Btw, what does "sparse" mean here? Thx