Open Rabona17 opened 3 years ago
I don't have the SpaceFusion pre-training code. On the DailyDialog dataset, we keep a dialogue history of fixed sequence length. We tried to follow the setting of the original paper:
Thanks. Where can I get the DailyDialog dataset you used in run_dialog_spacefusion.sh (../data/datasets/dailydialog_data/train.txt)? Or should I preprocess it myself?
I'm afraid you have to pre-process it on your own.
Sure. Since SpaceFusion doesn't provide any preprocessing code for DailyDialog, what criteria did you use for src and trgt, i.e. what procedure did you use to split the original DailyDialog into src and trgt? Thanks in advance!
Where can I get the preprocessed dailydialog dataset used for spacefusion pretraining code? Any suggestion on how to preprocess the original dailydialog would be appreciated! Thanks
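For anyone landing here with the same question, below is a minimal sketch of one way to turn the raw DailyDialog file into src/trgt pairs. It is not the maintainers' script. It assumes the public DailyDialog release format (one dialogue per line, utterances separated by `__eou__`), treats every utterance after the first as a target with the preceding turns as source, and interprets "history of a fixed sequence length" as keeping only the last N whitespace tokens of the history. The function names, the ` EOS ` turn separator, the default of 64 tokens, and the tab-separated output are all assumptions made for illustration, so check them against what your training script actually reads.

```python
from typing import Iterator, List, Tuple

EOU = "__eou__"  # utterance delimiter used in the raw DailyDialog text files


def split_dialogue(line: str) -> List[str]:
    """Split one raw DailyDialog line into a list of non-empty utterances."""
    return [u.strip() for u in line.split(EOU) if u.strip()]


def make_pairs(
    utterances: List[str],
    max_history_tokens: int = 64,   # assumed value, not taken from the thread
    turn_sep: str = " EOS ",        # assumed separator between history turns
) -> Iterator[Tuple[str, str]]:
    """Yield (src, trgt) pairs: each utterance after the first is a target,
    and the source is the preceding history cut to the last N tokens."""
    for i in range(1, len(utterances)):
        history_tokens = turn_sep.join(utterances[:i]).split()
        src = " ".join(history_tokens[-max_history_tokens:])
        yield src, utterances[i]


def to_tsv_lines(raw_lines: List[str], max_history_tokens: int = 64) -> List[str]:
    """Format all pairs as 'src<TAB>trgt' lines (assumed output layout)."""
    out = []
    for line in raw_lines:
        for src, tgt in make_pairs(split_dialogue(line), max_history_tokens):
            out.append(f"{src}\t{tgt}")
    return out


# Tiny inline example instead of reading a real file.
example = "Hi . __eou__ Hello , how are you ? __eou__ Fine . __eou__"
example_pairs = list(make_pairs(split_dialogue(example), max_history_tokens=4))
```

With `max_history_tokens=4`, the second pair keeps only the last four tokens of the history, so older turns drop out first. To build train.txt you would read the raw dialogue file line by line, pass the lines to `to_tsv_lines`, and write the result out.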