Data Cleaning
DailyDialog
DailyDialog has been reformatted so that each turn starts on a new line with a speaker label (A or B), as discussed in meeting #42:
A: in my wedding ceremony, where do my parents sit in the church?
B: the bride's parents ' seating arrangement is on the left side of the aisle and the groom's parents is on the right side.
A: do friends of the bride always sit on one side of the church and friends of the groom on the other?
The human completions have also been given a speaker label, which depends on the last speaker in the original conversation (source column) for that row:
B: they usually do.
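The labelling rule can be sketched as follows. This is only an illustration, not the code used for the actual cleaning: it assumes each conversation is already split into a list of utterances and that speaker A always opens the conversation, and the function names `label_turns` and `label_completion` are hypothetical.

```python
LABELS = ("A", "B")


def label_turns(utterances):
    """Prefix alternating speaker labels and put each turn on its own line.

    Assumption: speaker A always speaks first.
    """
    return "\n".join(
        f"{LABELS[i % 2]}: {utterance.strip()}"
        for i, utterance in enumerate(utterances)
    )


def label_completion(num_source_turns, completion):
    """Label the human completion with the speaker who follows the last source turn."""
    return f"{LABELS[num_source_turns % 2]}: {completion.strip()}"
```

With the three-turn source above, the last speaker is A, so `label_completion(3, "they usually do.")` gives the completion the label B.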
Inspection of other datasets
Other datasets have been inspected, and the following problems will be fixed shortly:
stories: <newlines> will be removed, as mentioned in #32 (not urgent, since the source column is fine and that is what the models depend on)
dailymail_cnn: may be cleaned further, as mentioned in #44
(Point 2 is also related to #11.)
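The planned removal of newline markers from stories could look roughly like the sketch below. The exact marker text is an assumption: the default `"<newline>"` is a guess and is passed in as a parameter so it can be adjusted, and the function name `strip_newline_tokens` is hypothetical. The sketch replaces each marker with a space and then collapses repeated whitespace.

```python
import re


def strip_newline_tokens(text, token="<newline>"):
    """Remove literal newline markers and collapse the whitespace left behind.

    Assumption: the marker appears verbatim as `token` in the text.
    """
    without_tokens = text.replace(token, " ")
    return re.sub(r"\s+", " ", without_tokens).strip()
```

Replacing the marker with a space rather than an empty string avoids gluing together words that were only separated by the marker.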