I have read the ED article, which reports 19533 / 2770 / 2547 conversations for train / valid / test. However, the train file is 16.9M, while the valid and test files are about 36M. What causes this mismatch between the number of conversations and the file sizes? Any response would help me a lot.