Hi Alex, thanks for making the dataset public. I have a question. As you mentioned in your answer to another issue, the number of source documents varies across examples in the corpus, so I would like to know the details of the experiment. How many documents do you use per example during training?
We use all the documents, but truncate to 500 tokens by taking the first tokens from each document. More details are in Section 6.3 of the paper.
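For readers who want something concrete, here is a minimal sketch of one possible reading of that truncation, not the authors' actual preprocessing code. It assumes the 500 tokens are a total budget split evenly across the source documents, and that the documents are already tokenized. The function name `truncate_documents` and the parameter `budget` are hypothetical; Section 6.3 of the paper remains the authoritative description.

```python
from typing import List


def truncate_documents(documents: List[List[str]], budget: int = 500) -> List[str]:
    """Keep the leading tokens of every document under a shared token budget.

    Assumption: the budget is divided evenly, so each of the N documents
    contributes at most budget // N of its first tokens.
    """
    if not documents:
        return []
    per_doc = budget // len(documents)  # equal share per document (assumed)
    truncated: List[str] = []
    for tokens in documents:
        # Take only the first tokens of this document.
        truncated.extend(tokens[:per_doc])
    return truncated


# Two already-tokenized documents of 400 tokens each.
docs = [["a"] * 400, ["b"] * 400]
model_input = truncate_documents(docs)
```

With two documents, each contributes its first 250 tokens. Documents shorter than their share leave part of the budget unused in this sketch; redistributing that leftover to longer documents would be a natural refinement, but whether the original experiments do so is not stated in this thread.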