IBM / multidoc2dial

MultiDoc2Dial: Modeling Dialogues Grounded in Multiple Documents
Apache License 2.0

Sharing unseen-domain data #13

Open YiweiJiang2015 opened 2 years ago

YiweiJiang2015 commented 2 years ago

Thanks for sharing the test data, which includes the seen-domain data. Is there any plan to release the unseen-domain part as well?

songfeng commented 2 years ago

@sivasankalpp may be the better person to answer this question. If you want to test on an unseen domain, you can do so by removing the target domain's data from the training data.
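
A rough sketch of that hold-out idea, not the repository's own tooling: it assumes the dialogues have already been loaded into a flat list of dicts, each carrying a `domain` field. The function name `hold_out_domain`, the `domain_key` parameter, and the toy records are hypothetical, so the keys would need adjusting to match the actual JSON layout.

```python
from typing import Dict, List, Tuple


def hold_out_domain(
    dialogues: List[Dict],
    target_domain: str,
    domain_key: str = "domain",  # assumed field name, not confirmed by the repo
) -> Tuple[List[Dict], List[Dict]]:
    """Split dialogues into (training set without target, held-out target set)."""
    train = [d for d in dialogues if d.get(domain_key) != target_domain]
    held_out = [d for d in dialogues if d.get(domain_key) == target_domain]
    return train, held_out


if __name__ == "__main__":
    # Toy records for illustration only; real data has many more fields.
    toy = [
        {"dial_id": "a", "domain": "ssa"},
        {"dial_id": "b", "domain": "dmv"},
        {"dial_id": "c", "domain": "va"},
        {"dial_id": "d", "domain": "dmv"},
    ]
    train, unseen = hold_out_domain(toy, "dmv")
    print(len(train), "training dialogues;", len(unseen), "held out as unseen")
```

Training on `train` and evaluating on `unseen` would approximate an unseen-domain setting. For the domain to be truly unseen, its documents would also need to be excluded from anything the model is trained or fine-tuned on.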

YiweiJiang2015 commented 2 years ago

Hey Song, thanks for forwarding. By unseen-domain data, I mean the COVID-related documents and conversations. Adding them would make the dataset more complete.