Closed: wutong4012 closed this issue 2 years ago
Hello, are you looking for the dataset? You can clone this project and use its script to download it:
git clone https://github.com/IBM/multidoc2dial.git
cd multidoc2dial/scripts
./run_download.sh
Alternatively, I can send it to you for free; I downloaded it just last night. If you need it, let me know and provide your email address.
I found some noise in this dataset, which could be the reason they removed the download option. The document ids do not uniquely identify content: two docs with different ids may have the same content, which is probably caused by their crawler program.
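If you want to check for this kind of duplication yourself, here is a minimal sketch in Python. It assumes you have already loaded the documents into a plain dict mapping each doc id to its text; how you build that dict depends on the layout of the downloaded files, which is not shown here. The function name find_duplicate_docs and the toy ids below are hypothetical, used only for illustration.

```python
import hashlib
from collections import defaultdict


def find_duplicate_docs(docs):
    """Group doc ids whose text is identical after whitespace normalization.

    `docs` is assumed to be a dict mapping doc id -> document text.
    Returns a list of id groups; each group has two or more ids
    that share the same content.
    """
    groups = defaultdict(list)
    for doc_id, text in docs.items():
        # Collapse runs of whitespace so trivial formatting differences
        # introduced by a crawler do not hide a duplicate.
        normalized = " ".join(text.split())
        digest = hashlib.sha256(normalized.encode("utf-8")).hexdigest()
        groups[digest].append(doc_id)
    return [sorted(ids) for ids in groups.values() if len(ids) > 1]


if __name__ == "__main__":
    # Toy data only; these ids and texts are not from the real dataset.
    sample = {
        "doc-a": "How to renew   your license",
        "doc-b": "How to renew your license",
        "doc-c": "Applying for benefits",
    }
    for ids in find_duplicate_docs(sample):
        print("same content:", ", ".join(ids))
```

This only catches documents whose text matches exactly after whitespace normalization. Near-duplicates, for example pages that differ by a navigation banner, would need a fuzzier comparison.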
Thank you very much for your help, my email is wt1102310705@gmail.com. Regarding the problem of data noise, I will study it in detail later.
Sorry, the website was temporarily down. Now it is up again.
You can also find the dataset at https://github.com/doc2dial/multidoc2dial/tree/main/file .
It seems that the link returns a 404, so I cannot download the data. Where can I download it?