rickiepark closed this 5 years ago
Did you attempt to go to the link in your browser? I'm talking with that team to try to make sure it's downloadable in the browser in the future. For now that page should give you the download instructions:
Dataset: NewsQA
Note that the dataset download link cannot be used directly in a browser
How do I use a download link for an entire dataset?
A download link for an entire dataset provides the location of the dataset in Azure as well as a special time-limited key that allows you to download the entire dataset. This link can be used with tools that can copy files from Azure, like the following:
AzCopy - a command-line tool for Windows or Linux that copies files to and from Azure.
Azure Storage Explorer - a utility that is used to manage Azure storage.
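Because the key is time-limited, an expired link is one possible cause of download errors. The sketch below assumes the link is a standard Azure shared access signature (SAS) URL, where the `se` query parameter carries the expiry time; whether this dataset's link follows that layout is an assumption. `sas_expiry` is a hypothetical helper, and the example URL is made up.

```shell
# Hypothetical helper: print the expiry time (the `se` query parameter) of a SAS-style URL.
sas_expiry() {
  # Drop everything up to the first '?', split the query string on '&',
  # keep the value of the se= field, and decode %3A back to ':'.
  printf '%s\n' "${1#*\?}" | tr '&' '\n' | sed -n 's/^se=//p' | sed 's/%3[Aa]/:/g'
}

# Example with a made-up URL (account, container, and signature are placeholders).
sas_expiry 'https://example.blob.core.windows.net/container?sv=2019-02-02&se=2020-01-31T12%3A00%3A00Z&sig=PLACEHOLDER'
```

If the printed time is already in the past, requesting a fresh download link from the dataset page is probably the first thing to try.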
<Then you get a URL for the dataset which will not work in the browser>
For now, to get the dataset, get AzCopy and run:
azcopy cp --recursive "<put that URL from before here>" downloaded_newsqa
cp downloaded_newsqa/newsqa/newsqa-data-v1.csv ~/workspace/newsqa/maluuba/newsqa
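If the copy step fails, it may help to check that the CSV actually landed where expected before copying it. Here is a minimal sketch, assuming the download produces the same `newsqa/newsqa-data-v1.csv` layout as in the commands above; `install_newsqa_csv` is a hypothetical helper name, not part of the repository.

```shell
# Hypothetical helper: copy newsqa-data-v1.csv from a download directory into a target directory.
# $1 = download directory (e.g. downloaded_newsqa), $2 = destination directory.
install_newsqa_csv() {
  if [ ! -f "$1/newsqa/newsqa-data-v1.csv" ]; then
    # Fail loudly instead of letting cp print a less specific error.
    echo "missing: $1/newsqa/newsqa-data-v1.csv" >&2
    return 1
  fi
  # Create the destination if needed, then copy the file into it.
  mkdir -p "$2" && cp "$1/newsqa/newsqa-data-v1.csv" "$2/"
}

# Usage, with the paths from the commands above:
#   install_newsqa_csv downloaded_newsqa ~/workspace/newsqa/maluuba/newsqa
```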
The rest of the setup instructions should work (mostly) fine, except when using Docker. I'll update the setup instructions if the dataset download doesn't get changed soon. If you're using the Docker container:
# Note: this passes an explicit command so that the default command does not run; the current default command would delete newsqa-data-v1.csv.
docker run --rm -it -v ${PWD}:/usr/src/newsqa --name newsqa maluuba/newsqa /bin/bash --login -c "cp --no-clobber /usr/downloads/* maluuba/newsqa/ && python -m unittest discover ."
Thank you @juharris,
I misunderstood the instructions. :( I'll try again with azcopy.
Thank you so much for the quick response. 👍
No worries. I actually did the same thing and ignored their instructions the first time too =)
I successfully downloaded the dataset on an Azure VM. Thank you so much. :)
The download link for the entire dataset at https://msropendata.com/datasets/939b1042-6402-4697-9c15-7a28de7e1321 returns an error message like the one below. Please help me...
AuthenticationFailed