google-research-datasets / wikifact

Wikipedia based dataset to train relationship classifiers and fact extraction models

Can't download the data for fact extraction #1

Open emayecs opened 3 years ago

emayecs commented 3 years ago

When I run the commands to download the datasets for fact extraction, I get a 404 error.

Command: wget -c https://storage.googleapis.com/gresearch/wikifact_ds/fact_extraction_paragraph/fact_extraction_paragraphdev-00000-of-00001

Error:
--2021-07-13 12:26:47-- https://storage.googleapis.com/gresearch/wikifact_ds/fact_extraction_paragraph/fact_extraction_paragraphdev-00000-of-00001
Resolving storage.googleapis.com (storage.googleapis.com)... 172.217.12.144, 172.217.11.16, 172.217.165.144, ...
Connecting to storage.googleapis.com (storage.googleapis.com)|172.217.12.144|:443... connected.
HTTP request sent, awaiting response... 404 Not Found
2021-07-13 12:26:47 ERROR 404: Not Found.

I get the same error when downloading the fact extraction datasets for both sentences and paragraphs. However, all the files for the classifier and the subword text encoder download successfully.

guidovranken commented 2 years ago

As a workaround you can install gsutil (https://cloud.google.com/storage/docs/gsutil_install) and use it to download the files, e.g.:

gsutil cp gs://gresearch/wikifact_ds/fact_extraction_paragraph/* .
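
If the 404 comes from a mismatch between the documented URL and the actual object name (an assumption; the thread does not confirm the cause), one way to check is to list the real object names first and then fetch exactly those names over HTTPS. The sketch below assumes the objects are publicly readable at https://storage.googleapis.com/BUCKET/OBJECT and that their names need no URL encoding. gs_to_https and download_listed are hypothetical helper names, not part of gsutil or wget.

```shell
#!/usr/bin/env bash
# Sketch only: helper names below are illustrative, not part of gsutil or wget.

# Map a gs://BUCKET/OBJECT URI to its HTTPS form.
# Assumes the object is publicly readable and its name needs no URL encoding.
gs_to_https() {
  # Strip the gs:// scheme; what remains is BUCKET/OBJECT.
  printf 'https://storage.googleapis.com/%s\n' "${1#gs://}"
}

# Read gs:// URIs on stdin, one per line, and fetch each with wget -c
# (the same resumable download flag used in the original command).
download_listed() {
  while IFS= read -r uri; do
    # Skip blank lines in the listing.
    [ -n "$uri" ] || continue
    wget -c "$(gs_to_https "$uri")"
  done
}
```

With these defined, a listing can be piped straight into the downloader: gsutil ls gs://gresearch/wikifact_ds/fact_extraction_paragraph/ | download_listed. Comparing the listed names against the URL in the failing command should also show whether the documented file name is simply wrong.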

shailender1 commented 2 years ago

I can't download the dataset with gsutil; it keeps retrying. Any other suggestions? Thanks.