Closed (cathyxl closed this issue 2 years ago)
Hi @cathyxl, thanks for reporting.
Indeed, we have recently updated the loading script of that dataset (and fixed that bug as well):
That fix will be available in our next datasets library release. In the meantime, you can incorporate that fix by installing datasets from our GitHub repo and then reloading the dataset with a forced redownload:
pip install git+https://github.com/huggingface/datasets#egg=datasets
from datasets import load_dataset
ds = load_dataset('conll2012_ontonotesv5', 'english_v4', split="test", download_mode="force_redownload")
Feel free to re-open this issue if the problem persists.
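If you would rather not force a redownload on every run, the workaround above can be wrapped so that the redownload only happens after a checksum failure. This is a minimal sketch, not part of the datasets library: the helper name load_with_forced_redownload is hypothetical, and it matches the error by class name so the snippet does not depend on where NonMatchingChecksumError lives in a given datasets version. It only helps once the fixed loading script is actually in use (for example after installing from the GitHub repo).

```python
from typing import Any, Callable


def load_with_forced_redownload(loader: Callable[..., Any], *args: Any, **kwargs: Any) -> Any:
    """Call `loader`; on a checksum mismatch, retry once with a forced redownload.

    `loader` is expected to accept a `download_mode` keyword, as
    `datasets.load_dataset` does. This helper is a hypothetical convenience
    wrapper, not an API of the datasets library.
    """
    try:
        return loader(*args, **kwargs)
    except Exception as err:
        # Match by class name so this sketch does not need to import the
        # exception from a version-specific module path.
        if type(err).__name__ != "NonMatchingChecksumError":
            raise
        # Override any caller-supplied download_mode for the retry.
        retry_kwargs = {**kwargs, "download_mode": "force_redownload"}
        return loader(*args, **retry_kwargs)


# Intended usage (requires the datasets library and network access):
#   from datasets import load_dataset
#   ds = load_with_forced_redownload(
#       load_dataset, "conll2012_ontonotesv5", "english_v4", split="test"
#   )
```

Any other exception is re-raised unchanged, so unrelated failures (network errors, a wrong config name) are not hidden by the retry.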
Describe the bug
Cannot load the dataset conll2012_ontonotesv5
Steps to reproduce the bug
Expected results
The dataset should be downloaded successfully.
Actual results
raise NonMatchingChecksumError(error_msg + str(bad_urls))
datasets.utils.info_utils.NonMatchingChecksumError: Checksums didn't match for dataset source files: ['https://md-datasets-cache-zipfiles-prod.s3.eu-west-1.amazonaws.com/zmycy7t9h9-1.zip']
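For context, this error means that the checksum of a downloaded source file differs from the checksum recorded for that URL in the dataset's metadata, which typically happens when the hosted file changes or a stale copy sits in the cache. The snippet below is a rough illustration of that kind of check using only the standard library. It assumes SHA-256 checksums, the function names are hypothetical, and it is not the datasets library's own implementation.

```python
import hashlib
from pathlib import Path
from typing import Dict, List, Union


def sha256_of_file(path: Union[str, Path], chunk_size: int = 1 << 20) -> str:
    """Return the hex SHA-256 digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def find_mismatched_urls(
    expected: Dict[str, str],
    downloaded: Dict[str, Union[str, Path]],
) -> List[str]:
    """List the URLs whose downloaded file does not match the recorded checksum.

    `expected` maps URL -> recorded hex digest; `downloaded` maps URL -> local
    file path. A URL with no recorded checksum is reported as a mismatch.
    """
    return [
        url
        for url, path in downloaded.items()
        if sha256_of_file(path) != expected.get(url)
    ]
```

A non-empty result from find_mismatched_urls corresponds to the list of URLs printed in the error message above.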
Environment info
datasets version: 2.0.0