huggingface / course

The Hugging Face course on Transformers
https://huggingface.co/course
Apache License 2.0

Chapter 5:4, Big Data download issue #606

Open antoineross opened 1 year ago

antoineross commented 1 year ago

We get the error "DatasetGenerationError: An error occurred while generating the dataset" when running the following code:

from datasets import load_dataset

# This takes a few minutes to run, so go grab a tea or coffee while you wait :)
data_files = "https://the-eye.eu/public/AI/pile_preliminary_components/PUBMED_title_abstracts_2019_baseline.jsonl.zst"
pubmed_dataset = load_dataset("json", data_files=data_files, split="train")
pubmed_dataset

thomasshin commented 12 months ago

I could not find the Pile dataset.
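
A quick way to tell whether the failure comes from the download URL rather than from the loader itself is to probe the URL before calling load_dataset. The following is only a minimal sketch using the Python standard library; `probe_url` and its `opener` parameter are hypothetical names introduced here for illustration, and it assumes the server answers HEAD requests.

```python
import urllib.error
import urllib.request


def probe_url(url, opener=urllib.request.urlopen, timeout=10):
    """Send a HEAD request to `url` and return (reachable, detail).

    `probe_url` and `opener` are illustrative names, not part of the
    datasets library. `opener` exists so the network call can be swapped out.
    """
    request = urllib.request.Request(url, method="HEAD")
    try:
        with opener(request, timeout=timeout) as response:
            return True, f"HTTP {response.status}"
    except urllib.error.HTTPError as err:
        # The server answered, but with an error status (e.g. 404).
        return False, f"HTTP {err.code}"
    except urllib.error.URLError as err:
        # DNS failure, refused connection, TLS problem, and similar.
        return False, f"unreachable: {err.reason}"
```

Calling `probe_url(data_files)` with the URL from the snippet in the original report returns a pair such as `(False, "HTTP 404")` if the file has been moved or removed, which would suggest the error is caused by the missing file and not by the JSON loader.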