Open cyanic-selkie opened 1 year ago
Hi! The link pointing to the code that generated the dataset is broken. Can you please fix it to make debugging easier?
Sorry about that, it's fixed now.
@cyanic-selkie could you explain how you fixed it? I hit the same error when loading other datasets. Is it due to the library versions in the environment?
@MingsYang I never fixed it. If you're referring to my comment above, I only meant I fixed the link to my code.
Anyway, I managed to work around the issue by using streaming when loading the dataset.
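For anyone landing here, the streaming workaround could look roughly like the sketch below. It relies on the `streaming=True` argument of `datasets.load_dataset`, which yields examples lazily instead of preparing the whole split up front. The helper name `stream_first_examples` and its `load_fn` parameter are hypothetical (the latter only exists so the helper can be tried offline), and the repository ID in the usage note is an assumption, not something confirmed in this thread.

```python
from itertools import islice


def stream_first_examples(repo_id, split="train", n=3, load_fn=None):
    """Return the first n examples of a split without materializing it.

    `load_fn` is a hypothetical hook for substituting a stand-in loader;
    by default the real `datasets.load_dataset` is used.
    """
    if load_fn is None:
        # Imported lazily so the helper can be defined without `datasets` installed.
        from datasets import load_dataset
        load_fn = load_dataset
    # streaming=True makes the loader return an iterable that yields examples lazily.
    stream = load_fn(repo_id, split=split, streaming=True)
    # Take only the first n examples from the stream.
    return list(islice(stream, n))
```

Calling `stream_first_examples("cyanic-selkie/wikianc-en")` (assumed repository ID) would then fetch a few examples from the `train` split, which is a quick way to check whether streaming avoids the error in your environment.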
@cyanic-selkie Ah, I get it. I just tried a newer Python environment, and it no longer shows any errors.
Upgrading pyarrow to the latest version solved this problem in my case.
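Since the error seems to depend on the environment, it may help to note which versions of `datasets` and `pyarrow` are installed before and after upgrading. A minimal sketch using only the standard library follows; the helper name `installed_versions` is hypothetical, and no specific fixed version is implied.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_versions(packages=("datasets", "pyarrow")):
    """Map each package name to its installed version, or None if absent."""
    found = {}
    for name in packages:
        try:
            found[name] = version(name)
        except PackageNotFoundError:
            # The package is not installed in this environment.
            found[name] = None
    return found


print(installed_versions())
```

Pasting the printed dictionary into a report makes it easier to compare a failing environment with a working one.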
Describe the bug
When loading the dataset wikianc-en, which I created using this code, I get the following error:
This only happens when I load the train split, indicating that the size of the dataset is the deciding factor.
Steps to reproduce the bug
Expected behavior
The dataset should load normally without any errors.
Environment info
datasets version: 2.10.1