Open eor51355 opened 1 month ago
Thank you, I will try this out!
On Tue, Jun 11, 2024 at 3:55 AM Vincent Lau @.***> wrote:
I added streaming=True after the name of the dataset, and it works. Hope it helps you.
Also, if you install datasets==2.15.0, this bug does not happen. I don't know why, but everything works with that version.
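The streaming workaround described above could be sketched roughly as follows. This is only an illustration under stated assumptions, not a confirmed fix: `load_with_streaming_fallback` and its `loader` parameter are hypothetical names introduced here, and only `datasets.load_dataset` and its `streaming` argument come from the library itself.

```python
def load_with_streaming_fallback(name, loader=None, **kwargs):
    """Try a regular load first; retry with streaming=True on ValueError.

    `loader` is a hypothetical hook (not part of the datasets API) that
    defaults to datasets.load_dataset.
    """
    if loader is None:
        # Imported lazily so the helper can also run with a stand-in loader.
        from datasets import load_dataset as loader
    try:
        return loader(name, **kwargs)
    except ValueError as err:
        # For example: ValueError: Instruction "train" corresponds to no data!
        print(f"Regular load failed ({err}); retrying with streaming=True")
        return loader(name, **{**kwargs, "streaming": True})


# Possible usage (needs network access and the datasets package):
# ds = load_with_streaming_fallback("irc_disentangle")
# first_example = next(iter(ds["train"]))
```

Note that a streamed split is iterated rather than indexed, so an access such as ds['train'][1050] would need to be replaced by iterating up to that position.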
I have since found that there is still a strange bug in v2.15.0 of datasets. It seems that the *.arrow file cannot be created. It may be an index of the subsets. I am still trying to debug it. For now, one of the most efficient workarounds may be to use Google Colab to build this index in ~/huggingface/datasets, and then download the files to replace the local ones... lol... it works!
Yeah, I did try what you suggested and it didn't work. I was able to get a local copy from someone who had accessed the dataset in the past. Let me know when you end up fixing this bug.
Describe the bug
I am trying to access your dataset through Python using datasets.load_dataset("irc_disentangle"), and I am getting this error message:
ValueError: Instruction "train" corresponds to no data!
Steps to reproduce the bug
import datasets
ds = datasets.load_dataset('irc_disentangle')
ds
Expected behavior
The data is supposed to load into ds and be accessible as such: ds['train'][1050], ds['train'][1055]
Environment info
I tried Python 3.12 and 3.10.
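Since the comments in this thread suggest the installed datasets version matters, it may help to list package versions alongside the Python version. The snippet below is a small sketch using only the standard library; `collect_env_info` is a hypothetical helper name, and the choice of packages to report (datasets and its pyarrow dependency) is an assumption.

```python
import platform
from importlib import metadata


def collect_env_info(packages=("datasets", "pyarrow")):
    """Return Python, platform, and installed package versions as a dict."""
    info = {
        "python": platform.python_version(),
        "platform": platform.platform(),
    }
    for pkg in packages:
        try:
            info[pkg] = metadata.version(pkg)
        except metadata.PackageNotFoundError:
            # Report missing packages instead of raising.
            info[pkg] = "not installed"
    return info


for key, value in collect_env_info().items():
    print(f"{key}: {value}")
```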