SEACrowd / seacrowd-datahub

A collaborative project to collect datasets in SEA languages, SEA regions, or SEA cultures.
Apache License 2.0

Closes #438 | Add dataloader for ASR-INDOCSC #509

Closed. zwenyu closed this 4 months ago

zwenyu commented 5 months ago

Closes #438

holylovenia commented 5 months ago

A friendly reminder for @khelli07 to review.

khelli07 commented 5 months ago

Hi, apparently the data cannot be downloaded from my side. It seems that either the server is down or the pre-signed URL is invalid.

I have already signed up and signed in to the platform. I have a question though: do we have to log in to download? If so, please mention it in the description.

ljvmiranda921 commented 5 months ago

Oh wow that's interesting. It did work the last time I checked this. Hmm, I guess we should make the file LOCAL instead? And then instruct the user to manually download the file and put it in some directory?
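
For reference, the LOCAL approach suggested above could look roughly like the following. This is a minimal sketch under stated assumptions, not the code from this PR: `_LOCAL` and `resolve_local_data_dir` are hypothetical names used for illustration, and the error message wording is made up.

```python
from pathlib import Path
from typing import Optional

# Assumption: a module-level flag marking the dataset as "manually downloaded".
_LOCAL = True


def resolve_local_data_dir(data_dir: Optional[str]) -> Path:
    """Return the user-supplied data directory, or fail with instructions.

    Hypothetical helper: the dataloader never downloads anything itself and
    relies on the user having fetched the files after logging in.
    """
    if data_dir is None:
        # No directory given: tell the user what to do instead of downloading.
        raise ValueError(
            "This is a local dataset. Log in on the dataset landing page, "
            "download the files manually, and pass their folder as data_dir."
        )
    path = Path(data_dir)
    if not path.is_dir():
        # A directory was given but it does not exist on disk.
        raise FileNotFoundError(f"data_dir does not exist: {path}")
    return path
```

In an actual dataloader, a check like this would presumably run at the start of `_split_generators`, with the directory taken from the builder config's `data_dir` rather than from a download step.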

zwenyu commented 5 months ago

@khelli07 Yes, you need to create an account and be logged in; I'll add this to the description. Are you able to download the dataset by running the dataloader script? I'm now getting the same access error when trying to download through here or manually through the dataset landing page, but I'm still able to download and load the dataset through the script. I tried this after deleting the previously cached copy.

holylovenia commented 5 months ago

I managed to download the data from here, @zwenyu. Probably it's a temporary issue?

cc: @khelli07 @ljvmiranda921

khelli07 commented 5 months ago

Just ran it again, and it works! Let me take a look a bit more and wrap up the review :)