reazon-research / ReazonSpeech

Massive open Japanese speech corpus
https://research.reazon.jp/projects/ReazonSpeech/
Apache License 2.0

It seems like the data hosting server is currently down. #27

Closed asahi417 closed 6 months ago

asahi417 commented 7 months ago

Hi, thanks for sharing such a useful dataset! We've been using the reazonspeech dataset via huggingface (https://huggingface.co/datasets/reazon-research/reazonspeech), but since yesterday the hosting server https://reazonspeech.s3.abci.ai has been down and the dataset can no longer be loaded. You can reproduce the issue by running the following code:

from datasets import load_dataset
ds = load_dataset("reazon-research/reazonspeech", "tiny", trust_remote_code=True)

This raises the following error:

ConnectionError: Couldn't reach https://reazonspeech.s3.abci.ai/v2-tsv/tiny.tsv (ConnectionError(MaxRetryError("HTTPSConnectionPool(host='reazonspeech.s3.abci.ai', port=443): Max retries exceeded with url: /v2-tsv/tiny.tsv (Caused by NewConnectionError('<urllib3.connection.HTTPSConnection object at 0x2b8cfd2777f0>: Failed to establish a new connection: [Errno 111] Connection refused'))")))

I've tried downloading the dataset from a couple of different servers, and none of them worked, so I assume it's a global issue affecting every user who tries to download the dataset. I'd appreciate it if you could look into it and bring the dataset back online soon.
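For anyone who wants to tell a server-side outage apart from a local network problem, here is a minimal sketch of a reachability check using only the Python standard library. The helper name `is_reachable` and its `opener` parameter are hypothetical and are not part of `datasets` or ReazonSpeech; the URL is the one from the error message above.

```python
import urllib.error
import urllib.request

# URL taken from the ConnectionError message above.
DATA_URL = "https://reazonspeech.s3.abci.ai/v2-tsv/tiny.tsv"


def is_reachable(url, timeout=10.0, opener=urllib.request.urlopen):
    """Return True if the server answers at all, False on a connection failure.

    Hypothetical helper: `opener` exists only so the network call can be
    replaced in tests; by default it is urllib.request.urlopen.
    """
    request = urllib.request.Request(url, method="HEAD")
    try:
        with opener(request, timeout=timeout):
            return True
    except urllib.error.HTTPError:
        # The server responded with an HTTP error status (e.g. 403 or 404),
        # so the host itself is up.
        return True
    except OSError:
        # URLError is a subclass of OSError, so this covers connection
        # refused, DNS failures, and timeouts.
        return False


# Example usage (performs a real network request):
#   print(is_reachable(DATA_URL))
```

If this returns False from several unrelated networks, the problem is most likely on the hosting side rather than in the local setup.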

Thank you!!

fujimotos commented 7 months ago

> We've been using the reazonspeech dataset via huggingface

@asahi417 Yes, it's not accessible right now. Please see the schedule table at the link below:

https://abci.ai/en/about_abci/info.html

The background of this issue is:

The service is scheduled to be up again at 15:00 JST on April 12. So please wait for the service status update from AIST.

fujimotos commented 6 months ago

The hosting server is up again. I think I can close this ticket now.