WorksApplications / SudachiDict

A lexicon for Sudachi

Error 401 when downloading dictionaries from an old version #25

Closed lafeuil closed 3 years ago

lafeuil commented 3 years ago

When I install version 20200722 of the sudachidict-core library with pip, I get the following error:

Downloading the Sudachi dictionary (It may take a while) ...
[...]
 raise HTTPError(req.full_url, code, msg, hdrs, fp)
    urllib.error.HTTPError: HTTP Error 401: Unauthorized

With version 20200722, the dictionary is downloaded from this URL: https://object-storage.tyo2.conoha.io/v1/nc_2520839e1f9641b08211a5c85243124a/sudachi/sudachi-dictionary-20200722-core.zip

But all files on this storage return an HTTP 401 error.

With the new 20201223 version, the dictionaries have been migrated to S3 storage. So what is the status of the old storage? Do we have to upgrade to the new 20201223 version? Is it a temporary issue?

sorami commented 3 years ago

Hi!

(I used to work for the Sudachi dev company but now I am an outside collaborator)

The Sudachi resources used to be hosted on ConoHa, an online storage service, but as of October 2020 they have been moved to AWS S3 under its Open Data Sponsorship Program (cf. Sudachi Language Resources - Registry of Open Data on AWS).

All the data (including old versions) are now on S3. The data stayed on ConoHa for a while as well; I guess that storage has now been terminated.

You can download the file from S3 directly, e.g., https://sudachi.s3-ap-northeast-1.amazonaws.com/sudachidict/sudachi-dictionary-20200722-core.zip (full list here).
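As a rough illustration of the direct download, here is a minimal Python sketch. The base URL comes from the link above; the file-name pattern `sudachi-dictionary-<version>-<edition>.zip` is an assumption generalized from that single example, and the helper names `dictionary_url` and `download_dictionary` are hypothetical, not part of any Sudachi package.

```python
import shutil
import urllib.request

# Base URL copied from the example link in this thread.
S3_BASE = "https://sudachi.s3-ap-northeast-1.amazonaws.com/sudachidict"


def dictionary_url(version: str, edition: str = "core") -> str:
    """Build the S3 URL for a dictionary zip.

    Assumption: every release follows the file-name pattern of the
    20200722 core example quoted above.
    """
    return f"{S3_BASE}/sudachi-dictionary-{version}-{edition}.zip"


def download_dictionary(version: str, edition: str = "core") -> str:
    """Download the zip into the current directory and return its file name."""
    url = dictionary_url(version, edition)
    filename = url.rsplit("/", 1)[-1]
    # Stream the response to disk instead of loading it all into memory.
    with urllib.request.urlopen(url) as response, open(filename, "wb") as out:
        shutil.copyfileobj(response, out)
    return filename
```

Calling `download_dictionary("20200722")` should save `sudachi-dictionary-20200722-core.zip` locally, provided the file exists under that name on S3.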

The sudachidict-* libraries have the resource URL hardcoded (in setup.py), and for version 20200722 and earlier the URLs still point to the old ConoHa storage, e.g., https://github.com/WorksApplications/SudachiDict/blob/v20200722/python/setup.py#L29
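To make the difference concrete, the sketch below rewrites an old ConoHa URL into its S3 counterpart by keeping the file name and swapping the base. It assumes the file names are identical on both storages, which this thread shows only for the 20200722 core dictionary; `to_s3_url` is a hypothetical helper and not code from setup.py.

```python
import posixpath
from urllib.parse import urlsplit

# Base URL copied from the S3 example link in this thread.
S3_BASE = "https://sudachi.s3-ap-northeast-1.amazonaws.com/sudachidict"


def to_s3_url(old_url: str) -> str:
    """Map an old storage URL to S3, assuming the file name is unchanged."""
    # Keep only the last path component, e.g. "sudachi-dictionary-20200722-core.zip".
    filename = posixpath.basename(urlsplit(old_url).path)
    return f"{S3_BASE}/{filename}"
```

Applied to the ConoHa URL from the original report, this yields the S3 link given above.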

@kazuma-t @chikurin66 We cannot overwrite released versions on PyPI, so maybe you could publish updated versions, like 20200722.s3 or something similar, with the current S3 URLs? https://pypi.org/project/SudachiDict-core/#history

sorami commented 3 years ago

By the way, any reason you specifically want to use the version 20200722 and not the latest one (I'm just curious)?

lafeuil commented 3 years ago

I'm not opposed to upgrading the version for the new release of my library. But when I use an old version of my library that lists version 20200722 of the sudachidict library in its dependencies, I get the errors above. The URL change breaks installation of the old versions of my library.

kazuma-t commented 3 years ago

I've released 20200722.post1 with the fixed URL. Please use it.