themoeway / yomitan

Pop-up dictionary browser extension. Successor to Yomichan.
https://yomitan.wiki
GNU General Public License v3.0

Download/Update Dictionaries From Within Yomitan #926

Closed. MarvNC closed this 2 months ago

MarvNC commented 4 months ago

It'd be cool if there were a feature to update dictionaries from within Yomitan, maybe by just providing a GitHub URL.

What I have in mind is that the GitHub repo would have a JSON file in the root directory listing each available dictionary, along with a regex used to pick the correct file from the latest release and download it. For example, with jmdict-yomitan there could be a few options:

[
  {
    "name": "JMnedict",
    "includedNameRegex": "/JMnedict/"
  },
  {
    "name": "KANJIDIC (English)",
    "includedNameRegex": "/KANJIDIC_english/"
  }
]

Then the default behavior could be to update the dictionary once a week (or month?) in the background or something. This would be useful since there are a bunch of regularly updated Yomitan dictionaries that are distributed via GitHub.
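
A minimal sketch of the matching step described above, assuming the list of release assets has already been fetched elsewhere. The `DictionaryListing` and `ReleaseAsset` shapes, the `downloadUrl` field, and both function names are hypothetical and not part of Yomitan.

```typescript
// Hypothetical shapes: neither type exists in Yomitan.
interface DictionaryListing {
    name: string;
    includedNameRegex: string; // e.g. "/JMnedict/", as in the example above
}

interface ReleaseAsset {
    name: string;        // file name of the release asset
    downloadUrl: string; // where the asset can be fetched from
}

// Turn a "/pattern/flags" string into a RegExp. Strings that are not wrapped
// in slashes are treated as a bare pattern. Only stateless flags are accepted.
function parseRegexString(source: string): RegExp {
    const match = /^\/(.*)\/([imsu]*)$/.exec(source);
    return match ? new RegExp(match[1] ?? "", match[2] ?? "") : new RegExp(source);
}

// Return the first asset of the latest release whose file name matches the
// listing's regex, or undefined if nothing matches.
function findAsset(listing: DictionaryListing, assets: ReleaseAsset[]): ReleaseAsset | undefined {
    const pattern = parseRegexString(listing.includedNameRegex);
    return assets.find((asset) => pattern.test(asset.name));
}
```

Returning `undefined` rather than throwing leaves it to the caller to decide whether a missing asset is an error or just means the dictionary was dropped from the latest release.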

jamesmaa commented 4 months ago

I feel like if you have to go fetch a GitHub URL for the dictionary, you've already lost many non-technical users and undermined the main benefit of downloading/updating dictionaries from within Yomitan. I would rather these URLs be shipped with Yomitan than give people a potential footgun.

MarvNC commented 4 months ago

Yeah, you could have the free/open ones be built in and use that mechanism. I feel like that would be a separate issue.

StefanVukovic99 commented 4 months ago

I have some ideas about an approach to this.

  1. Updating - For a dictionary to update, we can require that its index.json's url field contains the URL of the latest version of the dictionary, and its revision field should be CalVer or SemVer. We would update the descriptions in the dictionary-index-schema.json schema to explain this. To check for updates on a single dictionary, we should be able to fetch it from the url and check whether its revision is newer than the current one.

  2. Downloading - If I understand correctly, to update a dict we first need to delete the current version, then download the new one and import it. It would be good if we fixed some of the importing/deleting issues.

  3. "Dictionary Store" - To enable choosing new dictionaries from within Yomitan, we define a new JSON schema which is just an array of objects, where each object matches dictionary-index-schema.json. Each dictionary provider (i.e. the repos for Jitendex, KTY, etc.) can be required to have a file in this format, listing the dicts they offer. We can verify a few providers and include links to their lists in the default settings, while allowing users to add to or remove from this whitelist. The "Store" popup in the settings can fetch lists from all of the current providers, combine them, and show an index/table with searching/filtering/sorting by language, provider, etc.

  4. Bulk Update - fetch the providers and check the revisions across the whole list.

  5. Auto add new provider - Add a link in dictionary-index-schema.json that points to the provider's list. When a dictionary is imported manually, and its link to the provider is not yet in the user's whitelist, allow adding it.
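
The update check in points 1 and 4 could look roughly like the sketch below. It assumes revisions are purely numeric, dot- or dash-separated strings and does not implement SemVer pre-release precedence. The `DictionaryIndex` type here is a hypothetical subset of index.json, and matching installed dictionaries to provider entries by title is an assumption as well.

```typescript
// Hypothetical subset of a dictionary index: only the fields this sketch needs.
interface DictionaryIndex {
    title: string;
    revision: string; // assumed numeric SemVer ("1.4.0") or CalVer ("2024.05.12")
    url?: string;
}

// Compare two numeric revisions part by part.
// Returns a negative number if a < b, a positive number if a > b, 0 if equal.
// Non-numeric parts count as 0, so pre-release tags are not ordered correctly.
function compareRevisions(a: string, b: string): number {
    const parse = (revision: string): number[] =>
        revision.split(/[.-]/).map((part) => parseInt(part, 10) || 0);
    const partsA = parse(a);
    const partsB = parse(b);
    const length = Math.max(partsA.length, partsB.length);
    for (let i = 0; i < length; i++) {
        const difference = (partsA[i] ?? 0) - (partsB[i] ?? 0);
        if (difference !== 0) return difference;
    }
    return 0;
}

// Bulk update check: given the installed dictionaries and the combined
// provider lists, return the provider entries that are newer than what is
// installed. Entries for dictionaries that are not installed are skipped.
function findUpdates(installed: DictionaryIndex[], available: DictionaryIndex[]): DictionaryIndex[] {
    const installedByTitle = new Map<string, DictionaryIndex>();
    for (const index of installed) installedByTitle.set(index.title, index);
    return available.filter((candidate) => {
        const current = installedByTitle.get(candidate.title);
        return current !== undefined && compareRevisions(candidate.revision, current.revision) > 0;
    });
}
```

Comparing numerically rather than as strings matters for cases like "1.10.0" versus "1.4.0", where a plain string comparison would rank the older revision higher.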

MarvNC commented 4 months ago

> Updating - For a dictionary to update, we can require that its index.json's url field contains the url to the latest version of the dictionary, and its revision field should be CalVer or SemVer. We would update the descriptions in the dictionary-index-schema.json schema to explain this. To check for updates on a single dictionary, we should be able to fetch it from the url and check if its revision is newer than the current.

I support adding SemVer/dates to dictionary metadata. I guess it would be duplicated in the dictionary store containing the dictionary indexes? We would need to figure out how to differentiate dictionaries like Jitendex and others that contain a date string in the title (maybe with a regex like in the OP, or a new field). Or maybe make it so the title is displayed along with the version number in the popup, so we don't need to put the date in the title.

> to update a dict we need to first delete the current version, download the new and import it

Yeah, seems like a no-brainer especially if semver is implemented.
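
On the date-in-title problem mentioned above, one option is to split the title into a stable name and the embedded date, and match dictionaries on the stable name. This is only a sketch: the assumed format (a trailing `YYYY-MM-DD` or `YYYY.MM.DD`, optionally in brackets or parentheses) and the `splitDatedTitle` name are hypothetical, and real titles may not follow this pattern.

```typescript
// Split a title such as "Example Dict [2024-05-12]" into a stable name and
// the embedded date. Titles without a trailing date are returned unchanged
// with date set to null. The accepted date format is an assumption.
function splitDatedTitle(title: string): {name: string; date: string | null} {
    const match = /^(.*?)[\s\[(]*(\d{4}[-.]\d{2}[-.]\d{2})[\])]*$/.exec(title);
    if (match === null) return {name: title, date: null};
    return {name: (match[1] ?? "").trim(), date: match[2] ?? null};
}
```

A dedicated field in the index, as suggested above, would avoid guessing at title formats altogether.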

jamesmaa commented 4 months ago

I think we should separate downloading a dict from updating one. Nothing requires them to be solved together. The conversation here has mostly revolved around updating dictionaries, but I think downloading dictionaries would be easier and safer to solve, since it needs no schema changes and doesn't depend on getting our partners (Jitendex, etc.) to opt in to the schema.

brishtibheja commented 4 months ago

Well, it's not like languages change so drastically every few years that a new dictionary is an absolute must. What's needed is being able to download the dictionaries easily. Updating is the cherry on top.