tldr-pages / tldr-python-client

:snake: Python command-line client for tldr pages 📚
https://pypi.org/project/tldr/
MIT License

Fetch individual translation archives for cache based on the environment variable configuration #217

Closed: kbdharun closed this issue 9 months ago

kbdharun commented 1 year ago

As specified in Client Specification 2.0:

If appropriate, it is RECOMMENDED that clients implement a cache of pages. If implemented, clients MUST download the entire archive either as a whole from https://tldr.sh/assets/tldr.zip (Which redirects to https://raw.githubusercontent.com/tldr-pages/tldr-pages.github.io/main/assets/tldr.zip) or download language-specific translation archives in the format https://tldr.sh/assets/tldr-pages.{{language-code}}.zip (Which redirects to https://raw.githubusercontent.com/tldr-pages/tldr-pages.github.io/main/assets/tldr-pages.{{language-code}}.zip), along with the archive for English from https://tldr.sh/assets/tldr-pages.zip (It redirects to https://raw.githubusercontent.com/tldr-pages/tldr-pages.github.io/main/assets/tldr-pages.zip).

Caching SHOULD be done according to the user's language configuration (if any), to not waste unneeded space for unused languages. Additionally, clients MAY automatically update the cache regularly.

So, we need to update the client to fetch only the specified language's archive, whether the language comes from the configuration or from the tldr <page> --language <language-code> command. We can include English by default as a fallback.
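As a rough illustration of the archive selection described above, here is a minimal sketch. The URL patterns are the ones quoted from the specification; the function name archive_urls and the constants are hypothetical and are not taken from the client's code.

```python
# Sketch only: archive_urls and both constants are illustrative names,
# not part of tldr-python-client. The URL patterns come from the spec quote.
ENGLISH_ARCHIVE = "https://tldr.sh/assets/tldr-pages.zip"
TRANSLATION_ARCHIVE = "https://tldr.sh/assets/tldr-pages.{language}.zip"


def archive_urls(languages):
    """Return the archive URLs to download for the given language codes.

    Translation archives come first, in the order given and without
    duplicates; the English archive is always appended as the fallback.
    """
    urls = []
    for language in languages:
        if language == "en":
            continue  # English is handled by the dedicated archive below
        url = TRANSLATION_ARCHIVE.format(language=language)
        if url not in urls:
            urls.append(url)
    urls.append(ENGLISH_ARCHIVE)
    return urls


if __name__ == "__main__":
    for url in archive_urls(["de"]):
        print(url)
```

With no language configured, the list would contain only the English archive, which matches the "English by default" behaviour proposed here.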

SaurabhDRao commented 1 year ago

Hello! Can I take this up?

Below are the changes that need to be done, right?

kbdharun commented 1 year ago

Hello! Can I take this up?

Sure

Below are the changes that need to be done, right?

  • if a language is set, then fetch only the data for that particular language from the respective language archive location.
  • if no language is set, then fetch the English archive as default.

Yes. If a language is set (in the environment variable), only that language's archive and the English archive must be fetched, and when a page isn't translated the client needs to fall back to the English page.
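The fallback described above could look roughly like the sketch below. It assumes the cache can be modelled as a dictionary keyed by (language, command); that layout, the function name find_page, and the sample contents are all hypothetical and only stand in for whatever the client stores on disk.

```python
# Sketch only: find_page and the dictionary-based cache are illustrative
# assumptions, not the client's actual cache layout or API.
def find_page(command, languages, cache):
    """Return (language, page) for the first language that has the page.

    `languages` is ordered by preference and is expected to end with "en",
    so an untranslated page falls back to English. Returns None if no
    language in the list has the page.
    """
    for language in languages:
        page = cache.get((language, command))
        if page is not None:
            return language, page
    return None


# Hypothetical cache contents: "tar" is translated into German, "git" is not.
cache = {
    ("de", "tar"): "# tar (German page)",
    ("en", "tar"): "# tar (English page)",
    ("en", "git"): "# git (English page)",
}

if __name__ == "__main__":
    print(find_page("tar", ["de", "en"], cache))
    print(find_page("git", ["de", "en"], cache))
```

Here the lookup for tar would be served from the German entry, while git, which has no German entry in this sample, would come from the English one.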

SaurabhDRao commented 1 year ago

I have added a change that pulls the language-specific archives into the cache (#218).

However, I noticed that the fallback to en is already present. The get_page function calls a helper that returns the list of all configured languages, and that helper appends the default language (en) at the end. Let me know if I am missing something.
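For readers unfamiliar with that pattern, the following is a simplified sketch of how such a language list might be built, not the client's actual helper. It assumes the LANG and LANGUAGE environment variables are the ones consulted, and the name language_list is hypothetical. Region handling is deliberately simplified: only the encoding suffix is stripped, so a value like de_DE.UTF-8 yields de_DE.

```python
# Sketch only: language_list is a hypothetical name, and reading LANG and
# LANGUAGE is an assumption about which environment variables are used.
def language_list(env):
    """Build an ordered, de-duplicated language list that ends with "en".

    `env` is any mapping of environment variable names to values
    (for example os.environ).
    """
    lang = env.get("LANG", "").split(".")[0]  # drop an encoding such as ".UTF-8"
    if lang in ("", "C", "POSIX"):
        # No usable language configured: English only.
        return ["en"]

    # LANGUAGE is a colon-separated priority list; LANG comes after it.
    candidates = [code for code in env.get("LANGUAGE", "").split(":") if code]
    candidates.append(lang)

    ordered = []
    for code in candidates + ["en"]:  # English is always the last resort
        if code not in ordered:
            ordered.append(code)
    return ordered


if __name__ == "__main__":
    print(language_list({"LANG": "cz", "LANGUAGE": "it:cz:de"}))
```

Because en is appended unconditionally, any lookup that walks this list in order ends with the English page, which is the existing fallback behaviour noted in this comment.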