tldr-pages / tldr

📚 Collaborative cheatsheets for console commands
https://tldr.sh

Offer `pages` as `pages.en` #11121

Closed: niklasmohrin closed this 10 months ago

niklasmohrin commented 10 months ago

I am currently in the process of implementing language-specific caching for tealdeer (https://github.com/dbrgn/tealdeer/issues/335) and came across the relevant paragraph from the spec:

If implemented, clients MUST download the entire archive [...] or download language-specific translation archives in the format https://tldr.sh/assets/tldr-pages.{{language-code}}.zip [...], along with the archive for English from https://tldr.sh/assets/tldr-pages.zip [...].

The special treatment of English makes programming a bit awkward, because you have to take care of it everywhere (downloading, extracting, lookup, listing, ...). I think it would be better if the only place where English gets special treatment is in the "decide which languages to try" phase and afterwards it is just a language like every other. (In tealdeer, we just pass around the string "en").

I can understand if you don't want to move pages to pages.en in the repository here (at this time), but I think it would be the right move in the long term. For now, I would like to ask that at least the "public" interface tldr.sh/assets gets rid of the special case English and offers tldr-pages.en.zip as a redirect to tldr-pages.zip. Next, I would rephrase the spec to not mention English in the caching section and just have it say something like "download the needed languages as defined by the Languages section".
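To illustrate the difference, here is a minimal Python sketch (not tealdeer's actual code, which is written in Rust). The function names are hypothetical. The first function follows the URL scheme quoted from the spec above; the second assumes the proposed `tldr-pages.en.zip` asset exists.

```python
BASE_URL = "https://tldr.sh/assets"


def archive_url_special_cased(language: str) -> str:
    """Scheme quoted from the spec: the English archive has no language suffix."""
    if language == "en":
        return f"{BASE_URL}/tldr-pages.zip"
    return f"{BASE_URL}/tldr-pages.{language}.zip"


def archive_url_uniform(language: str) -> str:
    """Proposed scheme (assumption): English is named like every other language."""
    return f"{BASE_URL}/tldr-pages.{language}.zip"


def archives_to_download(languages: list[str]) -> list[str]:
    """Map the already-decided language list to archive URLs, dropping repeats."""
    unique = dict.fromkeys(languages)  # keeps the first occurrence, preserves order
    return [archive_url_uniform(language) for language in unique]
```

With the uniform scheme, the `if language == "en"` branch disappears from the download step, and the same would hold for extraction, lookup, and listing.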

(If you ever decide to rename pages, have you considered moving pages.de into pages/de etc.? This would make the root directory of the repository a bit cleaner :D)

What do you think?

gutjuri commented 10 months ago

Good point. Maybe we could create a directory pages.en, move all english pages there, and then have a symlink from pages to pages.en.

kbdharun commented 10 months ago

Good point. Maybe we could create a directory pages.en, move all english pages there, and then have a symlink from pages to pages.en.

That is exactly what I thought too but the other way around. Currently a bit busy, will get back later today to discuss it in detail.

kbdharun commented 10 months ago

Back again :)

I can understand if you don't want to move pages to pages.en in the repository here (at this time), but I think it would be the right move in the long term. For now, I would like to ask that at least the "public" interface tldr.sh/assets gets rid of the special case English and offers tldr-pages.en.zip as a redirect to tldr-pages.zip. Next, I would rephrase the spec to not mention English in the caching section and just have it say something like "download the needed languages as defined by the Languages section".

Agreed, we can indeed upload the English artifact as tldr-pages.en.zip (I will work on a PR for it) alongside the old artifact tldr-pages.zip, which stays for compatibility with legacy clients. Redirecting with GitHub Pages wouldn't be feasible because the assets aren't static, so there might be issues resolving the redirect (which could add some time to fetching).

Regarding the pages directory, we can continue using it, but I will add a symlink pages.en pointing to it so that we don't need to move all the pages to a new directory and potentially break clients.
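A rough Python sketch of what the directory symlink would allow, run against a throwaway directory rather than the real repository. The helper names and the `common/tar.md` page are hypothetical and only illustrate the `pages.<language>/<platform>/<command>.md` layout.

```python
import os
import tempfile
from pathlib import Path


def add_english_alias(repo_root: Path) -> Path:
    """Create pages.en as a relative symlink to the existing pages directory."""
    link = repo_root / "pages.en"
    # A relative target keeps the link valid wherever the repository is cloned.
    os.symlink("pages", link, target_is_directory=True)
    return link


def page_path(repo_root: Path, language: str, platform: str, command: str) -> Path:
    """Uniform lookup: every language, English included, uses pages.<language>."""
    return repo_root / f"pages.{language}" / platform / f"{command}.md"


def demo() -> str:
    """Build a tiny fake repository and read an English page through the alias."""
    with tempfile.TemporaryDirectory() as tmp:
        root = Path(tmp)
        (root / "pages" / "common").mkdir(parents=True)
        (root / "pages" / "common" / "tar.md").write_text("# tar\n", encoding="utf-8")
        add_english_alias(root)
        return page_path(root, "en", "common", "tar").read_text(encoding="utf-8")
```

One caveat worth checking before relying on this: creating symlinks may require extra privileges on some platforms (Windows in particular), so clients that clone the repository there might not see `pages.en` as a directory.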

If you ever decide to rename pages, have you considered moving pages.de into pages/de etc.? This would make the root directory of the repository a bit cleaner :D

I think it is highly unlikely that a page directory rename will take place in the near future (other than the osx to macos change). But if we do one, I will remember your suggestion to group all languages inside a single directory.


On a side note, the reason I have mentioned multiple times that any major architectural change (directory renaming, etc.) would break clients is that I have first-hand experience with it from my 1 year of maintaining tldr. When we were working on dropping the master branch and moving clients to main, we had around 50+ clients that would have broken after the change. We had to coordinate and get in touch with a lot of maintainers (some of them weren't fine with us dropping the old default branch). Over the course of months we eventually brought down the list of broken clients (now it is 3-4, I think), but it took a lot of time for me and pixelcmtd (a very dedicated maintainer) to manually audit them and open PRs if they used the now-removed branch. And this was after 1 year of announcing that we would deprecate the master branch.

acuteenvy commented 10 months ago

It doesn't work as expected. tldr-pages.zip is still the English archive, but tldr-pages.en.zip becomes a text file containing the string tldr-pages.zip instead of the actual archive.


I guess the only solution then is to copy tldr-pages.zip => tldr-pages.en.zip instead of symlinking it. Unless someone comes up with something more appropriate.

kbdharun commented 10 months ago

I guess the only solution then is to copy tldr-pages.zip => tldr-pages.en.zip instead of symlinking it. Unless someone comes up with something more appropriate.

Unfortunately yeah, for now, we can just continue having both the files generated separately.

acuteenvy commented 10 months ago

for now, we can just continue having both the files generated separately

What do you mean? You can generate it once and copy it.

kbdharun commented 10 months ago

for now, we can just continue having both the files generated separately

What do you mean? You can generate it once and copy it.

Yep, that works too (as it isn't symlinked).
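As a sketch of the "generate once and copy" approach in Python (the real build is not shown in this thread, so the function name, the `*.md` glob, and the layout inside the archive are assumptions made for illustration):

```python
import shutil
import zipfile
from pathlib import Path


def build_english_archives(pages_dir: Path, out_dir: Path) -> tuple[Path, Path]:
    """Zip the English pages once, then copy the result under the .en name."""
    out_dir.mkdir(parents=True, exist_ok=True)
    legacy = out_dir / "tldr-pages.zip"
    with zipfile.ZipFile(legacy, "w", zipfile.ZIP_DEFLATED) as archive:
        for page in sorted(pages_dir.rglob("*.md")):
            # Hypothetical layout: store paths relative to the repository root.
            archive.write(page, page.relative_to(pages_dir.parent).as_posix())
    alias = out_dir / "tldr-pages.en.zip"
    # A real byte-for-byte copy, not a symlink, so the alias is a regular file.
    shutil.copyfile(legacy, alias)
    return legacy, alias
```

Because `tldr-pages.en.zip` is an ordinary file here, it does not depend on how the hosting side treats symlinks. The cost is storing the English archive twice.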