tldr-pages / tldr-python-client

Python command-line client for tldr pages
https://pypi.org/project/tldr/
MIT License
588 stars 92 forks source link

`tldr --update_cache` does not delete local-only pages #242

Open mooreye opened 3 months ago

mooreye commented 3 months ago

How to reproduce on Linux:

  1. Download all pages by running tldr -u
  2. Go to ~/.cache/tldr/pages/linux/ dir
  3. Create a test file: cp cp.md zzzzz.md
  4. Running tldr zzzzz now shows the copied cp page
  5. Run tldr -u again
  6. See that the file zzzzz.md is still present.
mooreye commented 3 months ago

diff-ing from my backup shows for example that the following pages were once present but were deleted upstream:

[/path/to/my/backup/of/home/dir/.cache/] $ diff -r tldr/ ~/.cache/tldr/
Only in tldr/pages/common: amass-db.md
Only in tldr/pages/common: sockstat.md
Only in tldr/pages/linux: ffuf.md
Only in tldr/pages/linux: flock.md
Only in tldr/pages/linux: select.md
Only in tldr/pages/linux: zipcloak.md
vitorhcl commented 3 months ago

Hi, thank you for opening this issue!

This repository is just for the pages, specification for tldr clients (programs to visualize tldr pages), scripts for updating pages... But not for tldr clients. So your proposal would be for enforcing this to all clients.

Well, I see some pros and cons of not allowing local-only pages:

We have to discuss if the pros would outweigh the cons on doing this (or if there is another unseen pro or con, really).

diff-ing from my backup shows for example that the following pages were once present but were deleted upstream:

[/path/to/my/backup/of/home/dir/.cache/] $ diff -r tldr/ ~/.cache/tldr/
Only in tldr/pages/common: amass-db.md
Only in tldr/pages/common: sockstat.md
Only in tldr/pages/linux: ffuf.md
Only in tldr/pages/linux: flock.md
Only in tldr/pages/linux: select.md
Only in tldr/pages/linux: zipcloak.md

PS: I don't really see how the comment above is related with the original issue

gutjuri commented 3 months ago

How to reproduce on Linux:

1. Download all pages by running `tldr -u`

2. Go to `~/.cache/tldr/pages/linux/` dir

3. Create a test file: `cp cp.md zzzzz.md`

4. Running `tldr zzzzz` now shows the copied `cp` page

5. Run `tldr -u` again

6. See that the file `zzzzz.md` is still present.

Which tldr client do you use? I.e. how did you install it?

gutjuri commented 3 months ago

Hi, thank you for opening this issue!

This repository is just for the pages, specification for tldr clients (programs to visualize tldr pages), scripts for updating pages... But not for tldr clients. So your proposal would be for enforcing this to all clients.

Well, I see some pros and cons of not allowing local-only pages:

* Pros:

  1. Standard environment across clients.
  2. Encourages the user to submit a pull request for creating a new page.
  3. If page is added on the repository and there was already a local-only page with the same name, it would be likely overwritten, so not allowing them would avoid this situation.

* Cons:

  1. Local-only pages could give the user the possibility of mantain pages that don't fit the project's goal (e.g. documenting file formats, mixed utilities, code functions or anything desired).
  2. Increases the complexity for all clients for implementing the cache handling system.

We have to discuss if the pros would outweigh the cons on doing this (or if there is another unseen pro or con, really).

diff-ing from my backup shows for example that the following pages were once present but were deleted upstream:

[/path/to/my/backup/of/home/dir/.cache/] $ diff -r tldr/ ~/.cache/tldr/
Only in tldr/pages/common: amass-db.md
Only in tldr/pages/common: sockstat.md
Only in tldr/pages/linux: ffuf.md
Only in tldr/pages/linux: flock.md
Only in tldr/pages/linux: select.md
Only in tldr/pages/linux: zipcloak.md

PS: I don't really see how the comment above is related with the original issue

I do not think that mooreye is proposing to force clients to disallow local-only pages (which would be next to impossible anyways). They raised the issue that pages that are deleted from this repository are not deleted locally by the client they are using. For example, the ffuf page was first added in linux/ (https://github.com/tldr-pages/tldr/pull/6183), then it was added in common/ (https://github.com/tldr-pages/tldr/pull/12082). Eventually, the version in linux/ was deleted (https://github.com/tldr-pages/tldr/pull/12592). However, the client mooreye is using does not perform this deletion of linux/ffuf.md, so linux/ffuf.md keeps existing on mooreye's computer even though it has been deleted in this repository (mooreye, please correct me if I misunderstood the issue at hand).

I think that this is not an issue with the client spec, because the client spec uses the term "cache" and updating a cache implicates deleting outdated entries IMO. If I understood correctly, this is an issue with the specific tldr client used by mooreye.

mooreye commented 3 months ago

I use Fedora's DNF package, which is now v3.2.0

Their upstream info points to https://github.com/tldr-pages/tldr-python-client

gutjuri commented 3 months ago

I use Fedora's DNF package, which is now v3.2.0

Their upstream info points to https://github.com/tldr-pages/tldr-python-client

Thanks! I transferred the issue into the python-client's repo.

Also, I have spotted the cause of the issue: The update_cache function of the client downloads a Zip Archive containing the repo's pages. It then individually copies each page into the current machine's filesystem. At no point does it delete pages that are not present in the downloaded Zip file from the current machine's filesystem.

https://github.com/tldr-pages/tldr-python-client/blob/68acf0531668f3a248127bd4e25140ed420ceff9/tldr.py#L380-L407