openzim / cms

ZIM file Publishing Platform
https://cms.openzim.org
GNU General Public License v3.0

Files posted in .hidden/dev should have an expiry date #44

Open kelson42 opened 2 years ago

kelson42 commented 2 years ago

From zimfarm created by Popolechien: openzim/zimfarm#650

(not sure if this is more appropriate in /CMS, but anyway)

Files that are assigned to .hidden/dev are mostly used to test whether a given recipe works, and their long-term value is negligible. They linger and accumulate in dev.library.kiwix.org, which itself has quite a few limitations in the coherence of what it displays (it could, for example, list files in order of latest creation).

I therefore don't think there would be any visible issue if files were deleted after a month (or even 10 days).

kelson42 commented 2 years ago

The point is not to put publishing constraints earlier in the (dev) process but to (re)use the same CMS machinery to make available, garbage collect, etc. the ZIM files during their dev/qual stage.

kelson42 commented 2 years ago

It's probably better suited to the maintenance repo than to zimfarm or CMS (which will never have anything to do with dev).

The only problem you're reporting is that dev.library is cluttered. I want to answer that it's kiwix-serve's job to make it easy to deal with ZIM files.

Now I know that part of the confusion is due to having multiple versions of the same ZIM, but that's not going to change. It's /dev's purpose to host tests.

You are right in saying that the value of the tests eventually reaches 0 and when it does, it's taking up a bit of server space and a bit of visual space in that page so yes, we should eventually delete those.

I am against periodically deleting them without warning, and 10 days or a month are not reasonable durations. What we could do, though, is have a maintenance script look for aging ZIMs in that folder and send Slack notifications. Those would warn about future deletion, which would happen after a good amount of time.

I suggest a periodic script, launched at the beginning of each month, that sends a summary of the aging/to-delete files. Developers still have the opportunity to manually delete any file, or to touch a file to reset its age (and so keep it).

IMO, deletion should happen after 3mo, warning a month before. Would that work for you?
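As a rough illustration of that policy, here is a minimal sketch of the age check. It is not an existing maintenance script: the function name `classify`, the 90-day approximation of "3mo" and the use of the file modification time as "age" (so that `touch` resets it) are all assumptions made for the example.

```python
import time
from pathlib import Path

DAY = 24 * 60 * 60
# Assumption: "3mo" is approximated as 90 days, with the warning one month (30 days) earlier.
DELETE_AFTER = 90 * DAY
WARN_AFTER = 60 * DAY


def classify(folder, now=None):
    """Split the *.zim files of `folder` into (to_delete, to_warn) lists.

    Age is measured from the file's modification time, so running `touch`
    on a file resets its age and keeps it out of both lists.
    """
    now = time.time() if now is None else now
    to_delete, to_warn = [], []
    for path in sorted(Path(folder).glob("*.zim")):
        age = now - path.stat().st_mtime
        if age >= DELETE_AFTER:
            to_delete.append(path)
        elif age >= WARN_AFTER:
            to_warn.append(path)
    return to_delete, to_warn
```

A monthly job could call `classify()` on the dev folder, post both lists to Slack, and only delete the first list after the announced delay.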

Sample message:

⚠️ The following file(s) WILL BE DELETED in 48h ⚠️ touch them to keep them!

  • wikipedia_zh-min-nan_all_2021-11.zim – 34.2MiB – created on Sun August 1st 12:00

The following file(s) are scheduled for deletion next month:

  • wikipedia_zh-min-nan_all_2021-11.zim – 34.2MiB – created on Sun August 1st 12:00
  • wikipedia_zh-min-nan_all_2021-11.zim – 34.2MiB – created on Sun August 1st 12:00
  • wikipedia_zh-min-nan_all_2021-11.zim – 34.2MiB – created on Sun August 1st 12:00
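
A message of this shape could be assembled along the following lines. This is only a sketch: `render` and `human_size` are hypothetical helpers, entries are assumed to be `(name, size_in_bytes, created)` tuples, and the date format is simplified compared to the sample (no "1st"-style ordinals). Posting the text to Slack is left out.

```python
from datetime import datetime


def human_size(num_bytes):
    """Format a byte count in MiB with one decimal, as in the sample message."""
    return f"{num_bytes / 2**20:.1f}MiB"


def render(to_delete, to_warn):
    """Build the notification text from two lists of (name, size, created) tuples."""

    def bullet(entry):
        name, size, created = entry
        return f"  • {name} – {human_size(size)} – created on {created:%a %B %d %H:%M}"

    lines = []
    if to_delete:
        lines.append("⚠️ The following file(s) WILL BE DELETED in 48h ⚠️ touch them to keep them!")
        lines.append("")
        lines.extend(bullet(e) for e in to_delete)
    if to_warn:
        if lines:
            lines.append("")  # blank line between the two sections
        lines.append("The following file(s) are scheduled for deletion next month:")
        lines.append("")
        lines.extend(bullet(e) for e in to_warn)
    return "\n".join(lines)
```

Returning an empty string when both lists are empty lets the caller skip the notification entirely on months with nothing to report.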

kelson42 commented 2 years ago

Sounds reasonable, as long as we get to clean things up.

kelson42 commented 2 years ago

What about considering the dev library like another library, to be handled like the public library.kiwix.org?

What does that mean?

Same problems should lead to the same solutions? If we properly solve the problem (with the help of the CMS) for library.kiwix.org, we should be able to solve it similarly for dev.library.kiwix.org?!

dev.library.kiwix.org is a tool for developers, as its name suggests. At some point, other people need to test and validate ZIM files. Having a kiwix-served version greatly facilitates those tasks, which is why we have it. It is not a place to browse for content. We could actually get rid of the kiwix-serve homepage.

If you consider it a public library, it means it's not a dev tool anymore.

kelson42 commented 2 years ago

Prior warning does make sense and Slack is where everyone is atm so yes, good idea. Three months feels very long, but whatever.

kelson42 commented 2 years ago

As described by Renaud, the problem is slightly more complex than it looks. What about considering the dev library like another library, to be handled like the public library.kiwix.org? Same problems should lead to the same solutions? If we properly solve the problem (with the help of the CMS) for library.kiwix.org, we should be able to solve it similarly for dev.library.kiwix.org?!

stale[bot] commented 2 years ago

This issue has been automatically marked as stale because it has not had recent activity. It will now be reviewed manually. Thank you for your contributions.