openzim / overview

:balloon: Start here for current projects, how to get involved, and joining community calls. A resource for new and veteran members of the offline community.

Cleanup ZIMs - procedure & tooling #27

Open benoit74 opened 6 months ago

benoit74 commented 6 months ago

Currently, we do not have any precise procedure or tooling around cleanup of ZIMs.

There are many topics that should be considered:

I would propose to:

The idea of marking files comes from the fact that:

It has some drawbacks:

Proposed rules (in TOML, because it is a config file format for humans and I expect to write the tool in Python, which promotes TOML significantly; in fact I don't really care about the format):

[delete_rules.dev]
folder="/data/hidden-zim"
delete_rule="file_older_than_days"
delete_threshold=30
force_keep=[
  "manioc.org_fr_all_2023-01.zim"
]

[delete_rules.custom_apps]
folder="/data/custom-apps"
delete_rule="all_but_last_book"
force_delete=[
  "my_oudated_app_2023-01.zim"
]

[delete_rules.to_delete]
folder="/data/to_delete"
delete_rule="last_folder_older_than_days"
delete_threshold=30
delete_empty_folders=true

With the following meanings:
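
As an illustration of one possible reading of `all_but_last_book`, the Python sketch below groups ZIM filenames by book and keeps only the most recent one. Everything here is an assumption made for the example: filenames are taken to follow the `<book>_<YYYY-MM>.zim` pattern seen in the config, "last" is taken to mean the highest period in the filename, entries in `force_delete` are always deleted and are ignored when picking the file to keep, and the function name is hypothetical.

```python
import re
from collections import defaultdict

# Assumed naming scheme: <book>_<YYYY-MM>.zim, e.g. manioc.org_fr_all_2023-01.zim
ZIM_NAME = re.compile(r"^(?P<book>.+)_(?P<period>\d{4}-\d{2})\.zim$")


def all_but_last_book(filenames, force_delete=()):
    """Return the filenames to delete, keeping the newest file of each book."""
    forced = set(force_delete)
    by_book = defaultdict(list)
    to_delete = set()
    for name in filenames:
        if name in forced:
            to_delete.add(name)  # explicitly marked for deletion
            continue
        match = ZIM_NAME.match(name)
        if match is None:
            continue  # unrecognised names are left alone in this sketch
        by_book[match["book"]].append((match["period"], name))
    for versions in by_book.values():
        versions.sort()  # YYYY-MM periods sort chronologically as strings
        to_delete.update(name for _, name in versions[:-1])
    return sorted(to_delete)


if __name__ == "__main__":
    files = [
        "my_app_2023-01.zim",
        "my_app_2023-03.zim",
        "my_app_2023-02.zim",
        "other_2022-12.zim",
        "my_oudated_app_2023-01.zim",
    ]
    for name in all_but_last_book(files, force_delete=["my_oudated_app_2023-01.zim"]):
        print("would delete", name)
```

With these inputs, the sketch would delete the two older `my_app` files and the force-deleted file, and keep `my_app_2023-03.zim` and `other_2022-12.zim`.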

I think that this tool will be used for other cleanup duties:

WDYT?

rgaudin commented 6 months ago

LGTM; I can't find the other discussion but found this (don't look at the rest of the ticket), which is a bit similar. I find your approach better in several ways: a commit to mark stuff we want to keep ~forever (so we'll get a commit message) and a short duration before deletion (otherwise there's the risk of postponing it and then missing the deadline).