szabodanika / microbin

A secure, configurable file-sharing and URL shortening web app written in Rust.
https://microbin.eu
BSD 3-Clause "New" or "Revised" License
2.65k stars, 163 forks

Orphaned pastas #63

Closed jchia closed 1 year ago

jchia commented 1 year ago

When the URL/identifier for a private, non-expiring pasta is forgotten, the pasta may remain forever. Nobody can delete it except an administrator, or some hypothetical GC mechanism within microbin, with access to the underlying data files. This can result in endless garbage accumulation.

To address this, perhaps there could be an administrative tool (e.g. a standalone program) that lists the URLs/identifiers of pastas meeting certain criteria and optionally deletes them. For example, someone could be interested in private pastas that have not been accessed in the past n days. Concurrent access to the underlying data files by microbin and this tool is potentially problematic and needs to be considered.
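As a rough illustration of the "private pastas not accessed in the past n days" criterion, here is a minimal sketch. The `PastaInfo` struct and the `stale_private_ids` function are hypothetical names made up for this example and do not reflect microbin's actual data model; the sketch only shows the selection step, leaving listing versus deleting to the caller.

```rust
use std::time::{Duration, SystemTime};

/// Hypothetical, simplified view of a stored pasta (not microbin's real type).
#[derive(Debug, Clone, PartialEq)]
pub struct PastaInfo {
    pub id: String,
    pub private: bool,
    pub last_read: SystemTime,
}

/// Returns the ids of private pastas that have not been read
/// for more than `max_idle_days` days as of `now`.
pub fn stale_private_ids(
    pastas: &[PastaInfo],
    now: SystemTime,
    max_idle_days: u64,
) -> Vec<String> {
    let max_idle = Duration::from_secs(max_idle_days * 24 * 60 * 60);
    pastas
        .iter()
        .filter(|p| p.private)
        .filter(|p| match now.duration_since(p.last_read) {
            Ok(idle) => idle > max_idle,
            // `last_read` lies in the future (clock skew): treat as fresh.
            Err(_) => false,
        })
        .map(|p| p.id.clone())
        .collect()
}
```

A tool built on this could print the returned ids by default and delete them only when asked to.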

Alternatively, there could be a switch to disallow non-expiring private pastas.

szabodanika commented 1 year ago

I like the idea of a garbage collector. It could either remove pastas that haven't been accessed in N days, or we could set a max storage limit and delete the oldest pastas once that limit is reached. I want to hear other users' opinions on this.
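The storage-limit variant could be sketched roughly as follows. Everything here is an assumption for illustration: the `StoredPasta` struct, its fields, and the `evict_oldest` function are hypothetical, and "oldest" is taken to mean earliest creation time.

```rust
/// Hypothetical record: id, size in bytes, creation time in Unix seconds.
#[derive(Debug, Clone)]
pub struct StoredPasta {
    pub id: String,
    pub size_bytes: u64,
    pub created_unix: u64,
}

/// Picks the oldest pastas for deletion until the remaining total size
/// fits within `max_bytes`. Returns the ids in deletion order.
pub fn evict_oldest(pastas: &[StoredPasta], max_bytes: u64) -> Vec<String> {
    let mut total: u64 = pastas.iter().map(|p| p.size_bytes).sum();

    // Sort references by creation time so the oldest come first.
    let mut by_age: Vec<&StoredPasta> = pastas.iter().collect();
    by_age.sort_by_key(|p| p.created_unix);

    let mut doomed = Vec::new();
    for p in by_age {
        if total <= max_bytes {
            break;
        }
        total -= p.size_bytes;
        doomed.push(p.id.clone());
    }
    doomed
}
```

Nothing is selected while the total stays under the limit, so this policy only kicks in once storage actually fills up.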

Panja0 commented 1 year ago

A garbage collector would be a great addition imho. I would say deleting pastas that haven't been accessed in N days is a good option.

jchia commented 1 year ago

I personally would welcome either a new microbin option specifying the GC policy to use or a standalone program that does the GC. To prevent concurrent file access, flock() can be used on files, and SQLite also has ways to prevent/coordinate concurrent access.
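To make the coordination idea concrete, here is a minimal sketch. Note that it does not call flock(): doing so from Rust usually means pulling in an external crate such as libc, so as a stand-in this uses a lock file created atomically with `create_new`. The `LockFile` type and `try_acquire` are hypothetical names, and both microbin and the GC tool would have to agree on the same lock path for this to work.

```rust
use std::fs::{self, OpenOptions};
use std::io;
use std::path::{Path, PathBuf};

/// Guard for a lock file; the file is removed when the guard is dropped.
pub struct LockFile {
    path: PathBuf,
}

impl LockFile {
    /// Tries to take the lock. Returns `Ok(None)` if the lock file
    /// already exists, i.e. another process currently holds the lock.
    pub fn try_acquire(path: &Path) -> io::Result<Option<LockFile>> {
        // `create_new` fails with AlreadyExists if the file is present,
        // so creation doubles as an atomic "test and set".
        match OpenOptions::new().write(true).create_new(true).open(path) {
            Ok(_) => Ok(Some(LockFile { path: path.to_path_buf() })),
            Err(e) if e.kind() == io::ErrorKind::AlreadyExists => Ok(None),
            Err(e) => Err(e),
        }
    }
}

impl Drop for LockFile {
    fn drop(&mut self) {
        // Best-effort cleanup; errors are ignored on purpose.
        let _ = fs::remove_file(&self.path);
    }
}
```

Unlike flock(), a lock file is not released automatically if the holder crashes, so a real tool would need some way to detect and clear stale locks.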

As for the choice of GC policy, I think allowing the user to choose the following would be good:

There is the question of whether GC is meant to apply to items with an expiration time. If not, then the above 3rd criterion is moot.

szabodanika commented 1 year ago

This has been implemented in v1.2.0. By default, pastas that haven't been accessed for 90 days are deleted. This can be changed with the --gc-days argument.