ptpb / pb

pb is a formerly-lightweight pastebin and url shortener

Remove pastes after a period since last read? #171

Closed tmplt closed 7 years ago

tmplt commented 7 years ago

p.iotek.org has the nice feature of:

files >100KiB get deleted 72 hours after last read
files <100KiB get deleted two weeks after last read

Could optional rules like these be implemented? I believe this is related to #62, unless that issue focuses on the deployment at ptpb.pw.
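For concreteness, the two rules quoted above could be expressed roughly as follows. This is only a sketch of the rule as described, not code from p.iotek.org or pb; the function name `retention_after_last_read` and the handling of a paste of exactly 100KiB are assumptions.

```python
from datetime import timedelta

KIB = 1024
SIZE_THRESHOLD = 100 * KIB  # the 100KiB cutoff from the quoted rules


def retention_after_last_read(size_bytes: int) -> timedelta:
    """Return how long a paste is kept after its last read."""
    # Assumption: a paste of exactly 100KiB counts as "small"; the quoted
    # rules do not say which side of the boundary it falls on.
    if size_bytes > SIZE_THRESHOLD:
        return timedelta(hours=72)
    return timedelta(weeks=2)
```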

buhman commented 7 years ago

#62 is completely unrelated.

> deleted 72 hours after last read

We have a similar feature, referred to as sunset in the documentation. Instead of a constant time after last read, it is a number of seconds after the paste is created.
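The difference between the two expiry rules can be sketched like this. It is a hypothetical illustration, not pb's actual implementation; both function names and the `ttl_seconds` parameter are made up for the example.

```python
from datetime import datetime, timedelta, timezone


def sunset_expired(created_at: datetime, sunset_seconds: int, now: datetime) -> bool:
    """Expiry measured from creation: reads never move the deadline."""
    return now >= created_at + timedelta(seconds=sunset_seconds)


def last_read_expired(last_read_at: datetime, ttl_seconds: int, now: datetime) -> bool:
    """Expiry measured from the last read: every read pushes the deadline back."""
    return now >= last_read_at + timedelta(seconds=ttl_seconds)


created = datetime(2017, 1, 1, tzinfo=timezone.utc)
now = created + timedelta(hours=2)
last_read = created + timedelta(hours=1, minutes=30)

# One hour of lifetime in both cases, checked two hours after creation.
expired_by_sunset = sunset_expired(created, 3600, now)          # True
expired_by_last_read = last_read_expired(last_read, 3600, now)  # False
```

With sunset, the deadline is fixed at upload time, so nothing about a read has to be recorded.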

> after last read

I dislike this idea a lot. Currently, pb's pastes can be cached by a proxy such as a CDN or, at present, varnish. This significantly improves GET performance for the typical use-case: a paste is uploaded, then a bunch of different people look at it all at the ~same time.

Implementing after last read would not only require that no caching is performed; every GET operation would also involve a DB write to update some last-access metadata. This would significantly reduce pb performance.

As a result, I consider the feature request as written to be an anti-feature.
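The write-per-GET cost described above can be shown with a toy model. This is a sketch under stated assumptions: `PasteStore` is a hypothetical in-memory stand-in for the database, not pb's storage layer, and it only counts writes.

```python
import time


class PasteStore:
    """Toy in-memory stand-in for a paste database that counts writes."""

    def __init__(self):
        self.rows = {}
        self.writes = 0

    def put(self, paste_id, body):
        self.rows[paste_id] = {"body": body, "last_read": None}
        self.writes += 1

    def get(self, paste_id):
        # Plain read: no write, so a cache in front can answer repeat GETs.
        return self.rows[paste_id]["body"]

    def get_tracking_last_read(self, paste_id):
        # "After last read" needs the access time persisted on every GET,
        # so each request has to reach the backend and perform a write.
        row = self.rows[paste_id]
        row["last_read"] = time.time()
        self.writes += 1
        return row["body"]


store = PasteStore()
store.put("abc", "hello")

for _ in range(1000):
    store.get("abc")
writes_plain = store.writes  # 1: only the upload wrote anything

for _ in range(1000):
    store.get_tracking_last_read("abc")
writes_tracked = store.writes - writes_plain  # 1000: one write per GET
```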

buhman commented 7 years ago

Also, the part where you actually delete something is tricky. With sunset, the paste isn't actually deleted until a GET request happens after the paste has expired. If after last read is intended as a capacity reduction technique, it would instead require some micro-service to constantly query for pastes that need purging.
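The lazy deletion described for sunset might look roughly like this. It is a minimal sketch assuming pastes live in a dict with an `expires_at` timestamp; `get_paste` and the field names are hypothetical, not pb's API.

```python
def get_paste(pastes: dict, paste_id: str, now: float):
    """Return the paste body, deleting the paste lazily if its sunset has passed."""
    paste = pastes.get(paste_id)
    if paste is None:
        return None
    if now >= paste["expires_at"]:
        # The expired paste is only removed when someone asks for it.
        del pastes[paste_id]
        return None
    return paste["body"]


pastes = {"abc": {"body": "hello", "expires_at": 100.0}}
before_expiry = get_paste(pastes, "abc", now=50.0)   # "hello"
after_expiry = get_paste(pastes, "abc", now=150.0)   # None, and "abc" is deleted
```

No background job is involved here; the trade-off is that an expired paste nobody requests again is never removed.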

It also means that the computational requirements of running pb would scale linearly with the number of pastes that exist, whereas as of #172 computational requirements are ~constant regardless of the number of pastes.
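To illustrate the linear scaling, here is a naive purge sweep of the kind such a micro-service would have to run. It is a sketch only: `purge_unread` and the `last_read` field are hypothetical, and it assumes the sweep inspects every stored paste on each pass.

```python
def purge_unread(pastes: dict, ttl_seconds: float, now: float) -> int:
    """Delete pastes not read within ttl_seconds; return how many were examined."""
    examined = 0
    for paste_id in list(pastes):  # copy the keys so deleting in the loop is safe
        examined += 1
        if now - pastes[paste_id]["last_read"] >= ttl_seconds:
            del pastes[paste_id]
    return examined


pastes = {
    "a": {"last_read": 0.0},
    "b": {"last_read": 50.0},
    "c": {"last_read": 90.0},
}
examined = purge_unread(pastes, ttl_seconds=30.0, now=100.0)
remaining = list(pastes)  # only "c" was read recently enough to survive
```

Each pass examines every stored paste, so the work per sweep grows with the total number of pastes rather than with the number of requests.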

tmplt commented 7 years ago

That makes a lot of sense; I had not considered the performance issues related to the feature. Thanks for the explanation.