superseriousbusiness / gotosocial

Fast, fun, small ActivityPub server.
https://docs.gotosocial.org
GNU Affero General Public License v3.0
3.68k stars, 311 forks

Separate storage configuration for media caching #2832

Closed JJGadgets closed 5 months ago

JJGadgets commented 5 months ago

Is your feature request related to a problem?

I would like to back up my GoToSocial instance, but I would like to avoid eating up my backup storage capacity with cached media.

Describe the solution you'd like.

Separate the storage configuration for media that belongs to the GoToSocial instance (post uploads, profile pictures, instance logo, etc.) from the configuration for cached media from other instances, which can be refetched. I'm thinking that the current storage config syntax can stay, and a new set of options for cached media can be added (a copy of the existing options, with the storage-* prefix changed to e.g. storage-cache-*), so existing users aren't affected by the change.

It could also be optional, e.g. if the user doesn't specify the separate storage-cache-* configurations, use the storage-* values by default.
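To make the optional fallback concrete, here is a rough sketch in Go of how it might be resolved. Everything in it is an assumption for illustration: `StorageConfig`, its fields, and `resolveCacheStorage` are hypothetical names, not GoToSocial's actual config types, and falling back field by field (rather than as a whole set) is just one possible reading of the proposal.

```go
package main

import "fmt"

// StorageConfig is a hypothetical, simplified stand-in for a set of
// storage-* options. It is not GoToSocial's real config struct.
type StorageConfig struct {
	Backend       string // e.g. "local" or "s3"
	LocalBasePath string
	S3Endpoint    string
	S3Bucket      string
}

// resolveCacheStorage returns the settings to use for cached remote media.
// Any field left empty in the (hypothetical) storage-cache-* set falls back
// to the corresponding storage-* value, so existing configs keep working.
func resolveCacheStorage(primary, cache StorageConfig) StorageConfig {
	resolved := primary
	if cache.Backend != "" {
		resolved.Backend = cache.Backend
	}
	if cache.LocalBasePath != "" {
		resolved.LocalBasePath = cache.LocalBasePath
	}
	if cache.S3Endpoint != "" {
		resolved.S3Endpoint = cache.S3Endpoint
	}
	if cache.S3Bucket != "" {
		resolved.S3Bucket = cache.S3Bucket
	}
	return resolved
}

func main() {
	// Example values only; the endpoint and bucket names are made up.
	primary := StorageConfig{
		Backend:    "s3",
		S3Endpoint: "s3.example.org",
		S3Bucket:   "gts-media",
	}

	// No storage-cache-* values set: cached media shares the primary storage.
	fmt.Printf("%+v\n", resolveCacheStorage(primary, StorageConfig{}))

	// Only the bucket is overridden: cached media goes to a separate bucket
	// on the same endpoint, which can then be left out of backups.
	fmt.Printf("%+v\n", resolveCacheStorage(primary, StorageConfig{S3Bucket: "gts-cache"}))
}
```

With something like this, an instance that never sets the cache options behaves exactly as it does today, while an instance that sets a different backend or bucket gets cached media stored separately.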

Describe alternatives you've considered.

Using one set of storage configurations, but with a cache subpath within the given storage path that can be excluded from backups. However, this feels less flexible, and won't allow users to specify e.g. different S3 configurations (or different backends) for instance media and cached media.

Additional context.

I run GoToSocial in Kubernetes, with Postgres and S3 to avoid using PVCs directly. However, I currently store the S3 bucket used by GtS on my local Ceph RGW.

I would like to either back up the Ceph S3 bucket, or switch GoToSocial entirely to a cloud S3 provider like Backblaze, Wasabi, or Cloudflare R2, but I don't want to incur S3 costs for cached media, nor do I want to upload other users' media cached on my instance to a cloud provider without their prior consent. If I switch my instance to use cloud S3, it means I myself accept where I'm uploading my media, but that doesn't mean others have.

And either way, even with a full PVC setup (meaning SQLite and local filesystem for media), I don't wanna waste storage space and money backing up cached media :p.

tsmethurst commented 5 months ago

Hiya! Gonna close this as it's a dupe of https://github.com/superseriousbusiness/gotosocial/issues/1776 :) feel free to follow the other issue for updates and stuff, we'll get to it at some point.