girlbossceo / conduwuit

a very cool, featureful fork of conduit
https://conduwuit.puppyirl.gay/
Apache License 2.0
302 stars 41 forks source link

Adding S3 client #93

Open kubo6472 opened 8 months ago

kubo6472 commented 8 months ago

No idea what the rewrite mentioned here was about, but I wanted to ask if there are any plans at implementing an option to easily specify S3, B2, R2... bucket to store all the media in. Ideally with configurable options to replicate both locally and remotely, or to only save the files remotely.

On an unrelated note, specifically for Backblaze B2, there seems to be a pricing waive of eggress fees when using cloudflare workers to access the data, e.g. with something like this. Maybe this will result in an easier implementation, if it just loaded URLs for the media. However I doubt it has any security and not everyone uses Cloudflare.

You can take advantage of Cloudflare's performance and the free data transfer between Backblaze B2 and Cloudflare: ref.

girlbossceo commented 8 months ago

No idea what the rewrite mentioned here was about,

As usual this never happened and I could not find any kind of progress anywhere, which was why I merged in the SHA256 media key stuff (the supposed rewrite was mentioned there too). It definitely could use improvements though.

but I wanted to ask if there are any plans at implementing an option to easily specify S3, B2, R2... bucket to store all the media in. Ideally with configurable options to replicate both locally and remotely, or to only save the files remotely.

I don't use things like S3 but I'm willing to accept an implementation of it with whatever features are necessary. Perhaps that MR with some changes/improvements.

I would be willing to accept that MR you linked if:

It would be nice to split up uploads, media, and thumbnails as well instead of the current situation where everything is thrown into one media folder, but I won't make that a blocker. It might be relevant to have separate buckets for uploads and downloads/thumbnails though.

If the author of that is still around (seems like it was last rebased 5 months ago), feel free to ping them about my fork, get them to see these concerns I listed, help get them implemented, test it by someone who uses S3, and I'll merge it.

kubo6472 commented 8 months ago

Thanks for the very structured response. I've looked at the (now archived) beeper's media repo, which is a fork of this and that has quite a big disclaimer for small servers specifically for S3 (and compatible) to use this instead.

I was hoping that it could be integrated directly with conduit as the config options, however I myself have zero knowledge of Rust and even less with how existing code's written.

That being said I really hope the original author will re-appear and possibly help contribute this.

girlbossceo commented 8 months ago

You are right it mostly appears to be just a bunch of config options. I can consider merging this into my fork, but I will only feel comfortable doing it if I add a clear and explicit warning that this (currently) has no cache and someone malicious can likely incur high-costs to you just by spamming media requests, and performance of media retrieval may be worse when using S3 due to high round trip times. And I will keep this issue open because the existing implementation requires these improvements, and likely can be improved with just supporting media redirection from MSC3860.

girlbossceo commented 8 months ago

Will merge https://gitlab.com/famedly/conduit/-/merge_requests/384 in with the warnings I want, and do a check on the code, but this issue will stay open with regards to what I said. Someone malicious can incur high costs to you by spamming requests (could be mitigated with https://github.com/girlbossceo/conduwuit/issues/98) and performance may or may not be significantly worse if roundtrip times are high with no cache.

zoujiaqing commented 5 months ago

Use minio instead of aws s3.

AndSDev commented 4 months ago

Someone malicious can incur high costs to you by spamming requests

The cache will not save you from an attacker (will help, but won't save). Matrix does not have protection against spam requests (except of rate-limits for requests). So a warning is needed anyway.

P.S. This MR was designed for use with minio. Ideally, conduit should be a stateless service.