edwardspec / mediawiki-aws-s3

Extension:AWS allows MediaWiki to use Amazon S3 (instead of the local directory) to store images.
https://www.mediawiki.org/wiki/Extension:AWS
GNU General Public License v2.0

Can't upload on Backblaze #64

Closed. octfx closed this issue 1 year ago.

octfx commented 1 year ago

I am currently seeing the following error when trying to upload files on Backblaze:

S3FileBackend: found backend with S3 buckets: [hidden]/images, [hidden]/images/thumb, [hidden]/images/deleted, [hidden]/images/temp.
S3FileBackend: found backend with S3 buckets: [hidden]/images, [hidden]/images/thumb, [hidden]/images/deleted, [hidden]/images/temp.
S3FileBackend: doPrepareInternal: S3 bucket [hidden], dir=images/temp/f/ff, params=noAccess, noListing, dir
S3FileBackend: doCreateInternal(): saving images/temp/f/ff/20230704091931!phpdXURJx.webp in S3 bucket [hidden] (sha1 of the original file: az3arzja9s80cwffggs2cwjk7layg10, Content-Type: image/webp)
S3FileBackend: exception InvalidArgument in createOrStore from PutObject (false): Error executing "PutObject" on "https://[hidden].s3.eu-central-003.backblazeb2.com/images/temp/f/ff/20230704091931%21phpdXURJx.webp"; AWS HTTP error: Client error: `PUT https://[hidden].s3.eu-cent>
<?xml version="1.0" encoding="UTF-8" standalone="yes"?>
<Error>
    <Code>InvalidArgument</Code>
    <Message>Unsupporte (truncated...)
 InvalidArgument (client): Unsupported value for canned acl 'private' - <?xml version="1.0" encoding="UTF-8" standalone="yes"?>
<Error>
    <Code>InvalidArgument</Code>
    <Message>Unsupported value for canned acl 'private'</Message>
</Error>

S3FileBackend: Performance: 0.067 second spent on: uploading images/temp/f/ff/20230704091931!phpdXURJx.webp to S3
StoreFileOp failed: {"src":"/tmp/phpdXURJx","dst":"mwstore://AmazonS3/local-temp/f/ff/20230704091931!phpdXURJx.webp","overwrite":true,"headers":[],"failedAction":"attempt"}

I've migrated existing images to B2, and everything works just fine, except uploading. Any clues? :)

octfx commented 1 year ago

Found it. Per the documentation https://www.backblaze.com/docs/cloud-storage-s3-compatible-api#access-control-lists

The S3-Compatible API supports the Put Bucket ACL call to change between "private" and "public-read" only. Attempting to put a different value returns an error.

The call succeeds only when the specified ACL matches the ACL of the bucket.

Which in turn means B2 requires the canned ACL of every PUT call to match the bucket's ACL.

edwardspec commented 1 year ago

Which in turn means B2 requires that all PUT calls have to match the bucket ACL

We can't do that. The same bucket contains both public/thumb images (which must be public) and deleted/temp images (which must be private), so they need different ACLs.
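The conflict can be sketched as a tiny zone-to-ACL mapping (illustrative Python, not the extension's actual PHP code; the zone names follow the buckets listed in the log above):

```python
# Illustrative sketch only: which canned ACL a standard MediaWiki S3 setup
# requests per zone, and how the privateWiki option flattens the mapping.
def canned_acl(zone: str, private_wiki: bool = False) -> str:
    """Return the canned ACL requested when uploading to the given zone."""
    if private_wiki:
        # $wgFileBackends['s3']['privateWiki'] = true; forces everything private
        return "private"
    # public/thumb must be world-readable; deleted/temp must stay hidden
    return "private" if zone in ("deleted", "temp") else "public-read"

# One bucket, two different canned ACLs: exactly the mix B2 rejects.
print({zone: canned_acl(zone) for zone in ("public", "thumb", "deleted", "temp")})

# With privateWiki enabled every zone uses "private", matching a private bucket.
print({zone: canned_acl(zone, True) for zone in ("public", "thumb", "deleted", "temp")})
```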

octfx commented 1 year ago

I get that, but this seems to be a limitation of B2's S3 implementation.

A tested (and working) "fix" is to always return false from isSecure() calls when the S3 endpoint is set to Backblaze.
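Such a fix might look roughly like this inside the extension's S3 backend class (a hypothetical sketch: the method body and the endpoint accessor are assumed, not the actual patch):

```php
// Hypothetical sketch of the workaround described above: when the endpoint
// is Backblaze, report every container as non-secure, so uploads never
// request the unsupported "private" canned ACL.
protected function isSecure( $container ) {
	// getEndpoint() exists on AWS SDK v3 clients; the property name is assumed.
	if ( strpos( (string)$this->client->getEndpoint(), 'backblazeb2.com' ) !== false ) {
		return false;
	}
	// ... fall through to the extension's normal per-zone logic ...
}
```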

edwardspec commented 1 year ago

You can use $wgFileBackends['s3']['privateWiki'] = true; as a workaround, which will achieve the same behavior. Needless to say, S3 objects uploaded as "private" won't be served directly, only via img_auth.php.
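In LocalSettings.php that workaround is a one-liner (sketch, alongside whatever Extension:AWS credentials and bucket settings you already have):

```php
// B2 workaround: upload everything with the "private" canned ACL so it
// matches a private bucket. Images are then served via img_auth.php
// instead of directly from S3.
$wgFileBackends['s3']['privateWiki'] = true;
```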

A better approach is to use the old-style configuration (see tests/travis/OldStyleAWSSettings.php for an example) to put deleted/temp images into a separate S3 bucket (not the same as public/thumb).
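An old-style configuration along those lines might look like this (a sketch modeled on tests/travis/OldStyleAWSSettings.php; the wiki ID and bucket names are placeholders):

```php
// Old-style sketch: map each MediaWiki zone to its own S3 bucket, so the
// public-read zones and the private zones no longer share one bucket ACL.
// "mywiki" is a placeholder for your wiki ID ($wgDBname by default).
$wgFileBackends['s3']['containerPaths'] = [
	'mywiki-local-public'  => 'example-public-bucket',
	'mywiki-local-thumb'   => 'example-thumb-bucket',
	'mywiki-local-deleted' => 'example-deleted-bucket',
	'mywiki-local-temp'    => 'example-temp-bucket',
];
```

On B2, the deleted/temp buckets can then be created as private and the public/thumb buckets as public, so each zone's canned ACL matches its bucket.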

In any case, it's an issue with the service and not with this extension, and you can work around it.

qlyoung commented 1 year ago

For anyone confused by @edwardspec's answer: if I understand correctly, the implication is that in addition to setting $wgFileBackends['s3']['privateWiki'] = true; you would also set the bucket to private. All images would then be uploaded as private, and since this matches the bucket's private ACL, the upload would succeed.

As he says, after that point all images will be served via the MediaWiki host, which means you wouldn't get the benefit of offloading traffic to your S3 provider.

(This is not the first web service where I've hit B2's annoying deviation from standard S3 behavior...)

@edwardspec - for a wiki that is completely public - world readable and writeable - are there any meaningful security implications for #65? I understand why you cannot take it, but for my personal use case, I'm thinking I'll just fork this extension and apply that patch. However, if there are security implications, I would be very grateful if you could enlighten me before I own myself :-)

edwardspec commented 1 year ago

are there any meaningful security implications for https://github.com/edwardspec/mediawiki-aws-s3/pull/65?

The reason why MediaWiki sets the "deleted" zone to private is non-technical: files in it could have been deleted for reasons like being spam, containing personal data, being indecent, copyright claims, etc.

If "deleted" zone is public, then anyone can still access them after deletion. If #65 was merged, some user who doesn't know this might have gotten in trouble because of it.

You can, of course, manually delete such files from the S3 bucket to counter this, and it's less of a problem if all editors are trustworthy.