archivematica / Issues

Issues repository for the Archivematica project
GNU Affero General Public License v3.0

Problem: S3 spaces fail with the global endpoint URL #1655

Closed replaceafill closed 4 months ago

replaceafill commented 8 months ago

Expected behaviour

S3 spaces work as documented.

Current behaviour

S3 spaces don't work if their S3 Endpoint URL field is set to the global S3 endpoint URL (https://s3.amazonaws.com) which is used as an example in the user documentation and in the Storage Service user interface.

Any operation on the space and bucket will result in an exception similar to:

botocore.exceptions.ClientError: An error occurred (PermanentRedirect) when calling the CreateMultipartUpload operation: The bucket you are attempting to access must be addressed using the specified endpoint. Please send all future requests to this endpoint.
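
For scripts that talk to S3 through botocore directly, this failure can be recognised by its error code. The following is a minimal sketch, assuming the exception carries botocore's usual `response` dictionary. The helper name `is_permanent_redirect` is hypothetical and is not part of the Storage Service.

```python
def is_permanent_redirect(exc: Exception) -> bool:
    """Return True if exc looks like a ClientError with code PermanentRedirect.

    Hypothetical helper for illustration only.
    """
    # botocore's ClientError exposes the parsed error payload as `exc.response`.
    response = getattr(exc, "response", None)
    if not isinstance(response, dict):
        return False
    error = response.get("Error") or {}
    return error.get("Code") == "PermanentRedirect"
```

A caller could use this inside an `except` block to log a hint about checking the S3 Endpoint URL and Region fields before re-raising.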

This problem was introduced in https://github.com/artefactual/archivematica-storage-service/commit/ea8d7718476ed762ad1899a6f9bdbb1f6ffbfee3 where the botocore library was upgraded from version 1.26.10 to version 1.31.35. It seems version 1.28.0 introduced a change where the endpoint URL is now derived from the service that is being used.

@mamedin found that a workaround to this problem is to use a regional endpoint (e.g. https://s3.us-west-2.amazonaws.com) in the S3 Endpoint URL field that matches the Region field where the bucket is created.
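
The workaround can be sketched in Python as follows. This is an illustration only, assuming the standard AWS partition, where regional S3 endpoints follow the `https://s3.<region>.amazonaws.com` pattern. The names `regional_s3_endpoint` and `client_kwargs` are hypothetical and do not come from the Storage Service code.

```python
GLOBAL_S3_ENDPOINT = "https://s3.amazonaws.com"


def regional_s3_endpoint(region: str) -> str:
    """Build the regional S3 endpoint URL for a region (standard AWS partition only)."""
    region = region.strip().lower()
    if not region:
        raise ValueError("region must not be empty")
    return f"https://s3.{region}.amazonaws.com"


def client_kwargs(endpoint_url: str, region: str) -> dict:
    """Return keyword arguments for boto3.client, swapping the global endpoint
    for the regional one that matches the bucket's region.

    Hypothetical helper; other endpoints (e.g. S3-compatible services) pass through unchanged.
    """
    if endpoint_url.rstrip("/") == GLOBAL_S3_ENDPOINT:
        endpoint_url = regional_s3_endpoint(region)
    return {
        "service_name": "s3",
        "endpoint_url": endpoint_url,
        "region_name": region,
    }
```

With boto3 installed, the result could be passed on as `boto3.client(**client_kwargs("https://s3.amazonaws.com", "us-west-2"))`, so that the client addresses the bucket through the regional endpoint.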

Steps to reproduce

  1. Set up an S3 space using https://s3.amazonaws.com for the S3 Endpoint URL field and set up an AIP Storage location in it.
  2. Run a transfer and send it to the S3 AIP Storage location.
  3. Processing will fail at the Store the AIP job in the Dashboard and you'll see a ClientError in your Storage Service logs.

Your environment (version of Archivematica, operating system, other relevant details)

https://github.com/artefactual/archivematica/commit/43328a1756ae1235b2e6c0610527141db272d6f9 https://github.com/artefactual/archivematica-storage-service/commit/d2907d6195c74e08b26ae8d832984e27ecb417d4


scollazo commented 6 months ago

As a workaround, an rclone space can be used.

replaceafill commented 4 months ago

The global endpoint URL https://s3.amazonaws.com works now.