bcgov / common-object-management-service

A microservice for managing access control to S3 Objects
https://bcgov.github.io/common-object-management-service/
Apache License 2.0

Requests to /sync endpoint resulting in 403 #270

Open acoard-aot opened 3 months ago

acoard-aot commented 3 months ago

Describe the bug

Hi there, first off, thanks for the software! It's entirely possible this is an error on my end, but I've gone over the documentation and worked on this issue for a while.

I am writing a script whose purpose is to call the /sync endpoint on every single bucket. I have authentication working for every endpoint I need besides /sync. Calling, for example, /bucket to list buckets works great, as does /status. However, when I call the /sync endpoint I get one of two types of results.

Result 1

Sync response for bucket 98b739a2-ca0e-4f02-b215-cc4d55e3b064:
{"type":"https://developer.mozilla.org/en-US/docs/Web/HTTP/Status/403","title":"Forbidden","status":403,"detail":{"name":"SignatureDoesNotMatch","$fault":"client","$metadata":{"httpStatusCode":403,"requestId":"8e22ee0a:19021c74657:b5c27:319a","extendedRequestId":"","attempts":1,"totalRetryDelay":0},"Code":"SignatureDoesNotMatch","RequestId":"8e22ee0a:19021c74657:b5c27:319a","message":"The request signature we calculated does not match the signature you provided. Check your Secret Access Key and signing method. For more information, see REST Authentication and SOAP Authentication for details."}}

This result type is 90% of my responses.

Result 2

This result is far more rare, and just says:

0

I'm assuming a response of 0 is following the Linux exit code convention where 0 = success.

(If useful, my full logs are below; the first two lines are my script's own logging lines.)

GOING TO SYNC 19a94bd1-bd39-4d31-9b4e-2fce4d9314a4
Sync response for bucket 19a94bd1-bd39-4d31-9b4e-2fce4d9314a4:
0

Version Number

To Reproduce

Steps to reproduce the behavior:

For simplicity I have copied my entire script here. It refers to an .env file which I will not include:

if [[ -f .env ]]; then
    export $(cat .env | grep -v '^#' | xargs)
else
    echo "Error: .env file not found"
    exit 1
fi

# Ensure CLIENT_ID and CLIENT_SECRET are set from .env
if [[ -z "$CLIENT_ID" || -z "$CLIENT_SECRET" ]]; then
    echo "Error: CLIENT_ID or CLIENT_SECRET not set in .env file"
    exit 1
fi

# Fetch the access token using client credentials grant
TOKEN=$(
    curl --location --request POST '[KEYCLOAK-URL-HERE]/auth/realms/[KEYCLOAK-REALM-HERE]/protocol/openid-connect/token' \
    -s \
    --header 'Content-Type: application/x-www-form-urlencoded' \
    --data-urlencode "grant_type=client_credentials" \
    --data-urlencode "client_id=$CLIENT_ID" \
    --data-urlencode "client_secret=$CLIENT_SECRET" | 
    jq -r .access_token
)

# List of all buckets - note each application has its own 'bucket'. This is not the same bucket that OCIO provided us; rather,
# this is a "COMS Bucket" that is created every time we create a COMS mounting.
BUCKET=$(curl -X GET "[COMS-URL-HERE]api/v1/bucket/" \
-H "Authorization: Bearer $TOKEN" )
printf '\nBUCKET: %s\n' "$BUCKET"  # printf, because bash's echo prints "\n" literally without -e
bucketIds=$(echo "$BUCKET" | jq -r '.[].bucketId')

printf '\nBucketIDs: %s\n' "$bucketIds"

for BUCKET_TO_SYNC in $bucketIds; do
    echo "GOING TO SYNC $BUCKET_TO_SYNC"
    # Make the API request to sync each bucket
    SYNC_RESPONSE=$(curl -s -X GET "[COMS-URL-HERE]api/v1/bucket/$BUCKET_TO_SYNC/sync" \
    -H "Authorization: Bearer $TOKEN")

    # Process the SYNC_RESPONSE as needed
    echo "Sync response for bucket $BUCKET_TO_SYNC:"
    echo "$SYNC_RESPONSE"
    echo  # Add a blank line for separation
done

In the .env file, there are two variables:

CLIENT_ID=c
CLIENT_SECRET=
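A side note on the .env loading, probably unrelated to the 403 itself: the `export $(cat .env | grep -v '^#' | xargs)` line word-splits its output, so a value containing spaces, quotes, or backslashes may not survive intact. A minimal sketch of an alternative, assuming the .env file holds only trusted, shell-compatible `KEY=VALUE` lines (the `load_env` name is hypothetical, not part of the original script):

```shell
#!/usr/bin/env bash
# Load KEY=VALUE pairs from a dotenv-style file and export them.
# Assumption: the file is trusted, because sourcing it executes it as shell code.
load_env() {
    local env_file="${1:-.env}"
    if [[ ! -f "$env_file" ]]; then
        echo "Error: $env_file not found" >&2
        return 1
    fi
    # A bare file name would make `source` search PATH first; force a relative path.
    [[ "$env_file" == */* ]] || env_file="./$env_file"
    set -a    # auto-export every variable assigned while sourcing
    # shellcheck disable=SC1090
    source "$env_file"
    set +a
}
```

With this in place, `load_env || exit 1` would stand in for the `if [[ -f .env ]]` block at the top of the script. Values with spaces need to be quoted inside the file.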

Expected behavior

Since my authentication works for other endpoints (like /bucket), I expect it to work for /sync.

My question: does /sync have different auth requirements from /bucket? For example, do I need to configure the Keycloak clients differently or set up any additional permissions to call /sync on a bucket?

Screenshots

N/A

Desktop (please complete the following information):

Smartphone (please complete the following information):

N/A

Additional context

norrisng-bc commented 3 months ago

Hi @acoard-aot ! Auth should be the same across all endpoints as far as I'm aware. I presume you're self-hosting COMS?

Based on the error you're seeing though, from a quick search it appears to be related to S3's authentication and not so much COMS. Best guess is that it's some weird edge case, perhaps with characters in the S3 credentials that require escaping (e.g. /, +) but it'll probably require more investigation.

norrisng-bc commented 3 months ago

Still investigating this bug (I've managed to reproduce it locally when running in basic auth mode, so almost certainly something to do with S3 rather than JWT/OIDC/Keycloak etc), but in the meantime:

> This result is far more rare, and just says:
>
> 0
>
> I'm assuming a response of 0 is following the Linux exit code convention where 0 = success.

The /sync endpoint returns the number of objects that have been added to the queue. In this particular case, nothing was added (e.g. the bucket is empty).
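Given that, the script in the original post could tell the two result types apart rather than just echoing the body. A rough sketch, assuming a successful response body is a bare non-negative integer and that the S3 failure always carries the `"SignatureDoesNotMatch"` string seen in the 403 above (the `classify_sync_response` helper is hypothetical):

```shell
#!/usr/bin/env bash
# Classify the body returned by the sync request.
# Prints one of: "queued <n>", "s3-signature-error", "unknown".
classify_sync_response() {
    local body="$1"
    if [[ "$body" =~ ^[0-9]+$ ]]; then
        # A bare integer: number of objects added to the sync queue.
        echo "queued $body"
    elif [[ "$body" == *'"SignatureDoesNotMatch"'* ]]; then
        # The underlying S3 service rejected the stored credentials.
        echo "s3-signature-error"
    else
        echo "unknown"
    fi
}
```

Inside the loop this would be called as `classify_sync_response "$SYNC_RESPONSE"`, so a `0` is reported as "queued 0" (nothing to sync) instead of being read as an exit code.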

TimCsaky commented 3 months ago

Out of the box, the /bucket/<bucket id>/sync endpoint requires either

Did you modify your own instance of COMS? What kind of token are you using? If you're using basic auth, you would just need this username/password: https://github.com/bcgov/common-object-management-service/blob/43693b6d932f9c67417d592773b2751a26b73d05/app/config/custom-environment-variables.json#L4

norrisng-bc commented 3 months ago

Looks like the only way I was able to replicate the issue was with incorrect S3 credentials.

COMS will verify that they work when you initially create the COMS bucket, but if the S3 credentials change after that it's not going to know until it tries to do something with the underlying S3 bucket (e.g. reading directly from it).

Updating the S3 credentials should fix it - you can do that with PATCH /bucket/:bucketID.
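For a script like the one in the original post, that PATCH could look roughly like the sketch below. Treat it as illustrative only: the `accessKeyId` and `secretAccessKey` field names, the `COMS_URL` variable, and whether a partial body with just these two fields is accepted are all assumptions to check against the COMS API spec. The helper names are hypothetical.

```shell
#!/usr/bin/env bash
# Escape backslashes and double quotes so a value can sit inside a JSON string.
# Enough for typical S3 keys (including / and +); control characters are not handled.
json_escape() {
    local s="$1"
    s=${s//\\/\\\\}
    s=${s//\"/\\\"}
    printf '%s' "$s"
}

# Build the request body. Field names are assumed, not confirmed by this thread.
build_credentials_body() {
    printf '{"accessKeyId":"%s","secretAccessKey":"%s"}' \
        "$(json_escape "$1")" "$(json_escape "$2")"
}

# PATCH the stored S3 credentials for one COMS bucket.
# Expects COMS_URL (hypothetical) and TOKEN to be set by the caller.
update_bucket_credentials() {
    local bucket_id="$1" access_key="$2" secret_key="$3"
    curl -s -X PATCH "${COMS_URL}/api/v1/bucket/${bucket_id}" \
        -H "Authorization: Bearer $TOKEN" \
        -H 'Content-Type: application/json' \
        --data "$(build_credentials_body "$access_key" "$secret_key")"
}
```

Building the body separately keeps characters such as `/` and `+` in the secret untouched, since neither needs escaping in JSON.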

Just out of curiosity though, what's your setup here? Like Tim said, we don't support client credentials at the moment, so I'm interested in how you implemented an automated script without requiring manual user login.

acoard-aot commented 3 months ago

Thank you for the answers you've given here, I really appreciate it. I've been incredibly busy, but will be diving into this issue in more detail today. Perhaps it is a cred issue, and I appreciate the tip. I'm a bit surprised that the creds work for some "buckets" (in COMS) and not all when I don't believe that's how it's configured (we have one shared S3 bucket with same auth for this POC), but it's entirely possible I messed something up. Thanks again.

I will make sure to report back and close this ticket with my findings in the next couple days.