Open dxu-sfx opened 2 months ago
The error is:
```
[2024-04-24 23:06:23,294] DEBUG: [Storage] Getting object demo/demo-dc1-rack1-sts-0/test/meta/schema.cql
[2024-04-24 23:06:23,295] DEBUG: Using selector: GeventSelector
--- Logging error ---
Traceback (most recent call last):
File "/home/cassandra/medusa/storage/s3_base_storage.py", line 326, in _stat_blob
resp = self.s3_client.head_object(Bucket=self.bucket_name, Key=object_key)
File "/home/cassandra/.venv/lib/python3.10/site-packages/botocore/client.py", line 535, in _api_call
return self._make_api_call(operation_name, kwargs)
File "/home/cassandra/.venv/lib/python3.10/site-packages/botocore/client.py", line 980, in _make_api_call
raise error_class(parsed_response, operation_name)
botocore.exceptions.ClientError: An error occurred (403) when calling the HeadObject operation: Forbidden
During handling of the above exception, another exception occurred:
...
Arguments: (ClientError('An error occurred (403) when calling the HeadObject operation: Forbidden'),)
[2024-04-24 23:06:23,369] ERROR: Error getting object from s3://o11y-k8ssandra-medusa/demo/demo-dc1-rack1-sts-0/test/meta/schema.cql
[2024-04-24 23:06:23,369] INFO: Starting backup using Stagger: None Mode: differential Name: test
[2024-04-24 23:06:23,369] DEBUG: Updated from existing status: -1 to new status: 0 for backup id: test
[2024-04-24 23:06:23,370] DEBUG: Process psutil.Process(pid=670, name='medusa', status='running', started='23:06:21') was set to use only idle IO and CPU resources
[2024-04-24 23:06:23,370] INFO: Saving tokenmap and schema
[2024-04-24 23:06:23,627] DEBUG: Checking placement using dc and rack...
[2024-04-24 23:06:23,627] INFO: Resolving ip address 10.124.38.28
[2024-04-24 23:06:23,628] INFO: ip address to resolve 10.124.38.28
[2024-04-24 23:06:23,630] DEBUG: Resolved 10.124.38.28 to demo-dc1-rack1-sts-0
[2024-04-24 23:06:23,630] DEBUG: Checking host 10.124.38.28 against 10.124.38.28/demo-dc1-rack1-sts-0
[2024-04-24 23:06:23,631] INFO: Resolving ip address 10.124.168.112
[2024-04-24 23:06:23,631] INFO: ip address to resolve 10.124.168.112
[2024-04-24 23:06:23,635] DEBUG: Resolved 10.124.168.112 to demo-dc1-rack3-sts-0
[2024-04-24 23:06:23,635] INFO: Resolving ip address 10.124.38.28
[2024-04-24 23:06:23,635] INFO: ip address to resolve 10.124.38.28
[2024-04-24 23:06:23,637] DEBUG: Resolved 10.124.38.28 to demo-dc1-rack1-sts-0
[2024-04-24 23:06:23,637] INFO: Resolving ip address 10.124.76.56
[2024-04-24 23:06:23,638] INFO: ip address to resolve 10.124.76.56
[2024-04-24 23:06:23,640] DEBUG: Resolved 10.124.76.56 to demo-dc1-rack2-sts-0
[2024-04-24 23:06:23,700] DEBUG: [S3 Storage] Uploading object from stream -> s3://o11y-k8ssandra-medusa/demo/demo-dc1-rack1-sts-0/test/meta/schema.cql
[2024-04-24 23:06:23,711] ERROR: An error occurred (InvalidAccessKeyId) when calling the PutObject operation: The AWS Access Key Id you provided does not exist in our records.
```
We've noticed that every time Medusa makes the connection, it uses a different access key, not the one in the config. The config is being read, but the key used for the actual connection is not the same.
We were finally able to pin down the issue: when an IAM role is in use, Medusa always uses a temporary role key ID for the connection and skips reading /etc/medusa-secrets/credentials.
Second, with that setup we see `botocore.exceptions.ClientError: An error occurred (403) when calling the HeadObject operation: Forbidden`.
However, if we use an IAM user, we have to remove the service account (SA) setup; Medusa then falls back to the default /etc/medusa-secrets/credentials and does the backups without issues.
Hello @dxu-sfx !
It has been some time since we had an issue with this, so I'm a bit rusty on this topic.
Just like the documentation says, you first need to create an IAM Policy to declare what permissions should be granted. Then you have two options - assign this policy to a role or to a user.
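For reference, a minimal policy of that shape might look like the sketch below. This is illustrative only; the authoritative action list is in the Medusa S3 setup doc linked in this thread, and the bucket name is taken from the logs above:

```json
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": [
        "s3:GetObject",
        "s3:PutObject",
        "s3:DeleteObject",
        "s3:ListBucket"
      ],
      "Resource": [
        "arn:aws:s3:::o11y-k8ssandra-medusa",
        "arn:aws:s3:::o11y-k8ssandra-medusa/*"
      ]
    }
  ]
}
```

Note that a 403 on `HeadObject` (rather than 404) is exactly what S3 returns when the caller lacks read permission on the object, which is consistent with a policy/credentials mismatch.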
The user approach seems to be what you're already doing. You create the user, attach the policy to it, generate credentials for the user, place them on the node and reference them in the config file.
The idea behind the role is that you can skip a bunch of this. What you do is configure the instance itself (or the container) to assume this role, which means the instance implicitly runs with the permissions of that role. In Medusa, the boto library we use to interact with S3 will first look for the credentials file. There are a few other authentication methods it tries after that, and if none of them works, it queries the AWS metadata API to work out the role (and temporary credentials). If it finds that a role is assumed, it authenticates with those credentials and proceeds.
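The resolution order described above can be sketched as a tiny simulation. This is a simplified illustration only, not boto's actual code; the provider order mirrors the description above, and the credentials file path is the one from this thread:

```python
# Simplified stand-in for a boto-style credential provider chain:
# each provider is tried in order, and the first one that yields
# credentials wins.
import os


def from_env():
    """Provider 1: environment variables."""
    key = os.environ.get("AWS_ACCESS_KEY_ID")
    secret = os.environ.get("AWS_SECRET_ACCESS_KEY")
    if key and secret:
        return {"method": "env", "key": key}
    return None


def from_shared_file(path="/etc/medusa-secrets/credentials"):
    """Provider 2: a shared credentials file (the one Medusa mounts)."""
    if os.path.exists(path):
        return {"method": "shared-credentials-file", "key": "<from file>"}
    return None


def from_instance_metadata():
    """Provider 3 (last resort): the instance metadata API, which hands
    out *temporary* credentials for the assumed IAM role.  Real boto
    queries http://169.254.169.254/...; stubbed out here."""
    return {"method": "iam-role", "key": "ASIA...temporary"}


def resolve_credentials(providers):
    """Return the first set of credentials any provider yields."""
    for provider in providers:
        creds = provider()
        if creds is not None:
            return creds
    raise RuntimeError("no credentials found")


chain = [from_env, from_shared_file, from_instance_metadata]
```

This also matches the symptom reported above: if the mounted credentials file is not where boto looks for it (or an earlier provider such as the pod's role wins), the chain falls through to temporary role credentials instead of the configured key.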
So, in conclusion, please check if you have the assume role thing set up, and try removing the credentials from the config (and the file system).
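Concretely, for the role-based setup the storage section of the Medusa config would look roughly like this. This is a hedged sketch, not a complete config; the field names follow Medusa's usual ini format, and the bucket name is taken from the logs above:

```ini
[storage]
storage_provider = s3
bucket_name = o11y-k8ssandra-medusa
; For role-based auth, leave key_file unset -- boto then falls through
; the provider chain to the instance/pod role via the metadata API.
; key_file = /etc/medusa-secrets/credentials
```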
Hello @dxu-sfx ! Did you manage to work this out? Is there something more we can help with?
Hello there,
I am having an issue using an IAM user & role properly, following https://github.com/thelastpickle/cassandra-medusa/blob/edb76efd6078715a6311e24e1a1fd08641e92810/docs/aws_s3_setup.md#create-an-aws-iam-role-or-aws-iam-user-for-backups
Here is the medusa container where I configured the S3 and key files; the medusa standalone has the same config as this container.
The key ID and key were created and match what I have for the S3 user.
This is my medusa YAML file.
It seems I can connect to S3, but as soon as it tries to upload a file, it throws the error above.
My own script can talk to the same S3 bucket without problems, so what could be the issue when Medusa runs the same process itself?
Can you reproduce this, or do you have any clue how I should debug this?