thelastpickle / cassandra-medusa

Apache Cassandra Backup and Restore Tool
Apache License 2.0

ability to use role (iam assume role, assume role with web identity ...) #692

Closed JBOClara closed 8 months ago

JBOClara commented 11 months ago


The pull request #442 introduces a regression that makes Medusa unable to use roles.

Unfortunately, the pull request #622 was not successful in restoring the functionality.

The error

```
MEDUSA_MODE = GRPC
sleeping for 0 sec
Starting Medusa gRPC service
restore MEDUSA_MODE = RESTORE
restore sleeping for 0 sec
restore Running Medusa in restore mode
restore BACKUP_NAME env var not set, skipping restore operation
WARNING:root:The CQL_USERNAME environment variable is deprecated and has been replaced by the MEDUSA_CQL_USERNAME variable
WARNING:root:The CQL_PASSWORD environment variable is deprecated and has been replaced by the MEDUSA_CQL_PASSWORD variable
WARNING:root:The CQL_USERNAME environment variable is deprecated and has been replaced by the MEDUSA_CQL_USERNAME variable
WARNING:root:The CQL_PASSWORD environment variable is deprecated and has been replaced by the MEDUSA_CQL_PASSWORD variable
INFO:root:Init service
[2023-11-30 20:29:10,510] INFO: Init service
INFO:root:Starting server. Listening on port 50051.
[2023-11-30 20:29:10,511] INFO: Starting server. Listening on port 50051.
DEBUG:root:Loading storage_provider: s3
[2023-11-30 21:35:16,892] DEBUG: Loading storage_provider: s3
DEBUG:root:Setting AWS credentials file to /etc/medusa-secrets/credentials
[2023-11-30 21:35:16,900] DEBUG: Setting AWS credentials file to /etc/medusa-secrets/credentials
INFO:root:Using credentials CensoredCredentials(access_key_id=A..E, secret_access_key=*****, region=us-east-1)
[2023-11-30 21:35:17,112] INFO: Using credentials CensoredCredentials(access_key_id=A..E, secret_access_key=*****, region=us-east-1)
INFO:root:Using S3 URL https://by-bucket.s3.amazonaws.com
[2023-11-30 21:35:17,113] INFO: Using S3 URL https://by-bucket.s3.amazonaws.com
DEBUG:root:Connecting to S3
[2023-11-30 21:35:17,113] DEBUG: Connecting to S3
DEBUG:root:[Storage] Listing objects in cassandra-tests/index/backup_index
[2023-11-30 21:35:17,208] DEBUG: [Storage] Listing objects in cassandra-tests/index/backup_index
WARNING:root:Having to make a new event loop unexpectedly
[2023-11-30 21:35:17,209] WARNING: Having to make a new event loop unexpectedly
DEBUG:asyncio:Using selector: EpollSelector
[2023-11-30 21:35:17,209] DEBUG: Using selector: EpollSelector
DEBUG:root:[Storage] Listing objects in cassandra-tests/index/backup_index
[2023-11-30 21:35:37,369] DEBUG: [Storage] Listing objects in cassandra-tests/index/backup_index
DEBUG:root:[Storage] Listing objects in cassandra-tests/index/backup_index
[2023-11-30 21:36:17,500] DEBUG: [Storage] Listing objects in cassandra-tests/index/backup_index
...
DEBUG:root:[Storage] Listing objects in cassandra-tests/index/backup_index
[2023-11-30 21:43:38,307] DEBUG: [Storage] Listing objects in cassandra-tests/index/backup_index
DEBUG:root:Disconnecting from S3...
[2023-11-30 21:43:38,408] DEBUG: Disconnecting from S3...
ERROR:grpc._server:Exception calling application: 'GetBackupsResponse' object has no attribute 'status'
Traceback (most recent call last):
  File "/home/cassandra/medusa/service/grpc/server.py", line 197, in GetBackups
    backups = get_backups(connected_storage, self.config, True)
  File "/home/cassandra/medusa/listing.py", line 26, in get_backups
    cluster_backups = sorted(
  File "/home/cassandra/medusa/storage/__init__.py", line 358, in list_cluster_backups
    node_backups = sorted(
  File "/home/cassandra/medusa/storage/__init__.py", line 179, in list_node_backups
    backup_index_blobs = self.list_backup_index_blobs()
  File "/home/cassandra/medusa/storage/__init__.py", line 270, in list_backup_index_blobs
    return self.storage_driver.list_objects(path)
  File "/home/cassandra/.local/lib/python3.10/site-packages/retrying.py", line 56, in wrapped_f
    return Retrying(*dargs, **dkw).call(f, *args, **kw)
  File "/home/cassandra/.local/lib/python3.10/site-packages/retrying.py", line 266, in call
    raise attempt.get()
  File "/home/cassandra/.local/lib/python3.10/site-packages/retrying.py", line 301, in get
    six.reraise(self.value[0], self.value[1], self.value[2])
  File "/home/cassandra/.local/lib/python3.10/site-packages/six.py", line 719, in reraise
    raise value
  File "/home/cassandra/.local/lib/python3.10/site-packages/retrying.py", line 251, in call
    attempt = Attempt(fn(*args, **kwargs), attempt_number, False)
  File "/home/cassandra/medusa/storage/abstract_storage.py", line 71, in list_objects
    objects = self.list_blobs(prefix=path)
  File "/home/cassandra/medusa/storage/abstract_storage.py", line 79, in list_blobs
    objects = loop.run_until_complete(self._list_blobs(prefix))
  File "/usr/lib/python3.10/asyncio/base_events.py", line 649, in run_until_complete
    return future.result()
  File "/home/cassandra/medusa/storage/s3_base_storage.py", line 206, in _list_blobs
    ).build_full_result()
  File "/home/cassandra/.local/lib/python3.10/site-packages/botocore/paginate.py", line 479, in build_full_result
    for response in self:
  File "/home/cassandra/.local/lib/python3.10/site-packages/botocore/paginate.py", line 269, in __iter__
    response = self._make_request(current_kwargs)
  File "/home/cassandra/.local/lib/python3.10/site-packages/botocore/paginate.py", line 357, in _make_request
    return self._method(**current_kwargs)
  File "/home/cassandra/.local/lib/python3.10/site-packages/botocore/client.py", line 535, in _api_call
    return self._make_api_call(operation_name, kwargs)
  File "/home/cassandra/.local/lib/python3.10/site-packages/botocore/client.py", line 980, in _make_api_call
    raise error_class(parsed_response, operation_name)
botocore.exceptions.ClientError: An error occurred (InvalidAccessKeyId) when calling the ListObjectsV2 operation: The AWS Access Key Id you provided does not exist in our records.
```

How to reproduce


The script below uses the same logic as the `_consolidate_credentials` method:

https://github.com/thelastpickle/cassandra-medusa/blob/master/medusa/storage/s3_base_storage.py#L189-L217.

```python
from os import getenv

import boto3
import botocore.session
from botocore.config import Config

region = getenv("AWS_DEFAULT_REGION")

session = botocore.session.Session()
credentials = session.get_credentials()

# Make the pool size double of what we will have going on.
# Helps urllib (used by boto) to reuse connections better and not WARN us about evicting connections.
max_pool_size = 2 * 2

boto_config = Config(
    region_name=region,
    signature_version='v4',
    tcp_keepalive=True,
    max_pool_connections=max_pool_size,
)

s3_client = boto3.client(
    's3',
    config=boto_config,
    aws_access_key_id=credentials.access_key,
    aws_secret_access_key=credentials.secret_key,
    # aws_session_token=credentials.token,  # commented out on purpose
)

bucket_name = 'by-bucket'
prefix = 'cassandra-tests'

response = s3_client.list_objects_v2(Bucket=bucket_name, Prefix=prefix)

# Print the object names
for obj in response['Contents']:
    print(obj['Key'])
```

Output

```
Traceback (most recent call last):
  File "test.py", line 39, in <module>
    response = s3_client.list_objects_v2(Bucket=bucket_name, Prefix=prefix)
  File "/usr/local/lib/python3.7/site-packages/botocore/client.py", line 553, in _api_call
    return self._make_api_call(operation_name, kwargs)
  File "/usr/local/lib/python3.7/site-packages/botocore/client.py", line 1009, in _make_api_call
    raise error_class(parsed_response, operation_name)
botocore.exceptions.ClientError: An error occurred (InvalidAccessKeyId) when calling the ListObjectsV2 operation: The AWS Access Key Id you provided does not exist in our records.
```

The same script with aws_session_token

```python
s3_client = boto3.client(
    's3',
    config=boto_config,
    aws_access_key_id=credentials.access_key,
    aws_secret_access_key=credentials.secret_key,
    aws_session_token=credentials.token,  # now passed in
)
```

Output

```
python3 test.py
cassandra-tests/
cassandra-tests/index/backup_index/cassandra2-full-2301130-1/differential_ip-10-18-0-20.ec2.internal
cassandra-tests/index/backup_ind
```

As you can see, the aws_session_token is missing, which makes Medusa unable to authenticate with any method that uses temporary tokens.

WIP https://github.com/thelastpickle/cassandra-medusa/pull/691

victorgitmain commented 11 months ago

Removing the lines

```python
aws_access_key_id=credentials.access_key,
aws_secret_access_key=credentials.secret_key,
```

seems to make boto3 work with the role. I tested it by branching and building. Maybe a pull request to make the `s3_client` creation conditional could help here:

if credentials are not given, create the client with no credentials (in this case the role will be used); otherwise use the given access_key and secret_access_key

rzvoncek commented 8 months ago

Check if this is actually fixed.

rzvoncek commented 8 months ago

We checked this with @adejanovski, and we established it's actually fixed. Closing.