CERT-Polska / karton

Distributed malware processing framework based on Python, Redis and S3.
https://karton-core.readthedocs.io/en/latest/
BSD 3-Clause "New" or "Revised" License
381 stars 45 forks source link

Fix IAM token expiration #234

Closed yankovs closed 7 months ago

yankovs commented 10 months ago

This PR solves the issue described in #233. Botocore's *Provider classes such as ContainerProvider return a RefreshableCredentials object when calling their load method. This object knows when and how to refresh the session token, so there's no need to implement custom code to do that. In current code, this object is thrown away so it gets no chance to auto-refresh the token.

Instead, use a lower level session object and if possible, attach this RefreshableCredentials object to it, and create the s3 client from it.

Tested on MWDB v2.10.2 with a karton that uses v5.3.0 of karton-core with the diff in this PR patched into it and seems to still work even a couple of hours after the karton is up (whereas the current code fails). Kept as draft until I'm sure it solves the issue

Fixes #233

yankovs commented 10 months ago

It's really annoying that my solution has to access internal members of botocore classes (hence the # type: ignores), but seems like this is the way to go right now: https://github.com/boto/botocore/pull/1742#issuecomment-1574235418

msm-code commented 10 months ago

Hi! Looks good I think.

What do you think about splitting the IAM and "regular" logic? Right now it's a bit tangled. I'm thinking of something like:

def init_snippet():
    # first part of the __init__ method
    if iam_auth:
        self.s3 = iam_auth()
        return  # or `if self.s3 is not None: return` to try fallback, like the current code does

    if access_key is None or secret_key is None:
        raise RuntimeError(
                "Attempting to get S3 client without an access_key/secret_key set"
            )
    self.s3 = boto3.Session().client(
        "s3",
        endpoint_url=endpoint,
        aws_access_key_id=access_key,
        aws_secret_access_key=secret_key,
    )

def iam_auth():
    boto_session = get_session()
    iam_providers = [
        ContainerProvider(),
        InstanceMetadataProvider(
            iam_role_fetcher=InstanceMetadataFetcher(
                timeout=1000, num_attempts=2
            )
        ),
    ]
    for provider in iam_providers:
        creds = provider.load()
        if creds:
            boto_session._credentials = creds  # type: ignore
            return boto3.Session(botocore_session=boto_session).client(
                "s3",
                endpoint_url=endpoint,
            )

(not tested ofc, but I think it's significantly easier to understand)

psrok1 commented 10 months ago

Btw, similar code is used in MWDB, so when we agree on final solution I'll be happy to apply the change to mwdb-core as well: https://github.com/CERT-Polska/mwdb-core/blob/master/mwdb/core/util.py#L164

yankovs commented 10 months ago

Hi! Looks good I think.

What do you think about splitting the IAM and "regular" logic? Right now it's a bit tangled. I'm thinking of something like:

def init_snippet():
    # first part of the __init__ method
    if iam_auth:
        self.s3 = iam_auth()
        return  # or `if self.s3 is not None: return` to try fallback, like the current code does

    if access_key is None or secret_key is None:
        raise RuntimeError(
                "Attempting to get S3 client without an access_key/secret_key set"
            )
    self.s3 = boto3.Session().client(
        "s3",
        endpoint_url=endpoint,
        aws_access_key_id=access_key,
        aws_secret_access_key=secret_key,
    )

def iam_auth():
    boto_session = get_session()
    iam_providers = [
        ContainerProvider(),
        InstanceMetadataProvider(
            iam_role_fetcher=InstanceMetadataFetcher(
                timeout=1000, num_attempts=2
            )
        ),
    ]
    for provider in iam_providers:
        creds = provider.load()
        if creds:
            boto_session._credentials = creds  # type: ignore
            return boto3.Session(botocore_session=boto_session).client(
                "s3",
                endpoint_url=endpoint,
            )

(not tested ofc, but I think it's significantly easier to understand)

Sure, if you guys think this is cleaner I have no real objection :)

I do want to note however, that during some reading in the botocore repo I noticed they have some function called create_credential_resolver and inside all of the different providers they have are created - not only ContainerProvider etc.

So instead of selecting a few providers (in our case 2), we can call create_credential_resolver and let it search for credentials in all places they support: env variables, config in ~/.boto file, etc. This will make iam_auth meaningless however. Instead, maybe require_hardcoded_keys or something like that will make more sense.

Edit: this^ will require some more thinking and work however. The implementation proposed in the PR tested to be working as is.