lithops-cloud / lithops

A multi-cloud framework for big data analytics and embarrassingly parallel jobs that provides a universal API for building parallel applications in the cloud ☁️🚀
http://lithops.cloud
Apache License 2.0
317 stars 105 forks

Rebuilding and deploying runtime after temporal credentials expire #1107

Closed abourramouss closed 1 year ago

abourramouss commented 1 year ago

There is a small inconvenience when using temporary credentials with AWS. Each time the credentials expire (after 12 hours), the runtime has to be rebuilt and redeployed, although it is already deployed in ECR.

This is because, when the runtime is deployed, some token from the credentials is used to name it, so after the credentials change, Lithops cannot find the runtime under its previous name.

JosepSampe commented 1 year ago

What do you mean by temp credentials? Which parameters do you use to configure Lithops? And how can I replicate the issue?

Which name is changing due to the temp credentials: the ECR container name or the Lambda function name?

abourramouss commented 1 year ago

Temporary credentials are short-term credentials, which can be retrieved from the AWS CLI by clicking on "Command line or programmatic access".

The config should be like this:

 lithops:
     backend: aws_lambda
     storage: aws_s3

 aws:
     access_key_id: "access_key_id"
     secret_access_key: "access_key"
     session_token: "token"

Both the ECR container name and the Lambda function name change as shown here, where each line corresponds to a new set of credentials:

[lithops_v2-9-0_y6fn/runtime_name]

With a new set of credentials (the older ones have expired), Lithops searches for runtime_name, but the 4 characters after the "lithops_v2-9-0_" prefix have changed (this is because Lithops uses the config to create the name). So, with the new set of credentials, it searches for:

[lithops_v2-9-0_4uf6/runtime_name]

You can see runtime_name is the same, but the characters before the runtime_name are different.

The same happens with the lambda function name:

[lithops_v2-9-0_y6fn__runtime_name_3008MB]

becomes:

[lithops_v2-9-0_4uf6__runtime_name_3008MB]

after the credential change.

Hope it helped a bit to clarify the issue.
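To make the mismatch concrete, here is a minimal sketch of how a name of the shape shown above could be assembled. The function `lambda_function_name` and its parameters are hypothetical, written only to reproduce the names quoted in this comment; this is not the actual Lithops implementation.

```python
def lambda_function_name(version: str, user_key: str,
                         runtime_name: str, memory_mb: int) -> str:
    """Build a name shaped like the ones quoted in this thread.

    Hypothetical helper: the format is inferred from the example names,
    not taken from the Lithops source code.
    """
    # "2.9.0" -> "v2-9-0"
    version_tag = "v" + version.replace(".", "-")
    return f"lithops_{version_tag}_{user_key}__{runtime_name}_{memory_mb}MB"


# Same runtime, two credential sets that yield different 4-character keys.
old_name = lambda_function_name("2.9.0", "y6fn", "runtime_name", 3008)
new_name = lambda_function_name("2.9.0", "4uf6", "runtime_name", 3008)

print(old_name)
print(new_name)
print("lookup would succeed:", old_name == new_name)
```

Since the 4-character key is part of the name, a lookup made with the new key cannot match the function deployed under the old one, even though the runtime itself has not changed.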

aitorarjona commented 1 year ago

Working on it.

In the meantime, you can run lithops delete --all to delete runtimes from expired sessions and build a new one for each session.

JosepSampe commented 1 year ago

I think we should consider splitting the PR #1114 into two, since, if I'm not missing something, the code necessary to fix this issue only requires these 6 lines of code:

 sts_client = self.aws_session.client('sts', region_name=self.region_name)
 caller_id = sts_client.get_caller_identity()

 if ":" in caller_id["UserId"]:  # SSO user
     self.user_key = caller_id["UserId"].split(":")[1]
 else:  # IAM user
     self.user_key = caller_id["UserId"][-4:].lower()

The PR also completely changes the way the AWS backends are configured, which needs to be handled very carefully to make sure nothing is broken before merging; that involves extensive testing and rigorous review.

WDYT @aitorarjona?

aitorarjona commented 1 year ago

Yes, @JosepSampe, your proposal should fix the problem in this issue. However, the user would still need to update their credentials every new session. But I agree. I will prepare a PR soon and test it to fix only this issue.
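
For reference, the key derivation proposed above can be sketched as a standalone function that is easy to test without AWS access. The names `user_key_from_user_id` and `current_user_key` are hypothetical, and the sample `UserId` values are made up for illustration; the only real API assumed is the boto3 STS `get_caller_identity` call, whose response contains a `UserId` field.

```python
def user_key_from_user_id(user_id: str) -> str:
    """Derive a key that stays stable across temporary credential sets.

    Mirrors the logic of the snippet quoted in this thread:
    - SSO / assumed-role identities have a UserId of the form
      "<role-id>:<session-name>", so the part after ":" is used.
    - Plain IAM users have a single ID, so its last 4 characters
      (lower-cased) are used.
    """
    if ":" in user_id:  # SSO user
        return user_id.split(":")[1]
    return user_id[-4:].lower()  # IAM user


def current_user_key(aws_session, region_name: str) -> str:
    """Look up the caller identity through STS and derive the key.

    `aws_session` is assumed to be a boto3.session.Session.
    """
    sts_client = aws_session.client("sts", region_name=region_name)
    caller_id = sts_client.get_caller_identity()
    return user_key_from_user_id(caller_id["UserId"])


# Made-up identifiers, for illustration only.
sso_key = user_key_from_user_id("AROAEXAMPLEROLEID:jane.doe")
iam_key = user_key_from_user_id("AIDAEXAMPLEUSERWXYZ")
print(sso_key, iam_key)
```

The key depends only on the caller identity rather than on the access key or session token, which is why it should not change when a new set of temporary credentials is issued for the same user.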