jschneier / django-storages

https://django-storages.readthedocs.io/
BSD 3-Clause "New" or "Revised" License

Error upload file. Use zappa+ lambda + s3 + django2 + RDS. #606

Open davidllauce opened 5 years ago

davidllauce commented 5 years ago
When I save locally, the file is stored in S3 correctly, but when it runs on Lambda I get the following error:

Request Method: POST
Request URL: https://s5h4s4t98f.execute-api.us-west-2.amazonaws.com/dev/admin/publicidad/publicidad/add/
Django Version: 2.1.1
Exception Type: ClientError
Exception Value: An error occurred (InvalidToken) when calling the PutObject operation: The provided token is malformed or otherwise invalid.
Exception Location: /var/runtime/botocore/client.py in _make_api_call, line 612
Python Executable: /var/lang/bin/python3.6
Python Version: 3.6.1
Python Path: ['/var/task', '/var/runtime/awslambda', '/var/runtime', '/var/lang/lib/python36.zip', '/var/lang/lib/python3.6', '/var/lang/lib/python3.6/lib-dynload', '/var/lang/lib/python3.6/site-packages', '/var/task/setuptools-39.1.0-py3.6.egg', '/var/task']
Server time: Mon, 24 Sep 2018 00:27:27 -0500

mongkok commented 5 years ago

Try downgrading the installed version; that works for me.

pip install django-storages==1.6.6

jschneier commented 5 years ago

How can I reproduce the issue?

dpretty commented 5 years ago

The issue appears to be the new behaviour of django-storages to automatically pull the AWS security token from environment variables.

AWS Lambda provides AWS_SESSION_TOKEN and AWS_SECURITY_TOKEN as environment variables, taken from the execution role for Lambda which may not be the same credentials required by django-storages for S3 access. https://docs.aws.amazon.com/lambda/latest/dg/current-supported-versions.html

I was able to fix the issue by subclassing S3Boto3Storage:

from storages.backends.s3boto3 import S3Boto3Storage

class SecurityTokenWorkaroundS3Boto3Storage(S3Boto3Storage):
    def _get_security_token(self):
        return None

dpretty commented 5 years ago

I suppose the solution needs to be a django setting e.g. AWS_SECURITY_TOKEN_IGNORE_ENVIRONMENT which, when set to True won't try and load the security token / session token from environment variables?

While AWS Lambda also populates the environment with AWS_ACCESS_KEY_ID and AWS_SECRET_ACCESS_KEY, django-storages allows these to be overridden in django settings. My setup generally has these loaded from the environment, but with a different variable name (e.g. AWS_APP_ACCESS_KEY_ID). However, it doesn't feel right setting AWS_SESSION_TOKEN or AWS_SECURITY_TOKEN in django settings as these will change over time, so probably we should provide a django setting which instead allows the behaviour of loading the security token from the environment to be overridden.
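
As a rough sketch of the setup described above, a settings module can read the application's own keys from differently named environment variables so that they do not collide with the ones Lambda injects. AWS_APP_ACCESS_KEY_ID is the name mentioned in this comment; AWS_APP_SECRET_ACCESS_KEY and the bucket name are assumptions made for illustration.

```python
# settings.py (fragment)
import os

# Lambda sets AWS_ACCESS_KEY_ID / AWS_SECRET_ACCESS_KEY / AWS_SESSION_TOKEN
# for its execution role, so the app's own S3 keys live under other names.
AWS_ACCESS_KEY_ID = os.environ.get("AWS_APP_ACCESS_KEY_ID")
AWS_SECRET_ACCESS_KEY = os.environ.get("AWS_APP_SECRET_ACCESS_KEY")  # assumed name
AWS_STORAGE_BUCKET_NAME = "example-bucket"  # placeholder

DEFAULT_FILE_STORAGE = "storages.backends.s3boto3.S3Boto3Storage"
```

With this layout the key pair comes from settings, while in the affected releases the session token would still be read from the environment Lambda provides, which is the mismatch discussed in this thread.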

jschneier commented 5 years ago

Thanks for the investigation. I’m trying to figure out the cleanest way for this to work for everyone. My use case is obviously different, but it’s clear that there are a lot of different ways this can be set up. As an initial step I am leaning towards partially reverting b13efd92b3bf3e9967b8e7819224bfcf9. Would you be willing to write up some docs for some ways to configure access?

A different user opened #458 which would remove all of the explicit settings which it seems would break your described use case as well.

dpretty commented 5 years ago

@jschneier certainly happy to help.

Looking at b13efd9 -- it seems that the issue had not been a problem in my case because I had explicitly set AWS_ACCESS_KEY_ID and AWS_SECRET_ACCESS_KEY in Django settings, so at https://github.com/jschneier/django-storages/commit/b13efd92b3bf3e9967b8e7819224bfcf9abb977e#diff-2be7bb15fa1145a1443e92f28ed4549bL249

if not self.access_key and not self.secret_key:
  ...
  self.security_token = self._get_security_token()

was never called because self.access_key and self.secret_key had already been set.
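
To make the difference concrete, here is a small standalone sketch of the two behaviours being compared. It is not the django-storages source: resolve_credentials and its always_load_token flag are hypothetical names, and lookup_env is a simplified stand-in for the helper of the same name in storages.utils.

```python
import os

TOKEN_NAMES = ["AWS_SESSION_TOKEN", "AWS_SECURITY_TOKEN"]


def lookup_env(names):
    """Return the first non-empty environment variable among names (stand-in)."""
    for name in names:
        value = os.environ.get(name)
        if value:
            return value
    return None


def resolve_credentials(access_key, secret_key, always_load_token):
    """Hypothetical model of how settings and environment get combined."""
    token = None
    if not access_key and not secret_key:
        # No keys in settings: take everything from the environment.
        access_key = lookup_env(["AWS_ACCESS_KEY_ID"])
        secret_key = lookup_env(["AWS_SECRET_ACCESS_KEY"])
        token = lookup_env(TOKEN_NAMES)
    elif always_load_token:
        # Keys come from settings, but the token still comes from the environment.
        token = lookup_env(TOKEN_NAMES)
    return access_key, secret_key, token


if __name__ == "__main__":
    # Simulate what Lambda injects for its execution role.
    os.environ["AWS_SESSION_TOKEN"] = "role-token"
    print(resolve_credentials("app-key", "app-secret", always_load_token=False))
    print(resolve_credentials("app-key", "app-secret", always_load_token=True))
```

In the second call the application's own key pair is combined with the execution role's token. Those three values do not belong together, which would be consistent with the InvalidToken error reported at the top of this issue.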

So, for me reverting b13efd9 would fix the issue, but it doesn't fix the underlying problem -- I think other users who aren't explicitly setting AWS_ACCESS_KEY_ID or AWS_SECRET_ACCESS_KEY in django settings would see this issue prior to b13efd9. Although, probably most people would have those values set in django settings because they're listed in docs/backends/amazon-S3.rst without being specified as optional.

Whether or not my original idea of AWS_SECURITY_TOKEN_IGNORE_ENVIRONMENT is a good approach, it seems like that solution should rather be AWS_IGNORE_ENVIRONMENT_CREDENTIALS and not load any of the env variables populated by AWS Lambda (AWS_ACCESS_KEY, AWS_ACCESS_KEY_ID, AWS_SECRET_KEY, AWS_SECRET_ACCESS_KEY, AWS_SESSION_TOKEN, AWS_SECURITY_TOKEN).

Alternatively, and probably the proper approach, is to provide permission to the S3 bucket in the AWS Lambda execution role. This might fix my issue without needing change in django-storages, and it would mean that #458 also wouldn't cause any issues. For me it would require updating quite a few production sites, which I don't mind if it's the proper approach, though I wonder how many other people are in the same boat now that Zappa is quite popular.
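
For the execution-role approach, the role needs a policy that allows the S3 operations the storage backend performs. The sketch below builds such a policy document as a Python dict and prints it as JSON. The bucket name is a placeholder and the action list is an assumption about what a typical media bucket needs, so adjust both to your own setup.

```python
import json

BUCKET = "example-bucket"  # placeholder bucket name


def build_s3_policy(bucket):
    """Return an IAM policy document granting basic object access to one bucket."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                # Listing applies to the bucket itself.
                "Effect": "Allow",
                "Action": ["s3:ListBucket"],
                "Resource": [f"arn:aws:s3:::{bucket}"],
            },
            {
                # Reads, writes and deletes apply to the objects inside it.
                "Effect": "Allow",
                "Action": ["s3:GetObject", "s3:PutObject", "s3:DeleteObject"],
                "Resource": [f"arn:aws:s3:::{bucket}/*"],
            },
        ],
    }


if __name__ == "__main__":
    print(json.dumps(build_s3_policy(BUCKET), indent=2))
```

The printed JSON can be attached to the Lambda execution role. If uploads set an ACL on each object, an additional action such as s3:PutObjectAcl is likely to be needed as well.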

I'll open a PR and add a section in docs/backends/amazon-S3.rst documenting any special configuration required for AWS Lambda, but I'll do some testing with AWS Lambda execution roles first.

jordanmkoncz commented 5 years ago

I ran into this issue too, using the same services/frameworks as @davidllauce. I had AWS_ACCESS_KEY_ID and AWS_SECRET_ACCESS_KEY defined in my Django settings.py, however these were populated based on environment variables that had different names (e.g. DJANGO_AWS_ACCESS_KEY_ID).

The solution posted by @dpretty to override _get_security_token in a subclass of S3Boto3Storage worked for me, but I feel like a more straightforward solution should be provided by django-storages.

colecrtr commented 5 years ago

I was able to resolve this by setting the four public access settings seen below to False.

[Screenshot, 2018-12-10: the bucket's four public access settings]

I'm not sure all four needed to be set to False, but I'll look into it further shortly and update here.

thesunlover commented 5 years ago

> (Quoting @dpretty's workaround above: subclass S3Boto3Storage and override _get_security_token to return None.)

This didn't work in my case (Zappa + role); the request fails every time.

thesunlover commented 5 years ago

Can anyone say if this is connected to setting the following header as named by AWS docs:

Using Temporary Security Credentials If you are signing your request using temporary security credentials (see Making Requests), you must include the corresponding security token in your request by adding the x-amz-security-token header.

When you obtain temporary security credentials using the AWS Security Token Service API, the response includes temporary security credentials and a session token. You provide the session token value in the x-amz-security-token header when you send requests to Amazon S3. For information about the AWS Security Token Service API provided by IAM, go to Action in the AWS Security Token Service API Reference Guide.

from this page: https://docs.aws.amazon.com/AmazonS3/latest/dev/RESTAuthentication.html

Also, my research showed that:

- AWS_SECRET_ACCESS_KEY can be different for each AWS Lambda instance
- AWS_ACCESS_KEY_ID is always the same
- AWS_SESSION_TOKEN = AWS_SECURITY_TOKEN, and they are always the same
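
Observations like these can be checked from inside a running function without writing secrets to the logs. The following is a minimal diagnostic sketch; credential_report and fingerprint are hypothetical helper names, and only a short hash prefix of each value is reported.

```python
import hashlib
import os

CREDENTIAL_VARS = [
    "AWS_ACCESS_KEY_ID",
    "AWS_SECRET_ACCESS_KEY",
    "AWS_SESSION_TOKEN",
    "AWS_SECURITY_TOKEN",
]


def fingerprint(value):
    """Short SHA-256 prefix, so values can be compared without being logged."""
    return hashlib.sha256(value.encode("utf-8")).hexdigest()[:8]


def credential_report(environ=None):
    """Map each credential variable to a fingerprint, or None if it is unset."""
    environ = os.environ if environ is None else environ
    return {
        name: fingerprint(environ[name]) if environ.get(name) else None
        for name in CREDENTIAL_VARS
    }


if __name__ == "__main__":
    for name, digest in credential_report().items():
        print(f"{name}: {digest or 'not set'}")
```

Two variables with the same fingerprint hold the same value, and comparing reports taken at different times or on different instances shows which values change.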

thesunlover commented 5 years ago

> (Quoting @colecrtr's comment above about setting the four public access settings to False.)

This is exactly how the settings are configured for the bucket I have problems with. The problem still appears, but only once in a while: it may happen today, tomorrow, or next month. At first I thought it was related to new versions of boto, but that wasn't the case either.

thesunlover commented 5 years ago

This could be a problem with old and new credentials of the Lambda role. If the credentials change during the instance's lifespan, the old credentials may no longer be valid.

thesunlover commented 5 years ago

How about adding a new custom class to the lib?

from boto3.session import Session
from storages.utils import lookup_env
from storages.backends.s3boto3 import S3Boto3Storage

class S3Boto3StorageForZappaAWSRole(S3Boto3Storage):
    """ 
    This is required because the AWS role's credentials are rotated
    every 1 to 12 hours. If the Lambda instance has stored an expired
    version of them, no request can succeed until that Lambda
    instance's lifetime has ended.
    """
    def _get_security_token(self):
        """
        Gets the security token to use when accessing S3. Get it from
        the environment variables.
        """
        return lookup_env(S3Boto3Storage.security_token_names)

    def _get_access_keys(self):
        """
        Gets the access keys to use when accessing S3. If none is
        provided in the settings then get them from the environment
        variables.
        """
        access_key = lookup_env(S3Boto3Storage.access_key_names)
        secret_key = lookup_env(S3Boto3Storage.secret_key_names)
        return access_key, secret_key

    @property
    def connection(self):
        connection = getattr(self._connections, 'connection', None)
        if connection is None:
            access_key, secret_key = self._get_access_keys()
            security_token = self._get_security_token()
            session = Session()
            self._connections.connection = session.resource(
                's3',
                aws_access_key_id=access_key,
                aws_secret_access_key=secret_key,
                aws_session_token=security_token,
                region_name=self.region_name,
                use_ssl=self.use_ssl,
                endpoint_url=self.endpoint_url,
                config=self.config,
                verify=self.verify,
            )
        return self._connections.connection

thesunlover commented 5 years ago

Or if my PR is accepted:

from storages.utils import lookup_env
from storages.backends.s3boto3 import S3Boto3Storage

class S3Boto3StorageForZappaAWSRole(S3Boto3Storage):
    def _get_security_token(self):
        return lookup_env(S3Boto3Storage.security_token_names)

    def _get_access_keys(self):
        access_key = lookup_env(S3Boto3Storage.access_key_names)
        secret_key = lookup_env(S3Boto3Storage.secret_key_names)
        return access_key, secret_key

codenamesubho commented 5 years ago

Hi, I am facing the same issue even though I have AWS_ACCESS_KEY_ID and AWS_SECRET_ACCESS_KEY defined in my Django settings. It works perfectly on my local machine but fails when run in Lambda. It started working after downgrading from django-storages 1.7.1 to django-storages==1.6.6.

guilhermej commented 5 years ago

I'm facing the same error using Django + Zappa + S3 storage. I can only get it to work by removing AWS_ACCESS_KEY_ID and AWS_SECRET_ACCESS_KEY from my settings.py.

pulsedemon commented 4 years ago

> (Quoting @dpretty's workaround above: subclass S3Boto3Storage and override _get_security_token to return None.)

This worked for me. My requirements.txt:

boto3==1.14.8
Django==3.0.7
djangorestframework==3.11.0
django-storages==1.9.1
Pillow==7.1.2
psycopg2-binary==2.8.5
sorl-thumbnail==12.6.3
zappa==0.51.0

Thanks @dpretty

SamiUrias commented 4 years ago

It works for me using the tip from @guilhermej. I am going to use a different settings.py file for local development and for the production environment. Maybe that could be an option.

eba-alemayehu commented 3 years ago

> (Quoting @dpretty's workaround above: subclass S3Boto3Storage and override _get_security_token to return None.)

Where do I put this code in django?

Brandonza commented 3 years ago

> Where do I put this code in django?

Did you find a solution to this?

hongdoojung commented 3 years ago

Removing AWS_ACCESS_KEY_ID and AWS_SECRET_ACCESS_KEY solved this problem in my case.

jkrishna2511 commented 3 years ago

> removing AWS_ACCESS_KEY_ID and AWS_SECRET_ACCESS_KEY solve this problem in my case.

If these attributes are removed from settings, how would django-storages access S3?
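
One possible answer, sketched as a settings fragment: when the key settings are absent, the credentials are resolved from the environment, and on Lambda those are the temporary credentials of the function's execution role. The bucket name is a placeholder, and this only works if that role has been granted access to the bucket.

```python
# settings.py (fragment) for running on Lambda with the execution role

DEFAULT_FILE_STORAGE = "storages.backends.s3boto3.S3Boto3Storage"
AWS_STORAGE_BUCKET_NAME = "example-bucket"  # placeholder

# AWS_ACCESS_KEY_ID and AWS_SECRET_ACCESS_KEY are deliberately not defined here.
# The key pair and the session token then all come from the same source
# (the environment variables Lambda sets), so they match each other.
```

For local development, where no execution role exists, the same fragment relies on whatever credentials are available in the local environment, which is one reason to keep separate settings per environment.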

Abishek05 commented 3 years ago

> (Quoting @dpretty's workaround above: subclass S3Boto3Storage and override _get_security_token to return None.)

I have added this code in custom_storages.py and it doesn't work. Where should I add this code?

pratiksinghchauhan commented 2 years ago

> (Quoting @dpretty's workaround above.)
>
> I have added this code in custom_storages.py and it doesn't work. Where should I add this code?

You need to add this to whichever file your settings file names as the storage backend.

@dpretty you saved the day!
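
As an illustration of the answer above, the settings entry has to point at the module that contains the subclass. This sketch assumes the class was saved in myproject/custom_storages.py, where "myproject" is a placeholder for your own package name.

```python
# settings.py (fragment)
# Dotted path = <package>.<module>.<class>; it must match the file in which
# SecurityTokenWorkaroundS3Boto3Storage is defined.
DEFAULT_FILE_STORAGE = (
    "myproject.custom_storages.SecurityTokenWorkaroundS3Boto3Storage"
)
```

If the subclass is defined but the setting still names storages.backends.s3boto3.S3Boto3Storage, the override is never used. On Django 4.2 and later, the STORAGES setting takes the place of DEFAULT_FILE_STORAGE.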

jschneier commented 2 years ago

Hi all. I just re-read through this thread. It seems the main thing is we don't want to automatically pull the security token because it can be wrong sometimes. However, other users of Lambda rely on pulling the security token.

Does anyone have a link to documentation about why there are competing use-cases? We can add another storage or a setting once I understand the root problem.

vinodkr494 commented 2 years ago

> (Quoting @dpretty's workaround above: subclass S3Boto3Storage and override _get_security_token to return None.)

Thanks, it works for me.

vinodkr494 commented 2 years ago

I was able to fix it by adding a custom storages.py:

from django.conf import settings
from storages.backends.s3boto3 import S3Boto3Storage

class SecurityTokenWorkaroundS3Boto3Storage(S3Boto3Storage):
    def _get_security_token(self):
        return None

class MediaStorage(SecurityTokenWorkaroundS3Boto3Storage):
    location = settings.MEDIAFILES_LOCATION

class StaticStorage(SecurityTokenWorkaroundS3Boto3Storage):
    location = settings.STATICFILES_LOCATION

chupert91 commented 1 year ago

I found this to be a policy issue within AWS. I erased my original policy, recreated it with the policy generator, then pasted. Seemed to do the trick for me. Also the setting AWS_QUERYSTRING_AUTH = False in the settings.py file may help.

michaelhenry commented 7 months ago

The above solution won't work anymore, as S3Boto3Storage has been completely refactored and moved to S3Storage. The problem with using Lambda is that AWS_ACCESS_KEY_ID, AWS_SECRET_ACCESS_KEY, and other keys are reserved. So my workaround is simply to set the credentials through the custom OPTIONS and never use those reserved keys.

For example:

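import os

# Assumption: ENV reads a variable from the environment (its definition is
# not shown in the original comment); os.environ.get is used as a stand-in.
ENV = os.environ.get

STORAGES = {
    "staticfiles": {
        "BACKEND": "storages.backends.s3.S3Storage",
        "OPTIONS": {
            "access_key": ENV("APP_S3_ACCESS_KEY"),
            "secret_key": ENV("APP_S3_SECRET_KEY"),
            "bucket_name": ENV("APP_S3_BUCKET_NAME"),
        },
    },
}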