aws-amplify / amplify-cli

The AWS Amplify CLI is a toolchain for simplifying serverless web and mobile development.
Apache License 2.0
2.8k stars 820 forks source link

Can we use the sub instead of identity id for private/protected storage? #1847

Open rawadrifai opened 4 years ago

rawadrifai commented 4 years ago

Which Category is your question related to? Amplify Storage

What AWS Services are you utilizing? S3, Cognito

Provide additional details e.g. code snippets Right now when saving a private or protected object in S3 using Amplify, we get the user identity id in the folder name, instead of identity id. When you want to access other users' objects, you need to know their identity id. In the post confirmation trigger in Cognito, we get the user sub only. We get the identity id when the user logs in.

I am currently resorting to updating the identity id in my dynamodb on log in every time. Is there a reason you are using the identity id instead of sub? I think sub would streamline it even better.

rawadrifai commented 4 years ago

On the same note, I'm actually unable to track down the identity id anywhere. I'm confused, I thought identity id is used for federated identities and sub is used for cognito identities.

ajhool commented 4 years ago

The way this was handled in the @auth directive was nice. An identifyField was added to the @auth directive that allowed for backward compatibility and made it easy to update to identityField: sub for those who wanted to.

kaustavghosh06 commented 4 years ago

@rawadrifai AFAIK, S3 doesn't support cognito user-pool level auth protection and hence we use identity-pools and corresponding IAM rules against it. Technically, you wouldn't need cognito user-pools to authenticate against S3.

ajhool commented 4 years ago

Isn't this S3 sub auth example what @rawadrifai has in mind? (https://docs.aws.amazon.com/IAM/latest/UserGuide/reference_policies_examples_s3_cognito-bucket.html)?

I've made a similar issue in the past on amplify-js: https://github.com/aws-amplify/amplify-js/issues/1787

rawadrifai commented 4 years ago

Yea I think we just have to get creative with policies, not the easiest thing. It would have been great if S3 took on auth like Dynamo does.

@ajhool can you think of a way to control dynamic group access to s3 objects? For example, a private channel that has images. Only the members of the channel may access the images. If someone leaves the channel they may not access objects anymore.

ajhool commented 4 years ago

@rawadrifai

I think that you might be conflating two similar but separate issues.

Concern 1: Use the sub field instead of the cognitoIdentityId field.

Concern 2: Mimic the @auth API directive for Storage, which includes Read/Write access control and things like groups (eg. admin).

These are separate concerns with different degrees of difficulty....

Concern 1, I 100% agree with and think that it could be implemented very quickly. Currently Amplify uses cognitoIdentityId, which I believe is incorrect -- it should use the sub field.

The API team had a similar issue, so they added an identityField to the @auth directive to maintain backwards compatibility. If you set the identityField = sub then Storage would just create paths to the sub field and grant access to users using IAM sub prefix filtering, as demonstrated here: https://docs.aws.amazon.com/IAM/latest/UserGuide/reference_policies_examples_s3_cognito-bucket.html . This is the best practice. I'm not sure what cognitoIdentityId is or why it was chosen, I think that it is just a mistake. Amplify would need to check the identityField before autogenerating the IAM roles and a client-side switch would be needed, too.

Concern 2 is more complex but still possible. To implement this, amplify would keep the same IAM :sub strategy described in Concern 1, but if more complex rules are specified, then amplify would create a lambda function that would perform the authorization logic and return a presigned url that would grant the user access, which the user could then use to retrieve the Storage file. Amplify's current exposed Storage AP wouldn't need to change much to implement this, but the underlying logic would take a little while to implement.

The logic would look something like this (note: this is very rough pseudocode and I didn't look up the actual API call names/input for the presignedUrls, but this should give you the idea of how it works). There are other complexities like expiration dates and cacheing.

// config
...
  storage: {
     auth: {
         read: [owner, groups: [''admin', 'editor', 'auditor', 'authenticaticated'],
         write:  [owner, groups: [''admin', 'editor']
     }
  }
...
// very rough client side pseudocode
// We don't need to use the lambda function if the user owns the file -- this permission is taken care of by the IAM role with the :sub prefix.
if(_userOwnsFile(s3Key)) {
  if(operation === 'PUT') putFileToS3(file, s3Key);
  if(operation === 'GET') getFileFromS3(s3Key)
} else {
  const presignedS3Url = await invokeAuthorizationLambda(s3Key, operation);
  if(operation === 'PUT') putFileToS3Url(file, preSignedS3Url);
  if(operation === 'GET') getFileFromS3Url(preSignedS3Url);
}
// invokeAuthorizationLambda server side logic:

handler = ( {key, operation }, context, callback ) => {
  const userGroups = cognitoGetGroups(context.identity.sub);

  // Amplify would autogenerate this below part, somehow:
  // begin auth parsing
  const groupsWithWritePermissions = ['admin', 'editor'];
  const groupsWithReadPermissions = [''admin', 'editor', 'auditor', 'authenticated'];
  // end auth parsing

  if(operation === 'GET') {
     if(  groupsWithReadPermissions.containsAny(userGroups) ) {
        return AWS.S3.getPresignedUrl('GET', key);
     } else {
        return '401 not authorized';
      } 
  }

  if(operation === 'PUT') {
     if(  groupsWithWritePermissions.containsAny(userGroups) ) {
        return AWS.S3.getPresignedUrl('PUT', key);
     } else {
        return '401 not authorized';
      } 
  }
}
ajhool commented 4 years ago

Perhaps a really powerful way to add @auth to Storage would be to use Amplify.API to create an AppSync StorageAPI that could be used to organize S3 files and add some higher-level file systems like folders/albums. The @auth directive in API would generate the authorization logic and fine-grained access control to the presigned tokens, which would be generated in lambda resolvers.

I'm not sure whether you can currently protect function resolvers with the @auth directive and you would need function resolvers to generate the presigned tokens, but it's worth looking into.

This would enable OnFileCreate Subscriptions and some other neat use case. It might create some duplicate source of truth issues, but it would also bring benefits.

rawadrifai commented 4 years ago

For concern 1: I agree it sounds like a quick fix. For concern 2: Great suggestion. The most common use case will be to store a reference for an S3 object in some database (DynamoDB). Then you access that ref and query your object. Most of the scenarios will require that the object obeys the same auth rules as its reference.

ajhool commented 4 years ago

(per concern 1) I looked through the code a little bit and am really confused. The IAM policy that amplify generates actually looks like it is using the sub (see below)

// myprojectname-myenv-20190514522720-authRole
{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Action": [
                "s3:PutObject",
                "s3:GetObject"
            ],
            "Resource": [
                "arn:aws:s3:::mystoragename-1557894786-myenv/private/${cognito-identity.amazonaws.com:sub}/*"
            ],
            "Effect": "Allow"
        }
    ]
}

However, the prefix used in S3 is the Cognito Federated Identity Id, not the userpool sub. I'm not sure how to use the userpool sub to retrieve the Cognito Identity Id. I'm also surprised that the cognito-identity.amazonaws.com:sub prefix expects the cognito identity id.

rawadrifai commented 4 years ago

@kaustavghosh06 are you planning on switching to sub instead of identity id for this?

RossWilliams commented 4 years ago

I’d much rather have a choice of claim to use and I’d like to avoid sub as it’s not under my control and cannot be recreated when restoring a Cognito User Pool. Anything that relies on a sub would be a large operations risk and require a lot of tooling and time to overcome.

rawadrifai commented 4 years ago

I can agree to that too.

On Sun, Aug 4, 2019 at 3:35 AM Ross Williams notifications@github.com wrote:

I’d much rather have a choice of claim to use and I’d like to avoid sub as it’s not under my control and cannot be recreated when restoring a Cognito User Pool. Anything that relies on a sub would be a large operations risk and require a lot of tooling and time to overcome.

— You are receiving this because you were mentioned. Reply to this email directly, view it on GitHub https://github.com/aws-amplify/amplify-cli/issues/1847?email_source=notifications&email_token=AF3E5QJFGM4VZXXYM33ZT33QC2WIRA5CNFSM4IEK6ZMKYY3PNVWWK3TUL52HS4DFVREXG43VMVBW63LNMVXHJKTDN5WW2ZLOORPWSZGOD3P65MA#issuecomment-517992112, or mute the thread https://github.com/notifications/unsubscribe-auth/AF3E5QP2TOWJ3TLKCQGEL5LQC2WIRANCNFSM4IEK6ZMA .

undefobj commented 4 years ago

@rawadrifai we took a few hours and read through the feedback here. I think there is a bit of a mismatch on which sub is being looked at leading to the questions above, which primarily stems from Cognito having two components - User Pools and Identity Pools - which both have a sub.

The links to the policy documentation is actually what Amplify does today which you can see HERE. This is the sub from Cognito Identity and is what S3 policy variable supports.

The sub from User Pools, which is what you are seeing in the Post Confirmation Trigger, is not supported in an IAM policy variable. I do understand why you would like to use this though as it's the friendly username and as you have a use case of getting data belonging to other users.

We are currently talking to the Cognito team on ways to join these systems up automatically on the backend (the issue is being tracked at https://github.com/aws-amplify/amplify-js/issues/54), however one possibility being discussed is to have the Amplify client automatically update a User Pool attribute with the IdentityID from Identity Pools once federation has taken place. I don't want to commit to this implementation yet though as we need to explore the security posture of such a client side implementation. However that would allow you to know the mapping from one user's friendly name to their S3 contents (such as sharing a private album). Additionally, we're working on the final pieces of the Admin Tasks RFC (https://github.com/aws-amplify/amplify-cli/issues/766) which will include Dynamic IAM roles based on Cognito Groups or Token rules, so you could combine the two to grant access to different application user groupings.

No easy answer to this problem at the moment, but we are looking at ways to address it.

undefobj commented 4 years ago

I’d much rather have a choice of claim to use

@RossWilliams can you be more specific here in what you're looking for? Do you mean that based on a specific Token Claim (e.g. equality test, contains, etc.) then you would grant access to a specific S3 bucket? If so then that will be possible when we implement Dynamic IAM roles from Cognito Groups or Token claims in RFC https://github.com/aws-amplify/amplify-cli/issues/766

RossWilliams commented 4 years ago

I only need an equality check, but I don’t want to be forced to use the “sub” token claim. Imagine I have an “AppNameUserId” claim in my token

baharev commented 4 years ago

@RossWilliams Wise words. You might be interested in an approach @lestephane sketched in his comment. Although my use case is significantly different from his, it turns out that after restructuring my application a similar approach works for me too.

24jr commented 4 years ago

Is there any update on this. I'm running into same issue. I have all the users ID's as their Cognito sub so that of course would be convenient to use but after reading it seams we need some other identityID.

The docs say "When using Auth and Storage modules together, you do not need to construct the /{user_identity_id}/ manually as the library will use the configured Cognito Identity ID for your user/device along with the configured access level for an action. This includes UnAuthenticated access where you will first call Auth.currentCredentials() before a Storage action. See Authentication for more information."

Idk if thats some update on how to work around this. Idk did all these other people just give up or what. Should I just make public.

Anyone know the answer? And if the answer is like go into these backend tools and create this, that, and the other...well I'm prob not that advanced but I suppose any suggestion would help many people

dtelaroli commented 3 years ago

+1

idanlo commented 3 years ago

Over a year and no solution? This is frustrating. As far as I know, there is literally no way of getting a user's identity id using some sort of lambda (that is called automatically, not by the user), meaning only the user itself can modify private S3 files

waltermvp commented 3 years ago

@idanlo i don’t have an answer for you but possibly a new approach.

When a new user signs up I use the post confirmation trigger to save that user to dynamodb (o make sure to save the sub + identityID). That way I can fetch the user using ID / email or anything else then have access to that identityID that I need when modifying S3 in a lambda

idanlo commented 3 years ago

@waltermvp That's what I do as well, but this means a user must be created in your application (and not using an adminCreateUser call for example) because otherwise you will not have access to the identity id for that user.

waltermvp commented 3 years ago

Correct but this way you can have what you need in a lambda (just give the lambda access to your API using the CLI)

dtelaroli commented 3 years ago

I could get in lambda the identityId from cognito using jwt token and cognito sdk. You should use @function with @aws_cognito_user_pools credentials or send jwt token in another way.

https://gist.github.com/dtelaroli/78f8a17afceed3e7d398929f0fbbf950

ghost commented 3 years ago

For those of us who need an adminCreateUser flow, I'm still in the dark as to how to reliably store a dynamodb mapping from user pool sub to identity id. This becomes even more relevant if you want to attach IOT Policy to identities when users are provisioned by an admin. Any assistance here would be appreciated.

ansuman87 commented 3 years ago

@idanlo i don’t have an answer for you but possibly a new approach.

When a new user signs up I use the post confirmation trigger to save that user to dynamodb (o make sure to save the sub + identityID). That way I can fetch the user using ID / email or anything else then have access to that identityID that I need when modifying S3 in a lambda

Hi @waltermvp @idanlo ,

I have also tried to do something similar. But my post-confirmation trigger event only contains the 'sub' field and not 'cognitoIdentityId'. For the latter, I presume, we need to access the identity pool. I tried by using AmazonCognitoIdentity from 'amazon-cognito-identity-js'. I managed to get a cognitoUser corresponding to the logged-in user. But it only contains the sessionToken and not the cognitoIdentityId. The next step was to completely establish a user session that would require the password of the user, which I can't have access to (https://developer.aliyun.com/mirror/npm/package/amazon-cognito-identity-js, Use case 4, is what I tried to implement) If you have managed to get the cognitoIdentityId, without the need of user password, can you share your solution with me. I feel like I have hit the dead-end. Thank you in advance.

/ansuman

ghost commented 3 years ago

@ansuman87 - I think the difference here is that @waltermvp is using a self-sign up flow. It sounds like you're using the admincreateuser workflow (which is what I'm also using). Until I have a more appropriate solution, I am now thinking of creating an api call to be used after login (either every login, or first login if I can work out a good way to know this in the UI). The api will call a lambda that persists the identityId. I dont like this answer - but cant think of a better approach.

ansuman87 commented 3 years ago

@ansuman87 - I think the difference here is that @waltermvp is using a self-sign up flow. It sounds like you're using the admincreateuser workflow (which is what I'm also using). Until I have a more appropriate solution, I am now thinking of creating an api call to be used after login (either every login, or first login if I can work out a good way to know this in the UI). The api will call a lambda that persists the identityId. I dont like this answer - but cant think of a better approach.

@jules-ofs I am also using a self-signup flow. I have two use cases, one is via Alexa (account linking) and the other is via Android self-hosted UI (in both cases the back-end handles the password). But I have seen examples, all Javascript based, for web platform, where the password is handled in the back-end by the developer. I still don't know if it is possible to have access to the user password in a post confirmation trigger. I suppose not. That's why I think my approach is not right and there has to be some other way. Coming back to your solution. I am not very clear what you mean by creating an API after login. Isn't that the purpose of post-confirmation and post-authentication triggers? Do you mean to call these APIs from the trigger lambdas?

ghost commented 3 years ago

@ansuman87 not quite. I'm assuming you're using federated identity with a cognito identity pool linked to a cognito user pool. The key here is that you cant access the identityId until after they login. I'm thinking the flow goes:

  1. User logs in to the web app. At this point UI has a idToken, accessToken, refreshToken, and can also request and a set of credentials with their identityId.
  2. web app calls a mapIdentity API call - which is handled by a lambda. Web passes through the idToken. the idToken is from the identity provider (in this case user pool) and contains the userid. It does not contain the identity id.
  3. mapIdentity checks to see if it has already saved a mapping from userId to identityId. If not:
  4. mapIdentity Lambda verifies the token (or the API can verify the token via a cognito authorizer)
  5. mapIdentity Lambda requests identityId by calling CognitoIdentity.getId - which it can do, because it has a valid idToken.
  6. mapIdentity Lambda saves a mapping from userId to identityId.

There is alternative and possibly simpler workflow if you're calling the API with the credentials rather than the idToken, as this post suggests you can extract both the identityId and the userId from within a lambda.

I'd appreciate any forum feedback on the security of this approach.

baharev commented 3 years ago

@ansuman87 Have you seen this workaround?

[How to find the bidirectional map between Cognito identity ID and Cognito user information?

54 ](https://github.com/aws-amplify/amplify-js/issues/54#issuecomment-434401406)

ansuman87 commented 3 years ago

@ansuman87 Have you seen this workaround?

How to find the bidirectional map between Cognito identity ID and Cognito user information? #54

@baharev Thanks for pointing it out for me. I will have to try it out first. Seems like a very roundabout way. But I think for this issue all the solutions so far are thus. On a second thought, I think the solution is for Admin-created-users flow. Mine is a self-signup flow. My exact use-case is that I want to access (and create one if already not present) the cognitoIdentittyId using a post-confirmation lambda trigger after the user signs-up. I want to save the cognitoIdentityId as the primary key in a dynamoDB table and save some of the user related information specific to the application in the table. The reason why I want this is so that I can set the row-level query condition in the IAM role for Authenticated Identity Pool, where cognitoIdentityId has to be primary key.

baharev commented 3 years ago

@ansuman87

Seems like a very roundabout way. But I think for this issue all the solutions so far are thus.

Yes, as far as I know, there is no clean solution to this issue. At least not yet.

On a second thought, I think the solution is for Admin-created-users flow. Mine is a self-signup flow.

Nope. I think that workaround works for both flows just fine. Don't get me wrong, it is a horrible workaround, but that's the best we can currently do, and it is the "official" workaround by AWS.

dorontal commented 3 years ago

@RossWilliams about

I’d much rather have a choice of claim to use and I’d like to avoid sub as it’s not under my control and cannot be recreated when restoring a Cognito User Pool. Anything that relies on a sub would be a large operations risk and require a lot of tooling and time to overcome.

I don't understand the concern about restoring a Cognito User Pool - are you saying that because your code / app is currently using identityId you'd have to do a lot of work if Amplify switched that over to sub?

[ IMO this would not be a good enough reason not to switch to sub, considering the problems identityId always causes most Amplify developers, which are bigger problems than restoring a Cognito User Pool when converting to sub ]

dorontal commented 3 years ago

Editing this comment months later because it is now the opposite...

CORRECTION: the workaround that used to work above no longer works! @JesseDavda's solution no longer works because when you edit the Auth role's trust relationship to set up "attributes for access control" and map cognitoId->sub that way, Cognito no longer lets you sign in now and barfs with "failed to federate the token" error. You can no longer do this workaround!

Back to square 0.

@undefobj or anybody at AWS, is there any progress on getting the solution that you mentioned on the Cognito end implemented?

The use case of having a user get S3 data of other users is very common. Think of a blogging app (or YouTube), for example, where the user searches for a topic and sees a list of documents, where each entry has an image for the owner and that image is on S3.

This was the original comment, the one that used to work but no longer does:

[ leaving this incorrect comment here for reference ]

This problem -- the problem of using the user's sub instead of their identityId in S3 paths and how to set that up -- is 100% solved for me thanks to the help of @JesseDavda's solution to this problem in this issue: https://github.com/aws-amplify/amplify-js/issues/54

For all of you developers who have been trying to get the identityId in lambdas so that your Amplify default paths in S3 work - this solution simply ends up ignoring identityId altogether - it is a solution that sets up the paths in S3 based on sub instead of the identityId. At the end of this solution, you will never have to deal with more than one id for your users, you will never have to deal with identityId (hopefully) ever again.

dorontal commented 2 years ago

This issue still stands and the workaround for it no longer works.

Because the workaround originally proposed by Jesse Davda, that I outline above in steps 1-4, no longer works, I opened this new issue to hopefully get the workaround to work again.

AleksandarGT commented 5 months ago

After 5 years this is still an issue. Amplify team really doesn't listen to what developers want.