aws-amplify / amplify-category-api

The AWS Amplify CLI is a toolchain for simplifying serverless web and mobile development. This plugin provides functionality for the API category, allowing for the creation and management of GraphQL and REST based backends for your amplify project.
https://docs.amplify.aws/
Apache License 2.0
89 stars 76 forks source link

Erroneous/Inconsistent value for context.source.id #495

Closed naedx closed 2 years ago

naedx commented 2 years ago

Before opening, please confirm:

How did you install the Amplify CLI?

npm

If applicable, what version of Node.js are you using?

14.18.2

Amplify CLI Version

8.3.1

What operating system are you using?

Windows 11

Did you make any manual changes to the cloud resources managed by Amplify? Please describe the changes made.

No manual changes made

Amplify Categories

auth, function, api

Amplify Commands

Not applicable

Describe the bug

I have a model Project that has (at least) two subfields tasks and siteLocations.

Whenever I execute a GetProject query and select .tasks.items.* my query returns the two entries as expected. However, the query returns null for both subfields if I also include .siteLocations.items.* in the selection set.

Upon examining the cloudwatch logs I can see that when I do not select the extra subfield the context.source.id and the resulting DDB query template are correct. However, when I include the other subfield I can see that context.source.id has an erroneous value halfway through the execution of the pipeline and the resulting DDB query template reflects this erroneous value.

I elaborate on this in Additional Information.

Expected behavior

  1. Adding a subfield to the selection set should not cause another subfield to fail.
  2. The context.source.id field (and everything else in the source) should remain consistent througout the invocation of the pipeline at that same level of nesting.

Reproduction steps


query MyQuery1_succeeds {
  getProject(id: "d90758f8-e2f2-4f58-ac39-bdec31e36e44") {
    tasks {
      items {
        id
        name
      }
    }
  }
}

query MyQuery2_fails {
  getProject(id: "d90758f8-e2f2-4f58-ac39-bdec31e36e44") {
    tasks {
      items {
        id
        name
      }
    }
    siteLocations {
      items {
        id
        name
      }
    }
  }
}

MyQuery1_succeeds returns the two values as expected for in tasks.items. MyQuery2_fails returns null for tasks.items (and for siteLocations)

GraphQL schema(s)

```graphql # Put schemas below this line type Project @model( subscriptions: null, timestamps: {createdAt: "createdAt", updatedAt: "updatedAt"}) @auth(rules: [ { allow: groups, groupsField: "id", groupClaim: "customGroupClaim", operations: [create, read, update, delete] } ]) { id: ID! tasks: [Task] @hasMany(indexName: "byProjectIdBySortOrder", fields: ["id"]) siteLocations: [SiteLocation] @hasMany(indexName: "byProjectIdBySortOrder", fields: ["id"]) } type Task @model( subscriptions: null, timestamps: {createdAt: "createdAt", updatedAt: "updatedAt"} ) @auth( rules: [ { allow: groups, groupsField: "projectId", groupClaim: "customGroupClaim", operations: [create, read, update, delete] } ]) { id: ID! @primaryKey(sortKeyFields: ["projectId"]) projectId: ID! @index(name: "byProjectIdBySortOrder", sortKeyFields: ["sortOrder"]) name: String description: String sortOrder: Int } type SiteLocation @model( subscriptions: null, timestamps: {createdAt: "createdAt", updatedAt: "updatedAt"} ) @auth(rules: [ { allow: groups, groupsField: "projectId", groupClaim: "customGroupClaim", operations: [create, read, update, delete] } ]) { id: String! @primaryKey(sortKeyFields: ["projectId"]) projectId: ID! @index(name: "byProjectIdBySortOrder", sortKeyFields: ["sortOrder"]) name: String address: String sortOrder: Int } ```

Log output

``` # Put your logs below this line ```

Additional information

I have checked the cloudwatch logs for the query and made the following observations:

I know the source of the extraneous guid "5fb82c2b-a371-4a15-a41d-81ee6a991469" seen above. It corresponds to another Project that the user has access to as indicated by a custom group claim (which I describe below). I just don't know why it is added into the source object.

Special Modifications:

I have added a custom cognito claim that returns a list of ProjectIds that the user has access to. I reference this as the customGroupClaim (as seen in the schema provided). I have therefore overriden the Query.getProject.auth.1.req.vtl to ensure this value is treated as a list (#featurerequest :) ). This results in the authFilter like this:


  "context": {
      "stash": {
           "authFilter": {
                "or": [
                    {
                        "id": {
                            "in": [
                                "e0cb39cb-3685-4dd6-b7b8-8d5fc093c64e",
                                "5fb82c2b-a371-4a15-a41d-81ee6a991469"
                            ]
                        }
                    }
                ]
            }
       }
  }

My Hypothesis:

While it seems farfetched it feels like context.source.id has been set to whatever value was provided as the id to filter on. In my case this is a list. When I modify my system to return a list of only one item the erroneous context.source.id in the logs shows a stringified list containing one string. I modified the filter to use eq: string instead of in: [string] but still end up with the stringified list containing one string.

Other: I upgraded from amplify cli 8.2.0 to 8.3.1 to see if that would solve the problem but it didn't.

ykethan commented 2 years ago

Hey @naedx, Thank you for reaching out. Could you please test the following in the AppSync console and let us know if you are observing an empty array in the result.

  1. Create and run a createproject mutation.
  2. Create and run a createSiteLocation mutation with the sortOrder, for example:

image

  1. Run a getProject query with the SiteLocation items. For example:

image

On replicating this I was able to see the following output. image

josefaidt commented 2 years ago

Hey @naedx :wave: to add on to @ykethan 's comment above, what is the motivation behind setting the groupClaim on the ID field rather another field? Or what is the use case you're looking to accomplish?

naedx commented 2 years ago

Hey @ykethan and @josefaidt. Thank you for looking into this.

@ykethan, I'm going to retry as you've suggested and let you know by next Tuesday (7/06). I'll further add more aspects from my setup if it doesn't manifest with this alone. (My true and complete setup where the problem is manifesting contains about 13 models and 50 types at the moment as well as some lambda functions.)

Based on what you have proposed, my setup should return the results as you see. But, but it would return null for .siteLocations and .tasks only after .tasks is added.

naedx commented 2 years ago

Hey @naedx 👋 to add on to @ykethan 's comment above, what is the motivation behind setting the groupClaim on the ID field rather another field? Or what is the use case you're looking to accomplish?

The use case is that I want to be able to authorize, and deauthorize users from the a Project and all its children in a multi-tenant setup.

If I were to use Owner Authorization I would have to add the owner: [String!] to the project and every child type, and object and update each one when then users are added/removed.

I have used the ID field since this uniquely identifies the Project and all the children bear this as an index. That way, no object can be accessed without the user possessing a claim on the project.

Is there a better approach to achieve this?

ykethan commented 2 years ago

Hey @naedx, I would suggest looking into custom attributes for Amazon Cognito user pools where we can set the record’s ID as a value in a custom Cognito attribute. Please refer to the following issues providing additional information and examples. https://github.com/aws-amplify/amplify-category-api/issues/28#issuecomment-1132978133 https://github.com/aws-amplify/amplify-category-api/issues/64

Please do let us know if this helps with your use case.

naedx commented 2 years ago

Hey @ykethan , thanks for following up and referencing those threads.

If I understand your suggestion correctly, I think this is already the approach I've implemenented in my schema with customGroupGlaim:

  @auth(rules: [
    { 
       allow: groups
       groupsField: "id"
       groupClaim: "customGroupClaim"
       operations: [create, read, update, delete] 
     }
  ])

..and I set the customGroupClaim using a cognito custom pre-token trigger to set customGroupClaim to a list of IDs that the user should be able to access.

At the moment, I've observed (that in some generated templates, but not others) the generated template expects a single string where in other templates it handles string and/or list of strings. For this I have had to override the generated resolver from:


#set( $idClaim = $util.defaultIfNull($ctx.identity.claims.get("customGroupClaim"), "___xamznone____") )

to:


    #set( $idClaim = $util.defaultIfNull($ctx.identity.claims.get("customGroupClaim"), []) )

    #if( $util.isString($idClaim) )
      #if( $util.isList($util.parseJson($idClaim)) )
        #set( $idClaim = $util.parseJson($idClaim) )
      #else
        #set( $idClaim = [$idClaim] )
      #end
    #end

    #set( $ownerClaimsList0 = $idClaim )

I think my use case is similar to the comment @ Issue #64:

Thanks, @josefaidt .

Point of clarification: In our use case, company_id is used for users who should have access to only a single company. company_access will be an array of IDs, for a type of user who should have access (and slightly different levels of access) to multiple companies.

Which is perhaps then foiled by the related issue referenced here:

Great callout, while this isn't affecting the immediate results, I'm curious if this is tangentially related to #103 where the claim is an array of strings.

And I'm almost certain this is related to #65

naedx commented 2 years ago

I think the issue of the erroneous $ctx.source.id is caused by what is described here in the comment on Issue #64:

Hopped onto a call with @lseemann, after digging deeper, we found that the offending code in the resolver was because there is a line of code that updates the $ctx.source.<fieldname> to $primaryRoleX.

Have created a PR aws-amplify/amplify-cli#10197.

I can see the offending code in my the default generated templates:

#if( $util.authType() == "User Pool Authorization" )
  #if( !$isAuthorized )
    #set( $primaryRole0 = $util.defaultIfNull($ctx.identity.claims.get("customGroupClaim"), "___xamznone____") )
    #set( $ownerClaimsList0 = [] )
    #if( (!$util.isNull($ctx.source.id)) && (($ctx.source.id == $primaryRole0) || $ownerClaimsList0.contains($ctx.source.id)) )
      #set( $isAuthorized = true )
      $util.qr($ctx.stash.put("authFilter", null))
    #else
      #if( !$isAuthorized && $util.isNull($ctx.stash.authFilter) )
        $util.qr($ctx.source.put("id", $primaryRole0))
        #set( $isAuthorized = true )
      #end
    #end
  #end
#end

Specifically, at $util.qr($ctx.source.put("id", $primaryRole0)).

In overriding my templates to handle the list of strings I had replaced this code but it seems I have other subfields that I hadn't overriden yet that still contained it.

Update: in testing further with other subfields I can see that whenever my query fails due to $context.source.id having the incorrect value as described here, commenting out or removing the line $util.qr($ctx.source.put("id", $primaryRole0)) resolves the issue.

ykethan commented 2 years ago

Hey @naedx, Thank you for the information. On reaching out to the team and discussing your use case, the team suggests separating the groups as a separate field as combining Multi tenant and groups on a ID may cause unintended behaviour in your application.

For example:

type Todo @model @auth(rules: [{ allow: groups, groupsField: "groups" }]){
  id: ID! @primaryKey
  name: String! @index(name: "byName", queryField: "todoByName")
  description: String
  groups: [String]
}

here we separate the groups and the ID field allowing us to logically separate the items that can be retrieved byname and if present in a specific group

naedx commented 2 years ago

Hey @ykethan, thanks to you and the team for the reply! Indeed, changing from using the key field to another field generates the authZ rules that I need.

That addresses my use case sufficiently. I only now need to set this tenantId on all existing records and all new going forward.

Changing my schema to follow this model: ```js type Todo @model @auth(rules: [{ allow: groups, groupsField: "tenantId", groupClaim: "customGroupClaim" }]) { id: ID! @primaryKey tenantId: String! name: String! @index(name: "byName", queryField: "todoByName") description: String } ``` Generates the following: ```javascript #if( $util.authType() == "User Pool Authorization" ) #if( !$isAuthorized ) #set( $authFilter = [] ) #set( $role0 = $util.defaultIfNull($ctx.identity.claims.get("customGroupClaim"), []) ) #if( $util.isString($role0) ) #if( $util.isList($util.parseJson($role0)) ) #set( $role0 = $util.parseJson($role0) ) #else #set( $role0 = [$role0] ) #end #end #if( !$role0.isEmpty() ) $util.qr($authFilter.add({ "tenantId": { "in": $role0 } })) #end #if( !$authFilter.isEmpty() ) $util.qr($ctx.stash.put("authFilter", { "or": $authFilter })) #end #end #end ```
ykethan commented 2 years ago

Hey @naedx, Glad to hear that you are back up and running. Please do let us know if we can go on ahead and close the issue.

naedx commented 2 years ago

Thanks for the follow up, @ykethan . The original issue no longer manifests if I set auth to use a non-key field. You may close this on that basis.

(Overwriting the source.id based on the selection set still seems like an odd behavior though.)

ykethan commented 2 years ago

Thank you for the information, I will be closing the closing the issue. please feel free in reaching out to us again.