aws-amplify / amplify-category-api

The AWS Amplify CLI is a toolchain for simplifying serverless web and mobile development. This plugin provides functionality for the API category, allowing for the creation and management of GraphQL and REST based backends for your amplify project.
https://docs.amplify.aws/
Apache License 2.0
89 stars 79 forks source link

Add operationName to generateGraphQLDocument for use in custom authorization #2805

Open bradyrichmond opened 3 months ago

bradyrichmond commented 3 months ago

Environment information

System:
  OS: Linux 6.8 Ubuntu 22.04.4 LTS 22.04.4 LTS (Jammy Jellyfish)
  CPU: (24) x64 AMD Ryzen 9 3900X 12-Core Processor
  Memory: 10.75 GB / 31.25 GB
  Shell: /bin/bash
Binaries:
  Node: 20.15.0 - /usr/local/bin/node
  Yarn: 1.22.19 - /usr/local/bin/yarn
  npm: 10.7.0 - /usr/local/bin/npm
  pnpm: undefined - undefined
NPM Packages:
  @aws-amplify/auth-construct: 1.2.2
  @aws-amplify/backend: 1.1.1
  @aws-amplify/backend-auth: 1.1.2
  @aws-amplify/backend-cli: 1.2.3
  @aws-amplify/backend-data: 1.1.1
  @aws-amplify/backend-deployer: 1.0.5
  @aws-amplify/backend-function: 1.3.2
  @aws-amplify/backend-output-schemas: 1.2.0
  @aws-amplify/backend-output-storage: 1.1.1
  @aws-amplify/backend-secret: 1.0.1
  @aws-amplify/backend-storage: 1.1.1
  @aws-amplify/cli-core: 1.1.2
  @aws-amplify/client-config: 1.2.0
  @aws-amplify/deployed-backend-client: 1.4.0
  @aws-amplify/form-generator: 1.0.1
  @aws-amplify/model-generator: 1.0.4
  @aws-amplify/platform-core: 1.0.6
  @aws-amplify/plugin-types: 1.2.1
  @aws-amplify/sandbox: 1.1.3
  @aws-amplify/schema-generator: 1.2.1
  aws-amplify: 6.5.1
  aws-cdk: 2.152.0
  aws-cdk-lib: 2.152.0
  typescript: 5.5.4
AWS environment variables:
  AWS_STS_REGIONAL_ENDPOINTS = regional
  AWS_NODEJS_CONNECTION_REUSE_ENABLED = 1
  AWS_SDK_LOAD_CONFIG = 1
No CDK environment variables

Description

Right now query strings are generated without an operation name. e.g "query ($id: ID!) {\n getUserProfile(id: $id) {\n id\n email\n firstName\n lastName\n createdAt\n orgId\n updatedAt\n }\n}\n". When using custom resolvers, the requestContext is supposed to include an operationName property. Right now it ends up undefined. I would like to be able to use the operationName property in my custom authorizer.

I dug around for a while and found code in @aws-amplify/data-schema that I believe is responsible for generating the queryStrings that get used my V6Client.

...
const graphQLDocument = `${graphQLOperationType}${
    graphQLArguments
      ? `(${Object.entries(graphQLArguments).map(
          ([fieldName, type]) => `$${fieldName}: ${type}`,
        )})`
      : ''
  } { ${graphQLFieldName}${
    graphQLArguments
      ? `(${Object.keys(graphQLArguments).map(
          (fieldName) => `${fieldName}: $${fieldName}`,
        )})`
      : ''
  } { ${graphQLSelectionSet} } }`;

  return graphQLDocument;

The operation name already exists in the function, although in a slightly different case. I believe that the only change that would need to happen is to create a pascal case of graphQLFieldName and update the graphQLDocument string literal like this:

const graphQLDocument = `${graphQLOperationType} ${graphQLFieldNamePascalCase}${
    graphQLArguments
      ? `(${Object.entries(graphQLArguments).map(
          ([fieldName, type]) => `$${fieldName}: ${type}`,
        )})`
      : ''
  } { ${graphQLFieldName}${
    graphQLArguments
      ? `(${Object.keys(graphQLArguments).map(
          (fieldName) => `${fieldName}: $${fieldName}`,
        )})`
      : ''
  } { ${graphQLSelectionSet} } }`;

  return graphQLDocument;

Right now what I'm doing is just searching the queryString from requestContext for operation names I'm expecting to handle. However, this runs into collisions quite easily. For example, something as simple as a UserProfile with an orgId, and a secondary index of orgId. This will generate listUserProfile, and listUserProfileByOrgId. If I search the queryString for listUserProfile, it would be true for the listUserProfileByOrgId, as well as listUserProfile. I may want to handle those two things differently.

Is there a technical reason that I don't know about that makes this a bigger issue than I think it is?

ykethan commented 3 months ago

Hey,👋 thanks for raising this! I'm going to transfer this over to our API repository for better assistance 🙂

chrisbonifacio commented 3 months ago

Hi @bradyrichmond 👋 thanks for raising this issue. If I understand correctly, you want to be able to use an operation name to allow/deny users to perform an operation in a custom authorizer lambda.

I imagine you're referring to using the deniedFields like so:

deniedFields: [
      `arn:aws:appsync:${process.env.AWS_REGION}:${accountId}:apis/${apiId}/types/Event/fields/comments`,
      `Mutation.createEvent`
    ],

However, the operation name is set in client clode and can include any qraphql queries within it. If you intend to block a user from accessing a query by operation name, wouldn't the user be able to access the same query under a different operation name? The custom authorizer example we have uses the name of the query/mutation to determine access instead. Would that fulfill your use case?

If I'm misunderstanding, please provide a more complete example to explain your use case.

bradyrichmond commented 3 months ago

Thanks for taking a look at this, @chrisbonifacio!

I realize that Amplify doesn’t explicitly support a Multi Tenant SaaS configuration, but that’s what I’m trying to set up. Since the authorization function on the schema uses or for the allow array, I need to use a custom authorizer to make sure that users only perform operations for their organization and that they have been provided permission to do an action within their organization.

Right now, I am just using Roles for this, but have a path forward on more granular permissions control.

So what I want my flow to be, just focusing on the organization + role authentication this:

  1. User wants to create a new user for their organization.

  2. They fill out the new user form, with first name, last name, and email address, and submit the form to my mutation createUser.

  3. Before the mutation is executed, the schema checks my customAuthorizer function.

  4. In the customAuthorizer function, I verify the token, and get the custom user attribute custom:orgId. I use this orgId to make sure that the item, UserProfile in this case, will be created for the correct organization, and not allow any user to create users in other orgs.

  5. The second half of this is the user roles. I want to only allow certain roles to do certain actions, or retrieve certain data. So, in this case, I only want ADMINS and OWNER roles to be able to create users, and to make sure that ADMINS cannot add users as OWNER role.

  6. so, in order to handle these individual actions based on the action, I need a reliable way to detect the action that they are trying to perform. Then eventually build out the deny fields.

The docs reference operationName in the requestContext, but the gql queries generated by generateClient, when executed, do not include an operationName in the requestContext.

Am I missing something for how to authorize individual users for individual actions?

bradyrichmond commented 2 months ago

Just wanted to bump this. Let me know if you need something else from me. :)

chrisbonifacio commented 2 months ago

Hi @bradyrichmond I've marked this as a feature request for the team to consider. I'm still not convinced that this is the appropriate way to determine authorization in a multi-tenant context because operation names are controlled by the client but I'll discuss with the team and provide feedback.

As an example of what I mean:

# Intended query with restricted access
query GetSensitiveUserData {
  getSensitiveUserInfo(userId: "123") {
    id
    socialSecurityNumber
    bankAccountDetails
  }
}

Your custom auth logic would need to prevent queries with arbitrary operation names from clients:

# Malicious query with a misleading operation name
query GetPublicUserProfile {
  getSensitiveUserInfo(userId: "123") {
    id
    socialSecurityNumber
    bankAccountDetails
  }
}
# Another malicious query without an operation name
{
  getSensitiveUserInfo(userId: "123") {
    id
    socialSecurityNumber
    bankAccountDetails
  }
}
bradyrichmond commented 2 months ago

Okay. That does make sense. Query string could be altered on the client side, making it pass my auth checks.

Can deniedFields be used securely for this kind of authorization? There's the reference to Mutation.createEvent in the docs. Would setting something like that prevent access to the createEvent mutation entirely?

So, user makes a request to use createEvent. They don't actually have permission to use this, but the request is coming through anyway. Schema is set to custom authorization, so it goes to my custom authorizer lambda. Token is verified, so isAuthorized: true. Lambda function verifies the token, checks the user group/permissions and builds the deniedFields array based on that. Lambda function returns the object

{
  isAuthorized: true,
  deniedFields: ['Mutation.createEvent']
}

The api gets this auth object and says "This user is authorized, but not to use createEvent" and throws an UnauthorizedException?

chrisbonifacio commented 2 months ago

You can use deniedFields to nullify fields in the response that the user shouldn't have access to.

For instance, maybe a user has access to Query.listUserProfiles but not UserProfile.ssn. You would add UserProfile.ssn to the deniedFields list. It should return null in the response even if a value is returned from the resolver.

I would recommend initializing isAuthorized as false to deny requests by default unless all conditions are met. For multi-tenancy, you can store orgId as a custom attribute on a Cognito user or as a claim on the access token. You can then use it to determine whether or not they can create a record that "belongs" to a specific tenant/org.

example:

exports.handler = async (event) => {
  const { authorizationToken, requestContext } = event;

  try {
    // Verify and decode the JWT (implementation details omitted for brevity)
    const decodedToken = verifyAndDecodeJWT(authorizationToken);
    const { sub: userId, 'custom:orgId': orgId, 'custom:role': role } = decodedToken;

    // Initialize deniedFields array
    let deniedFields = [];

    // Example: Restrict access to sensitive fields for non-admin users
    if (role !== 'ADMIN') {
      deniedFields.push('UserProfile.ssn');
      deniedFields.push('UserProfile.bankAccountDetails');
    }

    // Example: Restrict access to other organizations' data
    if (requestContext?.arguments?.orgId !== orgId) {
      deniedFields.push('Organization');
    }

    return {
      isAuthorized: true,
      resolverContext: {
        userId,
        orgId,
        role,
      },
      deniedFields: deniedFields,
    };
  } catch (error) {
    console.error('Authorization failed:', error);
    return { isAuthorized: false };
  }
};

This is just an example but you can send also extra context to the resolver through the resolverContext to further narrow the data as needed in the resolver.

In the resolver you can perform checks against a user's identity (or "tenant") like so:

import { util, Context } from '@aws-appsync/utils';
import { get } from '@aws-appsync/utils/dynamodb';

export function request(ctx) {
    return get({ key: { ctx.args.id } });
}

export function response(ctx) {
    if (ctx.result.orgId === ctx.orgId) {
        return ctx.result;
    }
    return null;
}
bradyrichmond commented 2 months ago

Okay. Gave that a try. I am able to deny access to Queries and Mutations, but not to individual fields. For example, I have this InventoryItem in my schema:

InventoryItem: a
            .model({
                costPerUnit: a.float().required(),
                leadTime: a.integer().required(),
                markup: a.float().required(),
                partName: a.string().required(),
                organization: a.belongsTo('Organization', 'orgId'),
                orgId: a.id().required(),
                quantity: a.integer(),
                sources: a.hasMany('InventoryItemSource', 'inventoryItemId'),
                unitType: a.string().required(),
            })
            .secondaryIndexes((index) => [index('orgId')]),

I tested denying access to partName by creating a deniedFields array like this:

const response = {
            isAuthorized: true,
            resolverContext: {
                orgId,
            },
            ttlOverride: 10,
            deniedFields: ['InventoryItem.partName'],
        }

I get this message back when I attempt to run listInventoryItemByOrgId:

Not Authorized to access partName on type InventoryItem and items is null.

Screenshot from 2024-09-06 16-12-32

chrisbonifacio commented 2 months ago

Hi @bradyrichmond 👋 can you try making a request with a modified selection set so that it's omitting the partName field and see if that request gets authorized? I think the issue might be that the field can't be requested.

bradyrichmond commented 2 months ago

That does get it authorized, but seems to conflict with the documentation IMO. My thought was that it could take the exact same query from different users, and it would still be authorized, but the value would be wiped out from the response. Is this a bug? Can you verify the expected behavior?