Issues with GraphQL list resolvers

cabcookie commented 7 months ago

Environment information

System:
  OS: macOS 14.3.1
  CPU: (10) arm64 Apple M1 Pro
  Memory: 145.97 MB / 32.00 GB
  Shell: /bin/zsh
Binaries:
  Node: 18.19.0 - ~/.nvm/versions/node/v18.19.0/bin/node
  Yarn: undefined - undefined
  npm: 10.2.3 - ~/.nvm/versions/node/v18.19.0/bin/npm
  pnpm: undefined - undefined
NPM Packages:
  @aws-amplify/backend: 0.13.0-beta.3
  @aws-amplify/backend-cli: 0.12.0-beta.3
  aws-amplify: 6.0.27
  aws-cdk: 2.136.0
  aws-cdk-lib: 2.136.0
  typescript: 5.4.4
AWS environment variables:
  AWS_STS_REGIONAL_ENDPOINTS = regional
  AWS_NODEJS_CONNECTION_REUSE_ENABLED = 1
  AWS_SDK_LOAD_CONFIG = 1
No CDK environment variables

Description

I have an issue with my list queries. Some items are just not showing up when I query for them. I use the AppSync query editor, and with these two queries I expect the same result (one returning an array, the other a single record):

query getMeeting {
  getMeeting(id: "f5852d51-d393-463d-bf11-6653528edc79") {
    id
    context
    meetingOn
    topic
  }
}

query oneMeeting {
  listMeetings(filter: {id: {eq: "f5852d51-d393-463d-bf11-6653528edc79"}}) {
    items {
      id
      context
      meetingOn
      topic
      createdAt
    }
  }
}

However, listMeetings returns an empty list. I have this issue with many of my list queries: the expected result is not produced. Interestingly, I can't reproduce the issue in my sandbox environment or in my feature branch environment, only in my main branch. Is there a way to force the re-creation of the resolvers? My guess is that the issue lies there.

sundersc commented 7 months ago

@cabcookie - To force resolver regeneration, you can add a dummy model or add an authorization rule to an existing model. Could you provide the definition for Meeting model?

cabcookie commented 7 months ago

Sure, I just deployed a change to the schema. This did not fix the issue of records not appearing. This issue happens across models so I will point out my DayPlan and DayProjectTask models, as they are a bit simpler. I am pointing out an issue with my DayProjectTask model.

My DayPlan with { id: "0a2538fb-4e95-4fc2-9f90-bdc579f80b87" } has three DayProjectTasks, but only one is shown. Here are the records (from DynamoDB):

This one is shown (meaning the statement client.models.DayProjectTask.list({filter: { dayPlanProjectTasksId: { eq: dayPlanId }}}) returns this record):

{
  "id": {
    "S": "5d5e91ab-b0e5-46c8-b947-c2add0bfa92c"
  },
  "createdAt": {
    "S": "2024-04-11T14:14:04.099Z"
  },
  "dayPlanProjectTasksId": {
    "S": "0a2538fb-4e95-4fc2-9f90-bdc579f80b87"
  },
  "done": {
    "BOOL": false
  },
  "owner": {
    "S": "1dadd52b-933b-47b1-9bce-6d4d01483e56::1dadd52b-933b-47b1-9bce-6d4d01483e56"
  },
  "projectsDayTasksId": {
    "S": "87ddb9c0-4318-4bce-99bd-07d20f0336c5"
  },
  "task": {
    "S": "..."
  },
  "updatedAt": {
    "S": "2024-04-11T14:14:04.099Z"
  },
  "__typename": {
    "S": "DayProjectTask"
  }
}

And, this one is not shown (meaning the statement client.models.DayProjectTask.list({filter: { dayPlanProjectTasksId: { eq: dayPlanId }}}) does not return this record):

{
  "id": {
    "S": "256088b6-b3c9-4c17-ada7-8815ea89cdc1"
  },
  "createdAt": {
    "S": "2024-04-11T14:14:33.159Z"
  },
  "dayPlanProjectTasksId": {
    "S": "0a2538fb-4e95-4fc2-9f90-bdc579f80b87"
  },
  "done": {
    "BOOL": false
  },
  "owner": {
    "S": "1dadd52b-933b-47b1-9bce-6d4d01483e56::1dadd52b-933b-47b1-9bce-6d4d01483e56"
  },
  "projectsDayTasksId": {
    "S": "f8364cac-632d-42f4-9763-9a0c4ebf6731"
  },
  "task": {
    "S": "..."
  },
  "updatedAt": {
    "S": "2024-04-11T14:14:33.159Z"
  },
  "__typename": {
    "S": "DayProjectTask"
  }
}

This is my simplified schema:

import { type ClientSchema, a, defineData } from "@aws-amplify/backend";

const schema = a.schema({
  Context: a.enum(["family", "hobby", "work"]),
  // ...
  DayPlan: a
    .model({
      owner: a.string().authorization([a.allow.owner().to(["read", "delete"])]),
      day: a.date().required(),
      dayGoal: a.string().required(),
      context: a.ref("Context"),
      done: a.boolean(),
      // ...
      projectTasks: a.hasMany("DayProjectTask"),
      // ...
    })
    .authorization([a.allow.owner()]),
  DayProjectTask: a
    .model({
      owner: a.string().authorization([a.allow.owner().to(["read", "delete"])]),
      task: a.string().required(),
      done: a.boolean(),
      dayPlan: a.belongsTo("DayPlan"),
      projects: a.belongsTo("Projects"),
    })
    .authorization([a.allow.owner()]),
  // ...
});

export type Schema = ClientSchema<typeof schema>;

export const data = defineData({
  schema,
  authorizationModes: {
    defaultAuthorizationMode: "userPool",
  },
});

Now, I fetch my DayPlans like this:

const fetchDayPlans = (context?: Context) => async () => {
  if (!context) return;
  const { data, errors } = await client.models.DayPlan.list({
    filter: { done: { ne: "true" }, context: { eq: context } },
  });
  console.log("fetchDayPlans", { data, errors });
  if (errors) throw errors;
  return data.map(mapDayPlan).sort((a, b) => sortByDate(true)([a.day, b.day]));
};

And, this is how I fetch my DayProjectTasks:

const fetchProjectTasks = (dayPlanId: string) => async () => {
  const { data, errors } = await client.models.DayProjectTask.list({
    filter: { dayPlanProjectTasksId: { eq: dayPlanId } },
  });
  console.log("fetchProjectTasks", { dayPlanId, data, errors });
  if (errors) throw errors;
  return data.map(mapProjectTask);
};

cabcookie commented 7 months ago

Interestingly, if I fetch DayPlan and DayProjectTask with a selectionSet I receive all items.

So, I changed my fetchDayPlans to this:

const fetchDayPlans = (context?: Context) => async () => {
  if (!context) return;
  const { data, errors } = await client.models.DayPlan.list({
    filter: { done: { ne: "true" }, context: { eq: context } },
    selectionSet: dayplanSelectionSet,
  });
  if (errors) throw errors;
  return data.map(mapDayPlan).sort((a, b) => sortByDate(true)([a.day, b.day]));
};

with this selectionSet:

const dayplanSelectionSet = [
  "id",
  "day",
  "dayGoal",
  "context",
  "done",
  "projectTasks.id",
  "projectTasks.task",
  "projectTasks.done",
  "projectTasks.createdAt",
  "projectTasks.projects.id",
  "projectTasks.projects.project",
  "projectTasks.projects.accounts.id",
] as const;

I receive all items that I am expecting.

AnilMaktala commented 7 months ago

Hi @cabcookie, Thank you for the additional information. As we investigate the issue internally, I noticed you are using an older beta version of @aws-amplify/backend. I recommend updating to the latest beta release (currently 0.13.0-beta.16) to see if that resolves the problem.

cabcookie commented 7 months ago

Hey @AnilMaktala I upgraded the backend. I still have the same issue. Is there anything I can do from my side to debug the issue?

biller-aivy commented 7 months ago

So you are doing a scan, so maybe the list is empty, but is there a nextToken in the response?

cabcookie commented 7 months ago

Hey @biller-aivy, yes, there is a nextToken. However, it doesn't bring more values. Here is a code snippet I was using:

const getDayPlans = async (setResult: (result: Temp) => void) => {
  const filter = {
    dayPlanProjectTasksId: { eq: "6eb8c284-849a-4754-a009-52d1b61285e6" },
  };
  const { data, nextToken } = await client.models.DayProjectTask.list({
    filter,
  });
  const test = await client.models.DayProjectTask.list({ nextToken, filter });
  console.log("nextToken", test);
  setResult({ data, nextToken });
};

Both data and test miss a specific record.

biller-aivy commented 7 months ago

@cabcookie you should keep fetching as long as there is a nextToken. The filter is applied AFTER the database scan. So I am sure that when you follow all the nextTokens, you will get all the data.
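A minimal sketch of what following every nextToken could look like. The helper name listAllPages and the simplified Page type are assumptions made for illustration; they are not part of the Amplify client.

```typescript
// Simplified shape of one page returned by a list call (an assumption for this sketch).
type Page<T> = { data: T[]; nextToken?: string | null };

// Hypothetical helper: keeps requesting pages until no nextToken is returned.
async function listAllPages<T>(
  fetchPage: (nextToken?: string | null) => Promise<Page<T>>,
): Promise<T[]> {
  const items: T[] = [];
  let nextToken: string | null | undefined = undefined;
  do {
    const page: Page<T> = await fetchPage(nextToken);
    // A page can be empty and still carry a nextToken, so keep going.
    items.push(...page.data);
    nextToken = page.nextToken;
  } while (nextToken);
  return items;
}

// Possible usage with the client and filter from the snippet above:
// const tasks = await listAllPages((nextToken) =>
//   client.models.DayProjectTask.list({ filter, nextToken }),
// );
```

Note that this reads the whole table in the worst case, so it is only a workaround for small tables.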

cabcookie commented 7 months ago

Okay, I will try this later today. But, does this mean the resolvers are not making use of DynamoDB filters but filter the data after retrieving it? I would expect when I set a limit to 50, I get 50 records after filtering and not the result of a filter of the first 50 records.

biller-aivy commented 7 months ago

> Okay, I will try this later today. But, does this mean the resolvers are not making use of DynamoDB filters but filter the data after retrieving it? I would expect when I set a limit to 50, I get 50 records after filtering and not the result of a filter of the first 50 records.

The filters are set, but DynamoDB applies them only after it has read a page of items, so a page can still come back as an empty array. In the worst case, you run through the entire table if your item is at the very end. You need a secondary index; then the database can look up the items by key instead of scanning. See the @index directive.
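To illustrate why a filtered page can be empty while a nextToken is still returned, here is a small in-memory simulation. It is only a simplified model of Scan semantics (the limit caps the items read, the filter runs afterwards); scanPage, the Task type, and the sample data are made up for this sketch, and a real Scan also stops after reading 1 MB of data.

```typescript
type Task = { id: string; dayPlanId: string };

// Simplified model of a Scan: `limit` caps the number of items READ,
// and the filter is applied to that page afterwards.
function scanPage(
  table: Task[],
  limit: number,
  matches: (t: Task) => boolean,
  startIndex = 0,
): { items: Task[]; nextIndex: number | null } {
  const evaluated = table.slice(startIndex, startIndex + limit);
  const end = startIndex + evaluated.length;
  return {
    items: evaluated.filter(matches),
    // Plays the role of nextToken: null once the table has been fully read.
    nextIndex: end < table.length ? end : null,
  };
}

const table: Task[] = [
  { id: "t1", dayPlanId: "other" },
  { id: "t2", dayPlanId: "other" },
  { id: "t3", dayPlanId: "plan-1" },
  { id: "t4", dayPlanId: "other" },
];
const isPlan1 = (t: Task) => t.dayPlanId === "plan-1";

// First page reads t1 and t2: nothing matches, but more data remains unread.
const first = scanPage(table, 2, isPlan1);
// Second page reads t3 and t4: the matching item finally shows up.
const second = scanPage(table, 2, isPlan1, first.nextIndex ?? 0);
console.log(first, second);
```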

biller-aivy commented 7 months ago

> Okay, I will try this later today. But, does this mean the resolvers are not making use of DynamoDB filters but filter the data after retrieving it? I would expect when I set a limit to 50, I get 50 records after filtering and not the result of a filter of the first 50 records.

Maybe it would be a nice feature request to have a "min items" field, e.g. a minimum item count of 50, so that the VTL resolver keeps running until it has found 50 matching items. @AnilMaktala ?!
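Until something like that exists on the server side, the idea can be approximated on the client. The sketch below is a hypothetical stand-in, not an Amplify feature: listAtLeast and the simplified Page type are names invented here.

```typescript
// Simplified shape of one page returned by a list call (an assumption for this sketch).
type Page<T> = { data: T[]; nextToken?: string | null };

// Hypothetical client-side stand-in for a "min items" option: request pages
// until at least `minItems` matches are collected or no pages are left.
async function listAtLeast<T>(
  minItems: number,
  fetchPage: (nextToken?: string | null) => Promise<Page<T>>,
): Promise<{ data: T[]; nextToken: string | null }> {
  const data: T[] = [];
  let nextToken: string | null = null;
  do {
    const page: Page<T> = await fetchPage(nextToken);
    data.push(...page.data);
    nextToken = page.nextToken ?? null;
  } while (nextToken !== null && data.length < minItems);
  // The returned nextToken lets the caller resume where this call stopped.
  return { data, nextToken };
}
```

The result can contain more than minItems entries, because the last page is appended in full.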

cabcookie commented 7 months ago

Another approach might be to create a secondary index automatically when a .belongsTo connection is created between tables. This way, the lookup is more efficient. As Amplify tries to abstract away some of the implementation challenges (like creating a DynamoDB table or GraphQL resolvers), why not do the same for indexing? It is pretty obvious that if a link between tables is created, you will query for these exact items, and without an index this results in full-table scans.

palpatim commented 6 months ago

With the general availability release of Gen2, Amplify does exactly this. :) When you model relationships in Gen2, you declare an explicit "reference field" on the Related model to match the primary key of the Primary model. Behind the scenes, Amplify provisions an index on those reference fields for faster lookup. For example, in the schema:

const schema = a
  .schema({
    Primary: a.model({
      content: a.string(),
      relateds: a.hasMany("Related", "primaryId"),
    }),
    Related: a.model({
      content: a.string(),
      primaryId: a.id(),
      primary: a.belongsTo("Primary", "primaryId"),
    }),
  })
  .authorization((allow) => [allow.publicApiKey()]);

Amplify creates a Global Secondary Index named "gsi-Primary.relateds" on the Related.primaryId field.

That behavior extends to composite primary keys as well--in the more complex example below, the GSI will be created with a partition key of primaryTenantId, and a composite sort key of primaryInstanceId#primaryCustomerId:

const schema = a
  .schema({
    Customer: a
      .model({
        tenantId: a.id().required(),
        instanceId: a.id().required(),
        customerId: a.id().required(),
        content: a.string(),
        chatSessions: a.hasMany("ChatSession", [
          "primaryTenantId",
          "primaryInstanceId",
          "primaryCustomerId",
        ]),
      })
      .identifier(["tenantId", "instanceId", "customerId"]),
    ChatSession: a
      .model({
        tenantId: a.id().required(),
        instanceId: a.id().required(),
        chatSessionId: a.id().required(),
        content: a.string(),
        primaryTenantId: a.id().required(),
        primaryInstanceId: a.id().required(),
        primaryCustomerId: a.id().required(),
        primary: a.belongsTo("Customer", [
          "primaryTenantId",
          "primaryInstanceId",
          "primaryCustomerId",
        ]),
      })
      .identifier(["tenantId", "instanceId", "chatSessionId"]),
  })
  .authorization((allow) => [allow.publicApiKey()]);
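As a rough illustration of the composite sort key, the sketch below builds the key values for one ChatSession record. It assumes the sort key value joins the field values with "#" in the same order as the attribute name primaryInstanceId#primaryCustomerId; the helper compositeSortKey and the sample values are made up.

```typescript
// Assumption: sort-key parts are joined with "#" in declaration order,
// mirroring the attribute name primaryInstanceId#primaryCustomerId.
function compositeSortKey(parts: string[]): string {
  return parts.join("#");
}

// Sample reference-field values for one ChatSession (invented for this sketch).
const chatSession = {
  primaryTenantId: "tenant-1",
  primaryInstanceId: "instance-7",
  primaryCustomerId: "customer-42",
};

const gsiKey = {
  // Partition key of the GSI.
  partitionKey: chatSession.primaryTenantId,
  // Composite sort key of the GSI.
  sortKey: compositeSortKey([
    chatSession.primaryInstanceId,
    chatSession.primaryCustomerId,
  ]),
};
console.log(gsiKey);
```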

cabcookie commented 6 months ago

Sounds fantastic. I can see those indexes in my DynamoDB tables. Thanks for the clarification.

AnilMaktala commented 6 months ago

Hi @cabcookie, can we close this issue now?

cabcookie commented 6 months ago

Yes

github-actions[bot] commented 6 months ago

This issue is now closed. Comments on closed issues are hard for our team to see. If you need more assistance, please open a new issue that references this one.

aws-amplify / amplify-category-api

Issues with GraphQL list resolvers #2443
