aws-amplify / amplify-js

A declarative JavaScript library for application development using cloud services.
https://docs.amplify.aws/lib/q/platform/js
Apache License 2.0
9.44k stars 2.13k forks

How do I tell DataStore to query AppSync with a customerId? DataStore.query only queries the local database. #4957

Closed benjamin79 closed 4 years ago

benjamin79 commented 4 years ago

How do I tell DataStore to query AppSync with a customerId? DataStore.query only queries the local database, and I don't want the whole database on the client.

Which Category is your question related to? DataStore

Amplify CLI Version: 4.13.4

What AWS Services are you utilizing? AppSync

Provide additional details e.g. code snippets

type User @model @key(fields: ["customerId"]) {
  id: ID!
  firstname: String
  lastname: String!
  email: AWSEmail
  customer: Customer
    @connection(fields: ["customerId"], name: "CustomerUsers")
}

export const syncUser = /* GraphQL */ `
  query SyncUser(
    $filter: ModelPersontempFilterInput
    $limit: Int
    $nextToken: String
    $lastSync: AWSTimestamp
  ) {
    syncPersontemps(
      filter: $filter
      limit: $limit
      nextToken: $nextToken
      lastSync: $lastSync
    ) {
      items {
        id

type Customer @model {
  id: ID!
  Users: [User] @connection(name: "CustomerUsers")
  name: String
}

Anything wrong here? Where would I set the customerId to query by?

manueliglesias commented 4 years ago

Hi @benjamin79

Thanks for trying the DataStore.

As of today, DataStore syncs the whole database (in practice, and by default, only the first 10,000 records per model). It is not possible to adjust the filters or to define a selection set that omits some of the fields available on a model.

We'll be tracking this as a feature request:

Ability to specify base and sync queries filters/selection set.

kwoxford commented 4 years ago

This is crucial for anything other than a trivial application. It doesn't look like anyone's assigned to this yet.

undefobj commented 4 years ago

@kwoxford Can you give specifics on your application/use case and perhaps the schema which you are trying to model? As @manueliglesias mentioned, we are tracking this as a feature request, but having specifics will help with design and prioritization.

kwoxford commented 4 years ago

Hi. I make apps for professional and academic events. Here are a couple of areas where it affects me:

  1. Users select which event they want to load into the app. There are multiple tables: speaker bios, submitted papers, programme sessions and so on. Loading all data for all events into the app when it first launches would be cumbersome, and since there’s a limit on data items it would probably fail.

  2. Users can make notes in the app. For data protection reasons alone I can’t have a situation where each new user gets a copy of every user’s notes, even if they can’t access them. It would still be a massive breach of GDPR.

Basically it all comes down to providing each customer / user with appropriate data rather than all data.

I think it’s a question of making sure that the base query and sync can take predicates.

At the moment hydration is automatic and doesn’t use predicates. Where do the predicates come from and how are they applied? I’d say it needs an initial query that finds the id that the base query will need. Eg customer login leads to customer lookup which gets the customer id that’s used for hydration and sync.

Hope this helps.

Kim

On 15 Mar 2020, at 19:50, Richard Threlkeld notifications@github.com wrote:

 @kwoxford Can you give specifics on your application/use case and perhaps the schema which you are trying to model? As @manueliglesias mentioned we are tracking this as a feature request but having specifics will help design and priority.

kwoxford commented 4 years ago

To put it at its most basic, most useful applications need some sort of login, customisation or initialisation, and having the whole backend database load itself before that has taken place makes life very hard, especially given the limited number of rows.

Ideally we need manual control of the base query, so we can define predicates and then start hydration and sync once the user has identified themselves or other initialisation has happened.

Cheers

Kim

undefobj commented 4 years ago

For the GDPR issue, we scope the local storage and allow you to clear data from other users with DataStore.clear(). When we give more controls, this would still be recommended for compliance and privacy.
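
A minimal sketch of clearing local data at sign-out, as recommended above. The helper name `signOutAndClear` is hypothetical, and the function takes the `Auth` and `DataStore` objects as parameters; in an app these would presumably be the exports of the same names from `aws-amplify`.

```javascript
// Hypothetical helper: wipe the local DataStore before ending the session,
// so the next user on this device does not inherit cached records.
async function signOutAndClear({ Auth, DataStore }) {
  await DataStore.clear(); // clears the data stored locally on this device
  await Auth.signOut();
}
```

Clearing before signing out keeps the order explicit: the local records are gone before the session ends.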

We will be adding this functionality and are tracking a similar request here: https://github.com/aws-amplify/amplify-js/issues/4552 Specifying predicates or named queries in DataStore.configure() is on this list of actions.

I would note that the controls we have today (including delta sync) make most of this an optimization as it's a seamless background process and shouldn't be blocking even for very large data sets. As you play around with DataStore and have more concrete schemas it would be great to see these to help the design and delivery of these features.

cedvdb commented 4 years ago

You want a use case for not syncing the whole database? This seems like an elementary feature, so maybe I don't understand the question.

Here is our use case. People are part of teams. Team members can only do CRUD on the products of their team; team contributors can only read the products of their team.

sigitp-git commented 4 years ago

I had a similar use case. In a React app, I store Expenses in Amplify DataStore. I have to query with an "owner" predicate to show only the expenses created by the logged-in owner.

I use a combination of the async Cognito function Auth.currentUserInfo() and the React useEffect() hook.

Here's the code:

import { useEffect, useState } from 'react'
import { Auth, DataStore } from 'aws-amplify'
import { Expense } from './models' // assumed path to the generated models

  // Inside the React component:
  const [expenses, setExpenses] = useState([])
  const [username, setUsername] = useState('')

  const fetchUserInfo = async () => {
    const userInfo = await Auth.currentUserInfo()
    // currentUserInfo() may return nothing when no user is signed in
    if (userInfo) setUsername(userInfo.username)
  }

  const fetchExpenses = async (usr) => {
    const expensesDS = await DataStore.query(Expense, (e) =>
      e.owner('eq', usr)
    )
    setExpenses(expensesDS)
  }

  // Fetch username from Cognito Auth class once, when the component mounts
  useEffect(() => {
    fetchUserInfo()
  }, [])

  // Fetch expenses from DataStore only for the logged-in username, put into expenses state, all local transaction, offline support.
  // Depends on username (not expenses), so setting the expenses state does not re-trigger the effect in a loop.
  useEffect(() => {
    if (!username) return
    fetchExpenses(username)
    const subscription = DataStore.observe(Expense).subscribe(() =>
      fetchExpenses(username)
    )
    // Clean up effect when unmount
    return () => {
      subscription.unsubscribe()
    }
  }, [username])

My GraphQL schema

Note that the "owner" field is defined as a String to store the Cognito User Pool username.

type Expense
  @model
  @auth(rules: [{ allow: owner, ownerField: "owner", operations: [create, read, update, delete] }]) {
  id: ID!
  createdAt: Float!
  description: String!
  amount: Float!
  note: String
  owner: String
}

Let me know if this makes sense, or whether this solution needs more improvement.

naripok commented 4 years ago

I also had some problems in my app that may be correlated to this issue:

The problem is that on the first visit there is no local data yet, because DataStore has not hydrated IndexedDB. The query still 'completes' and returns an empty array, which makes my whole app build fail. My fix is to repeat the query, with a wait, until the data is available locally and the query completes correctly. Not ideal at all. I don't think I'm doing anything wrong, given the simplicity of my process. Maybe somebody can kindly correct me.
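
The retry workaround described above could look roughly like this. It is a sketch, not an Amplify API: the helper name `queryUntilNonEmpty` and its options are hypothetical, and `queryFn` stands for any function that returns a promise of an array, for example `() => DataStore.query(MyModel)`.

```javascript
// Hypothetical helper: re-run a query until it returns at least one record
// or the attempts run out.
async function queryUntilNonEmpty(queryFn, { attempts = 5, delayMs = 500 } = {}) {
  let items = [];
  for (let i = 0; i < attempts; i++) {
    items = await queryFn();
    if (items.length > 0) return items;
    // Wait before retrying, but not after the final attempt.
    if (i < attempts - 1) {
      await new Promise((resolve) => setTimeout(resolve, delayMs));
    }
  }
  return items; // still empty: the model may genuinely have no records
}
```

Note that an empty result is ambiguous: the helper cannot tell "not hydrated yet" from "no records exist", which is why it gives up after a fixed number of attempts.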

sconway commented 4 years ago

@naripok I had this same issue (as well as the original issue @benjamin79 pointed out). My 'solution' for the initial datastore query returning an empty array was to make the query as early as possible in the application's lifecycle, which allowed the local datastore to be hydrated before any of the data was needed by the application. This seems like a pretty poor solution though. I don't understand why there isn't an event, callback, etc. that will allow us to know when there is data in the store.

With regard to the original issue, is there no solution for this yet? Is there a way to add a custom Lambda to limit the returned data? Returning extra data that a user doesn't need seems like a big data usage issue.

hocnx commented 4 years ago

I face the same problem. I am building a chat app. Users should only hold the data of the channels they have joined on the client. However, DataStore syncs all data to all users, even if the channel is private. Are there any solutions for this case? Can I configure AppSync to customize the sync logic?

undefobj commented 4 years ago

> I face the same problem. I am building a chat app. Users should only hold the data of the channels they have joined on the client. However, DataStore syncs all data to all users, even if the channel is private. Are there any solutions for this case? Can I configure AppSync to customize the sync logic?

We're currently working on rolling out the solution for this mentioned in the design above. Stay tuned.

titiloxx commented 4 years ago

Hello, I am facing the same problem as the people above and am still waiting for a solution.

sunilvb commented 4 years ago

Although @sigitp-git's solution may be a feasible workaround, what if the expense item is the 10,001st record? A consumer mobile app with a few hundred active users can easily go past this limit and end up carrying every other user's data but not their own, thereby rendering the app useless?!

sconway commented 4 years ago

One potential solution would be to pass up a custom ID, Key, User name, etc. as a header when you configure Amplify.

Something like:

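        import { Amplify, DataStore } from 'aws-amplify'

        // IdOrUserNameToSortBy is a placeholder for the value to filter on
        Amplify.configure({
            API: {
                graphql_headers: async () => ({
                    'special-header-id': IdOrUserNameToSortBy,
                    // ...other headers go here
                })
            }
        })

        await DataStore.query(YourModel)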

Then, in your response VTL template, you can filter the results by that header.

This is definitely not an ideal solution, and it seems odd that syncing everything is the default behavior for DataStore, but this is the only potential solution that I know of.
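
The filtering step in the response template can be sketched in plain JavaScript. This only illustrates the logic; it is not VTL and not an AppSync API. The function name, the `owner` field and the shape of `headers` are assumptions made for the example.

```javascript
// Illustration only: mimics what a response mapping template could do.
// `headers` stands in for the request headers, `items` for the query result.
function filterByHeader(items, headers, headerName = 'special-header-id', field = 'owner') {
  const wanted = headers[headerName];
  // No header, no data: fail closed rather than returning everything.
  if (wanted === undefined) return [];
  return items.filter((item) => item[field] === wanted);
}
```

Returning an empty list when the header is missing is a deliberate choice here, so that a client that omits the header does not receive every record.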

iartemiev commented 4 years ago

We just released the Selective Sync feature for DataStore in aws-amplify@3.3.5.

Please see this section of the docs on how to utilize it.
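
A minimal sketch of a sync expression for the original question, assuming the model exposes a `customerId` field. The wrapper `configureCustomerSync` is hypothetical and takes the library objects as a parameter; in an app, `DataStore` and `syncExpression` would be the exports from `aws-amplify` (3.3.5 or later), and `User` the generated model.

```javascript
// Sketch only: `User` and `customerId` come from the caller and are assumptions.
function configureCustomerSync({ DataStore, syncExpression }, User, customerId) {
  DataStore.configure({
    syncExpressions: [
      // Only records whose customerId matches are requested during sync.
      syncExpression(User, () => (user) => user.customerId('eq', customerId)),
    ],
  });
}
```

As with any DataStore.configure call, this would need to run before the app first interacts with DataStore.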

Closing, as this addresses the feature request.

moriax commented 3 years ago

@iartemiev Hi, thanks for the selective sync feature. But is this only handled in the frontend? What if a malicious user removes the filter: will they retrieve all entries from the table? I have complex n-m relationships where it is complicated to work with the owner property in the model. I want to add authorization to the Sync resolvers. This should be possible? And is it possible to pass a parameter to the sync query? Use case: I want to retrieve only the chat messages that belong to the chat rooms I'm assigned to.

export const syncMessages = /* GraphQL */ `
  query SyncMessages(
    $chatId: ID
    $filter: ModelPersontempFilterInput
    $limit: Int

Br, moriax

iartemiev commented 3 years ago

@moriax you're correct in that selective sync isn't meant to be a substitute for an authorization mechanism. It provides sync query filter and optimization capabilities. Owner-based authorization is the recommended approach for limiting access to your application's data.

> I want to add authorization to the Sync resolvers. This should be possible?

You can overwrite the resolvers and implement custom authorization logic that suits your needs.

> And is it possible to pass a parameter to the sync query?

Selective sync allows you to pass a value to the filter parameter in the sync query. It's not possible to pass arbitrary query parameters when using DataStore, but you can do that with the API category.
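
A rough sketch of passing a custom parameter through the API category instead. The query `messagesByChat` and its `chatId` argument are hypothetical and would have to exist in the schema; `API` is passed in by the caller and is assumed to be the API category object from `aws-amplify`.

```javascript
// Hypothetical query: `messagesByChat` and `chatId` are assumptions,
// not something defined in this thread.
const messagesByChat = /* GraphQL */ `
  query MessagesByChat($chatId: ID!) {
    messagesByChat(chatId: $chatId) {
      items {
        id
        content
      }
    }
  }
`;

// Runs the query with the given chatId and returns the list of items.
async function fetchMessages(API, chatId) {
  const result = await API.graphql({
    query: messagesByChat,
    variables: { chatId },
  });
  return result.data.messagesByChat.items;
}
```

Unlike DataStore, a request made this way goes straight to the network, so it does not populate the local store or work offline.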

kheriox-technologies commented 3 years ago

We have a use case where we store the ref data required by our frontend React app in DynamoDB and serve it through an Amplify AppSync model. It was working well when we started, but our DDB item count has crossed 10k and we are unable to sync all those items back to the local DataStore. We can't use selective sync, as all this ref data is used by the frontend. Is there any way we can sync the whole DDB table to the local DataStore? We also need this because offline access to this ref data is crucial. Any ideas on how we can sync more than 10k items from DDB to the local DataStore using Amplify and AppSync?

iartemiev commented 3 years ago

@kheriox-technologies DataStore defaults to storing 10k records locally, but you can adjust that number to suit your needs by setting the maxRecordsToSync property. Make sure to call DataStore.configure before you interact with DataStore in your app code.

DataStore.configure({
  maxRecordsToSync: 50000
});

kheriox-technologies commented 3 years ago

Thanks @iartemiev I will try that.

djsjr commented 3 years ago

@iartemiev is this available for Flutter? I do not see documentation on sync expressions for Flutter.

iartemiev commented 3 years ago

@djsjr, the Amplify Flutter team is currently working on adding support for this feature and will be releasing it in the near future.

github-actions[bot] commented 2 years ago

This issue has been automatically locked since there hasn't been any recent activity after it was closed. Please open a new issue for related bugs.
