aws-amplify / amplify-js

A declarative JavaScript library for application development using cloud services.
https://docs.amplify.aws/lib/q/platform/js
Apache License 2.0
9.42k stars 2.12k forks source link

Simple chat app cannot be efficiently implemented using Datastore (ex. whatsapp-like app) #7534

Open mir1198yusuf opened 3 years ago

mir1198yusuf commented 3 years ago

Is your feature request related to a problem? Please describe. Hello, considering below syncExpression

syncExpression(Events, (c) =>
  c.meetingId('eq', 'some-value').createdAt('gt', '2020-11-01')
)

Suppose i have a list displaying Events , by above expression only events created after 2020-11-01 will be stored in client-side database.

Now if user keeps scrolling up and suddenly wants to see an event records before 2020-11-01 or the very first oldest event, that data would not be available in client-side.

So how to inform Datastore to sync these older records with backend and display it, because Datastore.query returns only data from client-side .

Also maxRecordsToSync would not help as we dont know in advance the number of older records.

Describe the solution you'd like Something similar to below: when user scrolls to the end of list, call Datastore.queryMore(Events, newLimit) where newLimit is the number of records to sync more from backend. Since Datastore.observe listens to changes in backend database, a similar method Datastore.observeLocal to listen to changes in local database (caused by syncing more newLimit number of records).

newLimit will increase the maxRecordsToSync as maxRecordsToSync + newLimit for either all tables or specified table Events.

Describe alternatives you've considered I thought of a solution like when user scroll the end of list then execute Datastore.stop, change the value of maxRecordsToSync in Datastore.config which is usually in some starting file closer toindex.js ( using some hooks in case of React) and then again Datastore.start and finally Datastore.query. But this seems too complicated. I have not tried this though

iartemiev commented 3 years ago

@mir1198yusuf you would need to pass a variable instead of a hardcoded value to the predicate in the syncExpression and have the value of the variable change at runtime. See the Reevaluate expressions at runtime section of the docs.

In your case, it would look something like this:

let fromDate = '2020-11-01';

DataStore.configure({
  syncExpressions: [
    syncExpression(Events, (c) =>
      c.meetingId('eq', 'some-value').createdAt('gt', fromDate)
    )
  ]
});

async function changeMeetingSync() {
  fromDate = '2019-11-01';
  await DataStore.stop();
  await DataStore.start();
}
mir1198yusuf commented 3 years ago

okay this is what I mentioned in alternative. I thought this would not be the ideal way. But I think it's good now.

anyways I would call changeMeetingSync from a far away file. So suppose user is scrolling the list, he came to end of list for earlier from date and now I called the change function so it would start syncing with new value. How would I know when the sync is completed so to display it?

iartemiev commented 3 years ago

So suppose user is scrolling the list, he came to end of list for earlier from date and now I called the change function so it would start syncing with new value. How would I know when the sync is completed so to display it?

If you're expecting that to be a common use case, I would not recommend utilizing selective sync for this. Instead, I would leave the default behavior where all of the Meetings get persisted to the local store and then use DataStore.query with pagination to update your UI state.

Since DataStore follows an offline-first model, it should contain all of the data that your users may plausibly need in order to comprehensively interact with your application.

If that ^ approach doesn't work for you, could you please elaborate on your app requirements and use cases in detail?

mir1198yusuf commented 3 years ago

Even in default behaviour, only maxRecordsToSync number of records are stored , right?

iartemiev commented 3 years ago

By default, yes - 10k records per model. But you can increase that to whatever makes sense for your app.

mir1198yusuf commented 3 years ago

can we set it to auto because we don't know the max limit at any point in time... also it might differ for users

mir1198yusuf commented 3 years ago

also tomorrow I will post a use case for you to better understand the problem.

iartemiev commented 3 years ago

can we set it to auto because we don't know the max limit at any point in time... also it might differ for users

There is not an auto option. You would need to set an upper bound.

also tomorrow I will post a use case for you to better understand the problem.

Sounds good!

mir1198yusuf commented 3 years ago

Consider the use case and simplified table structure as below

user table userid -> primary key (cognito sub of user signed with userpool/google) username

group table groupid -> primary key (can use default id field of @model) groupname

groupuser table id -> primary key (can use default id field of @model) userid -> gsi partition key groupid -> gsi sort key

post table postid -> primary key (can use default id field of @model) groupid -> gsi partition key createdAt -> gsi sort key createdBy

Flow -

  1. so when user logs in, i will obtain cognito sub and fetch the user record from user table

  2. query groupuser table with userid on gsi, and display grouplist of user

  3. on tapping any group, query post table with groupid on gsi, and display post list for that group

  4. to display createdBy details for a post, query user table with createdBy (which is userId)

Doubts :

  1. should i store entire user table for operations 1 and 4. I think this won't be efficient and at the same time won't work if user record for a particular query is not present locally due to maxRecordsToSync limitation. So alternative would be to store only those user records that user might have to query - users of the groups which logged in user is part of So what should be syncExpression for this?

  2. syncExpression for groupuser table can be : userId equal to loggedin userId query - usually (consider React), we do Amplify.configure and Datastore.configure in index.js. But we check Auth.currentAuthenticatedUser in App.js or navigator.js so what should syncExpression in Datastore.configure use userId as it is obtained later example - syncExpression(GroupUser, (g) => g.meetingId('eq', ?????) )

  3. we have to only sync those records from group table that user is part of - that is, basically those groupIds obtained in above point so how syncExpression would be based on this. I think this point is related to https://github.com/aws-amplify/amplify-js/issues/7544#issuecomment-758679984

Similar would be syncing posts for only some groups

SyncExpression Feedback - I feel the syncExpression is basically having a very big drawback, syncExpression can only be used when you know the condition values beforehand. dynamic conditioning is only supported at very small level.

Now you know my use case as mentioned above, so kindly go through it and mention should i use syncExpressions (because i could not find any easy way for it) And if I dont use any syncExpressions, all users data would be stored locally , isn't it a security-wise bad practice? SyncExpressions are somewhat useless in one-to-many dependency between tables

Auto maxRecordsTosync - should i create a new feature-request issue for this as this is very much needed feature.

Also should I change the title of issue, I think the issue has diverted from it.

iartemiev commented 3 years ago

userid -> primary key (cognito sub of user signed with userpool/google)

Currently, you cannot supply your own primary keys or manually set the value of the id field when using DataStore. It will generate a UUIDv4 and store it in the id field for you and use that field as the PK. We've gotten multiple customer requests to add the ability to use custom PKs and we intend to offer that in the future. As a workaround, you can specify a GSI with @key for userId.

It sounds to me like you're wanting to implement authorization rules. Selective sync isn't an optimal solution for that. I would look into using https://docs.amplify.aws/lib/datastore/setup-auth-rules/q/platform/js instead.

Just so I understand your access patterns better, are users going to be creating their own accounts and choosing groups? Or will there be an admin assigning users to groups?

mir1198yusuf commented 3 years ago

Thanks for your response.

As a workaround, you can specify a GSI with @key for userId

Yeah, this works for me.

Just so I understand your access patterns better, are users going to be creating their own accounts and choosing groups? Or will there be an admin assigning users to groups?

To explain you better, just consider the use case to be Whatsapp. User will create their own accounts either email/phone by userpool or by google account federated with cognito. so even if user signs in with google, we can have their cognito sub (unique id for each user)

User can create groups, join a group by invitation link, create posts in group. User who created the group (admin) can delete posts of other users as well. Except admin only creator of post can delete it.

Admin can also delete groups (remove Group table record) or remove users from group (remove GroupUser table record) Admin can designate more users as admin too.

It sounds to me like you're wanting to implement authorization rules.

I looked up the authorization rules too. https://docs.amplify.aws/lib/datastore/setup-auth-rules/q/platform/js

See how I decided the authorization. Correct me please if any

First I thought to go with owner based authorization. @auth(rules: [{ allow: owner }])

But this would allow only post owner to perform CRUD on Post table record whereas admin(s) (which are normal user designated as admin in application, not in some Cognito mechanism) should also be able to delete post even if they are not the owner.

But in general any authenticated Cognito users (but designated as admin within application) must be able to perform CRUD on any record within his/her group.

So the authorisation can perform auth at Cognito level , for example, Admin groups in cognito. but I cannot use that group in application. These auth needs to be managed at app database level. Moreover these Cognito admins would not be specific to application group ( Group table records )

I can create Admin groiup/users as moderators in Cognito user pool for some Superusers or Moderators who are not part of inner application groups. They can simply delete any records. But I dont need these users for now. Below would work for them @auth(rules: [{ allow: groups, groups: ["Admin"] }])

So like I said before my app admins are normal Cognito authenticated users designated as admin by mere database record. GroupUser table would contain a new field called isAdmin or Group table would contain field - Admin -array of userId which are designated as admins

So I thought to go with private authorisation https://docs.amplify.aws/lib/datastore/setup-auth-rules/q/platform/js#private-authorization @auth(rules: [{ allow: private }])

all Cognito user can perform CRUD. segregating and assigning admin functions (delete post) would be done at frontend by comparing userId of logged in user with Group or GroupUser table.

Since you are now aware of the flow, please help in selective syncing or alternative.

Thanks .

mir1198yusuf commented 3 years ago

Authorisation rules does not help in above cases when auth is depended on data within application. It helps when auth is depended on logged in user (that is, somewho related to cognito)

When auth is depended on data within application, then we can use a broader class of Authorisation and implemented finer conditioning within application. Correct me if wrong

iartemiev commented 3 years ago

You would likely want to use dynamic group auth, which is supported by AppSync, but not by DataStore at the moment

Selective sync is a filtering mechanism and does not affect authorization in and of itself, i.e., it won’t do anything to actually protect access to your data on the back end. If you don’t implement authorization rules, there will be nothing stopping someone from modifying the GraphQL requests and accessing all of the data at will.

mir1198yusuf commented 3 years ago

The groups mentioned in the link above - editors, BizDevs which groups are they ?

Are they cognito groups ? I think yes

In my application, the group are stored within application and not cognito

If I try to create groups in Cognito it can be two of them - users (all app users ) and admins (admin of application as whole not of each app groups)

Can you explain how can I use dynamic groups in my use case.

I think this use case is a pretty common case and similar to many apps.

If Datastore cannot handle such cases, I think it cannot be used anywhere except for some simple notes / todo app.

Please understand I am not rude or impolite..I am trying just to explain my points

mir1198yusuf commented 3 years ago

Or please mention some auth rules to use according to schema mentioned in some previous comments

iartemiev commented 3 years ago

In my application, the group are stored within application and not cognito

If you need more fine-grained control than what is offered by groups, enforcing such rules is achieved via back-end business logic. It is not something that should be done clientside. Otherwise, any user could access any other user's data.

Back end logic can be added via custom AppSync resolvers. If you want to use DataStore, that's what I would recommend you look into.

AppSync also has Lambda resolvers, but it will be more difficult to get those to work with DataStore.

mir1198yusuf commented 3 years ago

Since we have discussed various points in general and I have queries in many of them.

Let's understand one by one First authorisation -

Backend business logic - I was doing this with rest API model. Top level similar to previous comment, we can add Cognito authentication so only cognito user can access the api. Next on backend side, I implemented business logic.

Example - only app group admin (not related to cognito admin but a normal cognito user )can delete post in his group. First layer - Cognito authorisation Second layer - in Rest api model, API gateway invokes a lambda function where I first fetch the admin of the group and compare the userids. If logged in user is really admin then delete the post else reject the request.

Now for Datastore. - First layer - Cognito private @auth Second layer - Business logic in resolvers

Now time for doubts - This check is at backend. It is written in resolvers right?

  1. can we write such logic like fetching admins from another table , comparing user ids and then deleting the post in another table is possible in vtl resolvers. Can I use lambda function so it is easier to write this logic there. (like Rest API where api gateway calls a lambda for a particular request path)
  2. Suppose point 1 is somehow solved, then if at front-end I do Datastore.delete, will it remove the entry in local database . So the check I wrote in resolvers will it apply to local database CRUD operations. If no, when the resolvers will be triggered - during sync or during performing operations on Appsync backend. If only on Appsync backend, then what's the use of writing logic in resolvers, for the user it will show the post deleted later when he comes back he will find not deleted. Am I missing something? How resolvers enforce their logic on local database operations?
mir1198yusuf commented 3 years ago

Answer to point 2 obtained from discord - I dont think custom resolvers on the appsync side is validated by DataStore locally, @ Ivan please correct me if Im wrong I assume the record be deleted locally first then behind the scene DataStore will try to soft delete the record thru AppSync, if it fails that record should be restored

is this true ?

if yes then private cognito only @auth + condition at front-end + custom logic in resolvers would solve authorisation and no one can modify other records which they shouldn't be.

mir1198yusuf commented 3 years ago

@iartemiev let's forget everything from discord or above comments. it's getting complicated.

Just provide a documentation or example app design (only Datastore related thing) for very basic whatsapp-like chat app with just 3 entities - User, Group and Post

(You don't have to design it completely by yourself. We can discuss each point and design iteratively)

Criteria -

  1. User will join by cognito userpool
  2. User can create group (so they become admin of that group) or join other group
  3. All Users of a group can post messages in group
  4. Group admin can delete any one's post. If any one else than admin deletes it even from local,it should not get deleted at backend.
  5. there can be more than one admins. Existing admin can assign other members of group as admins.
  6. Admin of one group might be normal user (non admin ) of other group.so he should be able to perform admin function (deleting anyone's Post ) only in groups where he is admin.
  7. Locally, user should have only data of groups he is in and only posts of these groups. Storing other users group or post is security-wise bad practice. User should not be able to query other user data by modifying request from front-end
  8. Users can see profile of other users who share atleast one mutual group. Suppose User A and User B dont have any mutual group, they cannot see each other's profile.
  9. User record have a unique field (other than default id field) called as userId. This is basically the sub obtained from Cognito.
  10. Users might belong to some groups and might not belong to others
  11. Admin of group can delete/remove any user from group. Once user is removed from group, he should be able to send or receive new message to or from that group. So syncing should be near real-time.

Access patterns which will be frequently used :-

  1. get single user record
  2. get single group record
  3. get groups belonging to a particular user
  4. get posts belonging to a particular group
  5. get single post record
  6. get users belonging to a particular group

Design Schemas : - I have created two design schemas for the use case - Relational design and Single table design

Relational schema - User table : id - default PK userId - sub of Cognito - PK of GSI name

Group table : id - default PK - can use this as groupid groupname

GroupUsers table : id - default PK groupId - PK of GSI-A - SK of GSI-B userId - SK of GSI-A - PK of GSI-B

Post table : id - default PK - can use as postId groupId - id of group that the post is created in - PK of GSI createdAt - SK of GSI text - content of post

Single table design - id - default PK - let's forget this we are keeping but not using it Also,we are using two generic keys - key1 and key2 - two GSI are created on them GSI-A and GSI-B

key1 - PK of GSI-A - SK of GSI-B key2 - PK of GSI-B - SK of GSI-A

For storing user record ; key1 - userId - cognito sub key2 - name (here sort key is not needed but we are storing just to fill value and avoid extra field for name)

For storing group record : key1 - groupId key2 - group name (here sort key is not needed but we are storing just to fill value and avoid extra field for name)

For storing group-users mapping - Remember Groups and Users have m-m relationship key1 - userId key2 - groupId

For storing group-post mapping - Remember groups and post have 1-m relationship key1 - groupId key2 - postId

We will prefix type name before value like User#1233, Group#4344

consider below data according to above mapping - 2 users, 1 group and 2 posts key1 | key2 user#1 | john group#1 | developers user#1 | group#1 group#1 | post#1 group#1 | post#2 user#2 | smith user#2 | group#1

Let's see how access pattern can work on these 2 GSIs -

  1. get single user record - GSI-A : PK as user#1
  2. get single group record - GSI-A : PK as group#1
  3. get groups belonging to a particular user - GSI-A : PK as user#1 and SK starting with group#
  4. get posts belonging to a particular group - GSI-A : PK as group#1 and SK starting with post#
  5. get single post record - GSI-B : PK as post#1
  6. get users belonging to a particular group - GSI-B : PK as group#1 and SK starting with user#

@iartemiev , cc - @nubpro , @dabit3

Now please specify to satisfy the mentioned criteria - what syncExpression should be (if to use ) or what @ auth directive should be use to protect user data and sync only required data. also where should custom resolvers defined and when it will be enforced ( at syncing time ?)

Use any schema and see if we can implement this using datastore.

If we are able to solve this using Datastore, then it should be added to documentation instead of that very basic notes app.

There is one chat app example on internet but it is a single room or we can say single group chat where all users belong to same group . But in case of whatsapp users might belong to some groups and might not belong to others.

If this is solvable by Datastore, most real world app would be solved. If it fails then I think I have to opt out of Datastore and it means Datastore is still unable to handle such common use cases.

We can deisgn iteratively by discussing on discord .

iartemiev commented 3 years ago

@mir1198yusuf I spent some time with this over the weekend and my conclusion is that unfortunately at the moment, we can't utilize either of these approaches with DataStore.

The relational design won't work because even though we can configure AppSync resolvers to fetch all the correct data for the base and delta sync queries by leveraging pipeline resolvers, we won't be able to get the desired result out of the subscriptions that DataStore uses after the initial sync to keep the local store up to date. The reason for this is that request/response resolvers are only applied to AppSync subscriptions at connect time, there's no way to apply them to each message, which precludes us from being able to use dynamic authorization rules as are required here (i.e., a changing list of authorized users). In other words, subscriptions would be sent either to all users (which is not acceptable due to privacy concerns) or only to the user creating the messages, which defeats the purpose of a messaging app.

As far as the single table design, without field-level authorization in DataStore this won't be viable either, as there will be no mechanism to prevent a user from modifying certain fields (e.g., key1 and key2 in your example schema) to escalate their privileges and give themselves admin roles in other groups, or access messages they shouldn't be authorized to read.

The team will work on identifying which features we'll need to add to DataStore in the near future in order to enable this use case.

mir1198yusuf commented 3 years ago

@iartemiev -

As far as the single table design, without field-level authorization in DataStore this won't be viable either, as there will be no mechanism to prevent a user from modifying certain fields

I understood this part. Just wanted to confirm that field level auth is present in appsync but not in Datastore. ?

there's no way to apply them to each message, which precludes us from being able to use dynamic authorization rules as are required here (i.e., a changing list of authorized users)

I have a confusion here. By dynamic auth , do you mean dynamic group authorisation (https://docs.amplify.aws/cli/graphql-transformer/auth#dynamic-group-authorization). But this is backed by Cognito groups. We don't want to use cognito groups.

Instead we can use owner authorisation with a field having multiple values (arrays).

Please have a look at this @dabit3 post (https://github.com/dabit3/graphql-recipes#whatsapp-clone) . Here Message represent Post in our case and Conversation represents Group.

Now the main focus point is that the @ auth of Conversation table . The owner auth is used and ownerField is set to an array type field members. So in this way we can avoid Cognito-backed dynamic auth and use owner-auth.

Second point of focus - if we are able to use this (ie. set proper owner based auth ) then according to this issue (https://github.com/aws-amplify/amplify-android/issues/1008) ,even if we sync entire table at front-end by using no syncExpression, the records will be protected by auth. Only those records belonging or authorised to logged in user will be synced which serve our purpose. While mutation also it should check and prevent unauthorised owner to make mutations (due to @ auth)

This is the just the high-level discussion about the thing I found by linking several issues. Not sure if this gonna fulfill all our criterias and access patterns. I am yet to test this on my machine.

Btw does this make sense in our case or I am misunderstanding your last comment?

iartemiev commented 3 years ago

I understood this part. Just wanted to confirm that field level auth is present in appsync but not in Datastore. ?

It doesn't work with DataStore at all at the moment, but when using AppSync there is a limitation around subscriptions. Namely, values for any field protected with field-level auth is going to return null in the subscription message.

have a confusion here. By dynamic auth , do you mean dynamic group authorisation

No, I just mean authorization rules that aren't known in advance. This includes the array of owners rule, which doesn't work with DataStore either also due to AppSync subscription limitations.

mir1198yusuf commented 3 years ago

No, I just mean authorization rules that aren't known in advance. This includes the array of owners rule

I think the Authorization rules are simple. Like if you see in @dabit3 schema link above, we can have a new field like members and use use owner auth on that members field. So the users present in that array would sync those groups. Am I missing something?

mir1198yusuf commented 3 years ago

And another field as admins, and allow delete for admins field .

iartemiev commented 3 years ago

To illustrate what I mean re: subscription limitations with an array of owners:

When using AppSync subscriptions, the selection set and any subscription arguments have to match the selection set and value for that subscription argument in a mutation in order to for the subscription to receive a message.

E.g., if you establish the subscription with

onUpdateMessage("bob") {
    id
    content
    _version
    _deleted
    _lastChangedAt
    createdAt
    updatedAt
}

Only a mutation that has the same selection set and owner === 'bob' will result in the subscription receiving a message.

When using an array of owners, the value of the owner field is always changing, as users get added/removed.

So at connect time we may have

onUpdateMessage(["bob", "allice"]) {
    id
    content
    _version
    _deleted
    _lastChangedAt
    createdAt
    updatedAt
}

However, once we add or remove a user to this array, the subscription will no longer receive messages. Since DataStore depends on subscriptions in order to keep the local store up to date, this prevents this auth rule from being usable with DataStore at the moment.

mir1198yusuf commented 3 years ago

Oh , got it. So the comparison is exact comparison. Unless it is Bob again, we won't receive subscription. Correct?

iartemiev commented 3 years ago

That is correct.

mir1198yusuf commented 3 years ago

Any reason or scenario , why it was implemented this way? Someone would use ownerField as string for exact comparison and array type only for if-found comparison.

This is limitation of appsync or Datastore?

update- this is appsync limitation or desired behaviour of @auth directive as mentioned below

mir1198yusuf commented 3 years ago

https://github.com/aws-amplify/amplify-cli/issues/1043

here it is documented about exact comparison. see proposal 3

iartemiev commented 3 years ago

Yes, this is an AppSync constraint. My guess is that it's done out of a performance consideration. Given that AppSync can be used at extreme scale, iterating through an array for each subscription message is notably more resource-intensive than doing an equality check i.e., O(n) vs O(1).

mir1198yusuf commented 3 years ago

Yes this might be the reason. Btw , is this what is called as multi-tenant app? even @nubpro was asking about it.

iartemiev commented 3 years ago

Yes, this would be an example of a multi-tenant app

mir1198yusuf commented 3 years ago

@iartemiev - Any rough estimate when datastore would be able to fully support this use case ? So I can plan my project accordingly. Also I see multi-tenancy questions in many past issues.

iyz91 commented 3 years ago

Yes, this is an AppSync constraint. My guess is that it's done out of a performance consideration. Given that AppSync can be used at extreme scale, iterating through an array for each subscription message is notably more resource-intensive than doing an equality check i.e., O(n) vs O(1).

@iartemiev Isn't an AppSync subscription generally designed to only check auth at connection time, not on each message (and thus a single O(n) call relatively infrequently? Is this not supported by DataStore? Additional ref: https://docs.aws.amazon.com/appsync/latest/devguide/security-authorization-use-cases.html#security-real-time-data

mir1198yusuf commented 3 years ago

@iartemiev can answer that , I think , but also auth is checked at sync (base or delta), so it will o(n) for each message or mutation

iyz91 commented 3 years ago

Makes sense, but if doesn't facilitate that, e.g. user membership in some shared entity such as members of a project, then DataStore isn't all that useful. The docs are also misleading, seems more promising than it actually is (as evidenced by a lot of other issues).

Simply put, if DataStore could facilitate these multi-tenancy designs, such as this thread example of a WhatsApp type app, or a project management app with privileged users across varying projects then it would be a great framework.

@iartemiev Can you confirm clearly on the following:

mir1198yusuf commented 3 years ago

I agree with you . Even I felt that Datastore seems more promising than it actually is. The response from Ivan in discord was - they don't have a public roadmap or estimate for this and Datastore cannot handle such cases for now.

If the owner array rule with if-found-in-array is implemented atleast, to some extent Datastore would be useful (though there are some other issues as well). Without this feature it is totally not suitable for real-world apps.

I myself have to shift to Appsync for now (without offline features) and wait for Datastore to be ready for real-world.

iyz91 commented 3 years ago

@mir1198yusuf Thank you very much for that info and for following up on it yourself. They would save a lot of pain if they just stated this simply in their docs. I'll let you know if I come across some intermediate solution or alternative for our similar use cases. Good luck with your work.

iartemiev commented 3 years ago

Isn't an AppSync subscription generally designed to only check auth at connection time, not on each message (and thus a single O(n) call relatively infrequently? Is this not supported by DataStore?

@iyz91, AppSync only evaluates auth for subscriptions at connect time, which makes using an array of owners for the auth rule (like what @mir1198yusuf was exploring as a potential solution) incompatible with AppSync subscriptions. In order to use an array of owners, you would need to be able to check the current user against the changing list of owners for each message.

In other words, AppSync subscriptions will not work as expected for the following schema:

type Model @model @auth(rules: [allow: owner, ownerField: "admins"]) 
{
  admins: [String]
}

What I was talking about with O(1) vs. O(n) was just comparing a single equality check vs iterating through an array, but that's kind of beside the point. The crux of the issue is evaluating auth rules for each message.

Some of the other proposed solutions (like the one you referenced) may or may not offer appropriate authorization (that depends on the exact use case). But regardless, their implementation hinges on customizing GraphQL subscriptions. Currently, DataStore generates the subscriptions in the library when it starts and it's not possible to override these/supply your own. So if the schema you want to implement requires you to specify your own subscriptions, it will not be possible to use such a schema with DataStore at the moment.

The changes we'll need to make are quite complex and require a deep dive from the team. It will take some time to do this right and to enable these use cases. In the meantime, please keep sharing feedback with us about the kinds of applications you would like to build. We take all of this into account as we strive to improve DataStore for all of our customers.

iyz91 commented 3 years ago

@iartemiev Thank you for the clarification and detailed response. I'm sure this is very complex and I wish you and your team the best of luck in enabling these use cases. Going forward:

mir1198yusuf commented 3 years ago

Posting it here as I did not get any reply after asking thrice on discord -

what I was planning to do in general was -

assume single table design.

in datastore.configure, for syncExpression we will use a promise.

in that promise, we will fetch the current user, and all groups that user belongs to .

so an array consisting of userid and all groupids will be created.

since both userid and groupid is stored in PK1 field only, we will sync only those records where PK2 equals to elements in array

so in this way only posts for user groups are synced.

2nd point - for mutation, assume evil user can do anything at front-end (can also delete or create post in another group by using different groupid for which he is not part of )

so post is created locally. now while backend syncing, the lambda resolver will check if the user who called the graphql mutation and get the sub of that user. in lambda resolver it will then fetch all users (for creation of post ) and all admins (for deletion of post) and if user is present in that list, allow the mutation or else reject.

for query also we will first check if user is part of group in backend if yes then allow query. also by Datastore.configure syncExpression it won't sync for posts of unauthorised groups.

3rd point - subscription. suppose user is on group posts screen, we will use Datastore.observe (Table, groupid) so any new posts for that group will be observed.

Now 2 questions -

  1. can evil user do Datastore.observe (Table, unauthorised_groupid) ? if he is able , will he get that posts also

  2. if the above method possible if we are to somehow figure out how to do lambda resolver ?

iartemiev commented 3 years ago

I'm having trouble fully following this. Can you please provide a sample GraphQL schema for the data model you're proposing?

As I've stated before, DataStore uses subscriptions behind the scenes to keep the store up to date. It's not possible to customize these subscriptions currently, so I don't see how something like this could work with DataStore. DataStore.observe is something different altogether, it's a mechanism for notifying the UI of changes, but it doesn't create the actual underlying AppSync subscriptions. Those are generated and handled internally by DataStore.

Any time you're dealing with a changing array of users/groups - subscription auth is not going to work with that at the moment, due to the AppSync limitation we've discussed above.

mir1198yusuf commented 3 years ago

@iartemiev - I tried several days trying to come up with something but I think irrespective of any approach, Datastore has a block in each of it. So I think , I have no choice but to wait for the new features/announcements from Datastore.

Ironically, even AppSync would not be sufficient to implement the whatsapp schema which I mentioned above. (This means @dabit3 's whatsapp schema would not work as it uses owner-group auth of members field which does not work for reasons you pointed out before).

The only way AppSync would work now is by using @function directives for each query/mutation operations. (that is using lambda resolvers,, also I am ok using lambda resolvers (as it abstracts the VTL as in this post https://www.timveletta.com/blog/2020-12-08-direct-lambda-resolvers-with-aws-amplify/ ) and not direct lambda resolver which completely removes the VTL ). The former indirect lambda resolvers is much easy to configure as compared to direct lambda resolver.

mir1198yusuf commented 3 years ago

Any updates on this issue ? especially offline feature...

danrivett commented 3 years ago

For reference I also ran into similar problems with multi-owner access and wrote it up previously in #7989 as didn't come across this issue when searching.

My ticket doesn't contain anything in addition to what Ivan has mentioned here, but it's heartening for me at least to see that the multi-owner use case seems to be quite a common requirement. Our upcoming roadmap over the next 3 months relies pretty heavily on it - for team-based workflows, messaging between individuals (which currently requires us to duplicate message records one for the author and one per recipient) etc.

Supporting team-based ownership/access rules is my number one request from DataStore, so would love to see it added.

iartemiev commented 3 years ago

@danrivett, @mir1198yusuf - a quick update on this: the AppSync team is actively working on a solution on their end that would (among other things) enable this use case. Once they have completed and released this feature, the Amplify client-side libraries will work on utilizing it on our end, specifically to add the functionality requested in this issue.

It's prioritized on both the service and client side, but it is a complex feature that is still early in the process.

danrivett commented 3 years ago

That's fantastic @iartemiev. That's really encouraging, thank you for the update!

I can fully understand it may take a while for this to get implemented (likely on the order of several months just for AppSync functionality to be implemented, and then the same again for DataStore if none of it can be done in parallel) but the fact this use case is understood, prioritized for inclusion, and is starting to be worked on is greatly encouraging because I was getting quite concerned we'd hit quite a big road block and we'd have to manage in the long run all the complexity of replicating and keeping in sync multiple physical records for each logical record in order for each person to have appropriate data access without letting everyone else have access.

Thanks for the update Ivan, made for a great start to the week.

iartemiev commented 3 years ago

No problem, Dan! We'll post an update here as soon as the AppSync feature in question gets released

zirkelc commented 3 years ago

I haven't ran into this issue yet, but I'm following it because I got similar requirements.

Is the problem with multi-owner access only concerning the DataStore on the client side, or also the whole GraphQL API deployed by the CLI?

danrivett commented 3 years ago

@chriszirkel it's both, because AppSync itself doesn't allow you to subscribe to updates if the owner field on the type is an array of permitted owners, only a single owner id. When using an owners array field, currently you either have to subscribe without specifying any owner value (and so receive updates for every item, even ones not permitted to read), or subscribe and specify an exact match of the whole owners field array.

What we want to be able to do is subscribe specifying the current user and AppSync returns updates for any record that includes the current user id in the owner field array. This requires a change to AppSync itself to support this. After that, DataStore then needs to be updated to make use of this change.

Hope that helps.