aws-amplify / amplify-js

A declarative JavaScript library for application development using cloud services.
https://docs.amplify.aws/lib/q/platform/js
Apache License 2.0

DataStore: support for multi-tenant apps with sharing? #5119

Closed: tslocke closed this issue 3 years ago

tslocke commented 4 years ago

As of today, the DataStore syncs the whole DataBase

(From https://github.com/aws-amplify/amplify-js/issues/4957#issuecomment-589760076)

Could I suggest this information be presented front and centre at the start of the docs? This is fundamental to understanding what the offering is here. I stumbled across the above issue by accident before I invested time learning Amplify DataStore and I'm very glad I did!

This seems like a significant departure from the architecture of a typical "data in the cloud" app, and I'm struggling to understand how it can be practical for many kinds of apps.

To take a simple example -- a multi-tenant "to-do list" app -- I would normally expect to have a single database on the server. There would be user-ids on the to-dos or the lists. Obviously having each web/mobile client sync everyone else's to-dos doesn't make sense.

Maybe this would be handled via auth rules? Maybe I create isolated databases for each user? I haven't found either in the docs so far.

Both of those break down in the case of sharing. Maybe I want to delegate to-dos to other users. Maybe it's a tumblr like app, where everything is readable by everyone. Clearly, I still don't want each client to sync everyone's posts.

Am I missing something?

Thanks!

kwoxford commented 4 years ago

I raised this a few days ago. Like you I've looked at using both auth rules and isolated databases but they don't really work in any non-trivial use case.

Despite notes and todos being used as examples in the docs, syncing everyone's data to every client would be a huge breach of privacy laws in any real-world application. And if there are lots of users, syncing could become unwieldy.

The problem is conceptual: individual databases are stored in users' apps but they're all synced to the same back end. There has to be some way of keeping them separate.

The simple solution is:

  1. Allow the use of predicates in the base and sync queries (e.g. in DataStore.config) so individual datasets can be kept apart from each other.

  2. Manually start syncing only after the predicates have been set, e.g. once the user has logged in and their ID is known (DataStore.start()).

I've had a look through the code and these two things seem fairly doable.
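
A rough sketch of what that proposal might look like in practice (hypothetical: the syncPredicates key below does not exist in DataStore today, while DataStore.configure(), DataStore.start() and the Auth call are real APIs):

// Hypothetical sketch of the proposal above: per-model sync predicates plus a
// manual start once the user is known. `syncPredicates` is NOT a real option.
import { DataStore } from '@aws-amplify/datastore';
import { Auth } from '@aws-amplify/auth';
import { Todo } from './models';

async function startSyncForCurrentUser() {
  // 1. Resolve the current user first, so the predicate can reference them
  const { username } = await Auth.currentAuthenticatedUser();

  // 2. Configure the (hypothetical) per-model sync predicates
  DataStore.configure({
    syncPredicates: { Todo: (t) => t.owner('eq', username) },
  });

  // 3. Only then start syncing
  await DataStore.start();
}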

tslocke commented 4 years ago

The solution proposed by @kwoxford would help in some situations, but still leaves a lot lacking for a proper client/server data architecture.

Consider a Slack like app - is it going sync all historical data forever for every channel I have access to? If not, does that mean I can't search back through history?

Consider a social network like app that fills up with lots of content over time. There would be a huge amount of data I am not allowed to view, a huge amount that I am allowed to view but definitely don't want to sync (but do want to search), and a comparatively tiny amount that I actually want to look at, and all with complex rules and relationships defining what's what.

I've picked a difficult example, but I think almost all cloud based apps look like this to some degree or another.

For this to become a generally useful data store, I can't see a way round supporting proper server-side querying, and only syncing data returned by queries.

undefobj commented 4 years ago

@tslocke we're going to be picking up work in the next couple sprints on designs for letting you specify more @key capabilities on the base and delta queries, as well as filters and sorting for situations like this. If you could let us know the details of what your ideal data model would look like (e.g. GraphQL schema) around this and maybe app code it would help us to ensure that we're designing to fit your needs appropriately.

mdebo commented 4 years ago

@tslocke I agree with you. I don't think eagerly loading all data is the right way to do it, even in the case of a simple app (not a multi-tenant one). For me, a good strategy would be something closer to Apollo's cache-and-network cache strategy: return results from the DataStore (local database first) to be as fast as possible, and at the same time query the network to update local items. There would be no need to populate the local database by syncing all data; just focus on syncing the data the user actually queries. Also, syncing without using the indexes the user defined via @key directives becomes really expensive in scan operations. That's another reason I think focusing on the queries the user makes is the right approach: the DataStore would be populated by the data returned by queries the user actually runs, using the indexes they defined themselves (no scan operations).
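
For reference, the Apollo behavior being described here is the cache-and-network fetch policy; a minimal sketch (the query, endpoint URL, and client instance are purely illustrative):

import { ApolloClient, InMemoryCache, gql } from '@apollo/client';

// Hypothetical query and client, purely for illustration
const TODOS_QUERY = gql`query Todos { todos { id title done } }`;
const client = new ApolloClient({ uri: 'https://example.com/graphql', cache: new InMemoryCache() });

// cache-and-network: return cached data immediately (local-first), then refresh from the network
client.watchQuery({ query: TODOS_QUERY, fetchPolicy: 'cache-and-network' })
  .subscribe(({ data }) => console.log(data)); // fires for the cache hit and again after the network response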

undefobj commented 4 years ago

@mdebo The different network strategies are unnecessary in DataStore and a mental overhead for developers; the same behavior is built into the sync process automatically by hitting the local models first while GraphQL subscriptions update them in the background. The effect is the same, you just don't need to think about it.

To be clear, DataStore does not just pull down "the whole database" even today; you can configure the maximum number of rows for the base table sync. Performing @key operations on GSIs with filters is not the same as doing a scan; it is essentially a scope to specific model types using DynamoDB queries, which has the net effect of "queries done by the user" from the local device.
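
For reference, the sync-size controls that exist today are set through DataStore's configuration; a minimal sketch (option names as documented for amplify-js DataStore; the values are illustrative):

import { DataStore } from '@aws-amplify/datastore';

DataStore.configure({
  maxRecordsToSync: 10000,   // cap on records fetched per model during the base sync
  fullSyncInterval: 24 * 60, // minutes between full (base) syncs; delta syncs run in between
  syncPageSize: 1000,        // page size for sync query results
});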

We have this on our roadmap for upcoming sprints, so let's focus on use cases, not implementation. If you give us your data model and application use cases (the more example code/schema the better) we can ensure that we're meeting your needs in the design.

tslocke commented 4 years ago

@undefobj replying to your comment from a couple of weeks ago, asking what my data model looks like, I'm not building anything with DataStore right now, so I can't give you anything specific.

I'll probably check back in a few months time to see what capabilities you've added. It's a really nice vision for a high level DB API, but right now it seems like it's only suitable for single user apps.

rpostulart commented 4 years ago

Ok, I am looking forward for this feature, here you can find the data model and use case I am using: https://github.com/aws-amplify/amplify-js/issues/5222#issuecomment-609598129

karrettgelley commented 4 years ago

@tslocke @kwoxford I second y'all's thoughts. If I'm going to use DataStore, I can't be syncing the entire db. That sounds really expensive. I really just need to cache queried data. Firebase approaches it this way and I hope Amplify goes a similar way. Maybe I'm wrong and just don't understand DataStore, but what I really want is the API module with offline capabilities.

kwoxford commented 4 years ago

Since DataStore is supposed to be client-first, caching queried data client-side is back to front. The issue, surely, is that when you have multiple copies of an app, each with different user data, you don't want every app to have a copy of the entire dataset.

undefobj commented 4 years ago

Hi as noted in the above comment this is on our upcoming roadmap to address, and we just had an internal design session last week. We're starting work in the upcoming sprints to deliver the feature.

Caching is actually not the correct way to address this request, and DataStore doesn't use a "cache". Caches are not databases/datastores. They are fraught with issues around referential integrity and offline updates. Our offline support is more holistic and we do not want customers to have to deal with complexities related to caches.

The request in this issue is related to controlling the base sync to target finer grained subsets of your data by applying indexes, filters, and sorting in conjunction with the existing functionality of controlling the max records to sync.

The API category having offline features is something we're looking at later in the year. It will require design work, as we don't want to bring in all the downsides of caching outlined above.

jesper-bylund commented 4 years ago

Well wow... just wow. If I had known this, I would not have invested any time into the platform. This means that my project, several months in, is just junk.

This is very badly communicated. Thank you @tslocke for highlighting this.

tslocke commented 4 years ago

@raelmiu you can look into converting your DataStore models into a client-side API on top of something like Apollo Client. The backend can come from e.g. Hasura. It's nowhere near as effortless as the DataStore vision, but it's pretty good. You have to think about the cache, when to re-fetch, optimistic updates, etc., but hopefully you'll be able to work with large parts of your existing code unchanged.

@undefobj "the request in this issue is related to controlling the base sync to target finer grained subsets of your data". I don't think that captures the problem here. There are many apps where the subset of data that could potentially be needed on the client can be vastly bigger than what can realistically be synced. The data needed is a dynamic issue, determined by what the user does in the app.

I hope you can solve these problems and I'm definitely going to keep an eye on the project, but I don't see a proposed solution yet.

jesper-bylund commented 4 years ago

@tslocke thanks, that's what I'm looking into doing now. DataStore seems to have added crud to my schema though. Have you had any such issues?

tslocke commented 4 years ago

I came at it the other way—I'd already done a fair amount of work on an Apollo/Hasura stack, and then discovered DataStore. I spent an afternoon looking at DataStore thinking dammit maybe I should have gone with this, and then came across this issue.

If you go with Hasura, I don't think it can import a GraphQL schema anyway. You define your database using their tools (migrations or a browser based interactive tool) and the GraphQL schema is generated.

rpostulart commented 4 years ago

Is there an outline if multitenant support is taking weeks or months? This makes my decision for my next app easier :)

ltaljaard commented 4 years ago

> Is there an outline if multitenant support is taking weeks or months? This makes my decision for my next app easier :)

Yes please, this feature should be at the very top of the feature request list in my opinion as the lack of selective syncing to the device makes the entire solution pretty much unusable for anything other than a demo app.

manueliglesias commented 4 years ago

Hi everyone, wanted to give a bit of an update.

This is the highest issue on our backlog, as @undefobj noted, and we're working towards it. We did some initial design work which should address the requirements, including dynamically filtering the sync by what the user does in the app. There were a few other things we uncovered while looking at this design that we had to do internally in the DataStore code (such as addressing sync status notification, progressive data fetching/rendering, and performance optimizations), which we're finishing up and hope to release in the next couple of weeks. As that releases we'll be able to start implementing this feature. The design we're looking to implement is outlined below; please let us know if this wouldn't meet your use case.

The idea is to expose a key in the configuration where you can specify, on a per-model basis, which filters should be sent when syncing with the backend. These filters would be provided as predicates (the same kind you use to pass conditions when querying the DataStore) or as a function that returns a promise resolving to a predicate (to allow async flows like getting the user identity from the Auth category).

The usage would be along these lines:

// Standard behavior today
DataStore.config({}); // Runs syncPosts with a scan as the base sync

// Adding a filter
DataStore.config({
  syncFilter: { Post: c => c.rating("gt", 4) } // Runs syncPosts with a filter as the base sync
});

// Adding filters for more than one model
DataStore.config({
  syncFilter: {
    // Runs syncPosts with a dynamic filter
    Post: async () => {
      const owner = (await Auth.currentAuthenticatedUser()).username;

      return c => c.userId("eq", owner);
    },
    // Get comments created in the last 15 days
    Comment: c => c.createdAt("ge", Date.now() - (1000 * 60 * 60 * 24 * 15))
  }
});

Let us know your thoughts on this, hopefully it covers a good amount of the use cases mentioned so far.

rpostulart commented 4 years ago

This is exactly what we need. This would be great! I really appreciate what you and the team are doing. I don't want to pressure you, but it is good to know that this is coming soon; that makes the architectural decisions easier.

ltaljaard commented 4 years ago

@manueliglesias I think our use case is quite common. We have a hierarchical data structure (top to bottom, widening at the bottom) where each device needs to receive a certain part of the tree. There are multiple object types (models/tables) that need to be synced down, all related to each other through foreign keys (there are one-to-many and many-to-many relationships).

Example: Org Unit 1 contains Employees 1, 2, 3 as well as Org Unit 2; Org Unit 2 contains Employees 4 and 5. When a user linked to Employee 2 logs on, that user should receive the entire Org Unit 1 (including Employees 1, 2, 3, Org Unit 2, and Employees 4 and 5). A user linked to Employee 4 should only receive Org Unit 2 and Employees 4 and 5 on their device.

So basically it should be possible to specify which part of the org structure should be received, given a certain entry point into the tree. This can be achieved through GraphQL queries with join-like conditions, so as long as it's possible to have full control of such queries this kind of use case should work out fine.

chernando commented 4 years ago

I've been trying to work through this issue, and the resolver for sync${pluralTypeName} is the real show-stopper.

You can see the test project I've made here https://github.com/chernando/broken-amplify-datastore

A simple model like:

type PrivateNote
  @model
  @auth(rules: [{allow: owner}])
{
  id: ID!
  content: String!
}

generates a Query.listPrivateNotes.req.vtl that always uses Scan on the PrivateNote Table.

You cannot change the primary index; DataStore needs id: ID!. But you can add a secondary index by changing the schema to:

type PrivateNote
  @model
  @auth(rules: [{allow: owner}])
  @key(
    name: "byOwner",
    fields: ["owner"]
  )
{
  id: ID!
  content: String!
  owner: String!
}

So you can override Query.listPrivateNotes.req.vtl to use a Query on the secondary index byOwner, like this: https://github.com/chernando/broken-amplify-datastore/blob/master/amplify/backend/api/reactamplified/resolvers/Query.listPrivateNotes.req.vtl. And it works.

But DataStore uses syncPrivateNotes and its resolver is:

{
  "version": "2018-05-29",
  "operation": "Sync",
  "limit": $util.defaultIfNull($ctx.args.limit, 100),
  "nextToken": $util.toJson($util.defaultIfNull($ctx.args.nextToken, null)),
  "lastSync": $util.toJson($util.defaultIfNull($ctx.args.lastSync, null)),
  "filter":   #if( $context.args.filter )
$util.transform.toDynamoDBFilterExpression($ctx.args.filter)
  #else
null
  #end
}

As documented in https://docs.aws.amazon.com/appsync/latest/devguide/resolver-mapping-template-reference-dynamodb.html#aws-appsync-resolver-mapping-template-reference-dynamodb-sync, the "Sync" operation only supports a filter.

So let's use that filter by overriding the resolver with this: https://github.com/chernando/broken-amplify-datastore/blob/master/amplify/backend/api/reactamplified/resolvers/Query.syncPrivateNotes.req.vtl

But Sync is still a Scan. (Screenshot omitted.)

Which means that as the table grows, we'll pay more and more read units on each sync.

So I don't see how https://github.com/aws-amplify/amplify-js/issues/5119#issuecomment-627718540 would help if the Sync request mapping is not changed to use a Query, or at least extended to accept hints and/or query other indexes.

I just hope I'm wrong and that I'm missing something obvious (please, please, please). But as it stands, we're degrading DynamoDB performance.

kwoxford commented 4 years ago

This is very much what I need, but I have a worry.

My use case is a multi-event app for conferences. The (anonymous) user is shown a list of events and chooses one which is then loaded into the app; they can subsequently switch to another event if they wish. Every table has an eventID which can be used in the sync queries.

I envisage the list of eventIDs being pulled in by a call to a REST API when the app launches.

So far so good, but the thing that worries me is switching to another event. Is it going to be possible to call DataStore.config() again with different filters? Will that nuke the store and start afresh (which is my preferred behaviour) or will I end up with data for two eventIDs?

This is a serious real-world scenario based on Apple's recommended solution for apps in the event industry. There are a lot of similar apps out there already.
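
For what it's worth, one way the event switch could look, sketched against the proposed API above (the syncFilter key and the Session model are hypothetical; DataStore.clear() and DataStore.start() are real APIs, and clear() does wipe the local store):

import { DataStore } from '@aws-amplify/datastore';

async function switchEvent(newEventID) {
  // Drop everything synced for the previous event from the local store
  await DataStore.clear();

  // Re-apply the (proposed, hypothetical) per-model sync filters for the new event
  DataStore.config({
    syncFilter: { Session: c => c.eventID("eq", newEventID) },
  });

  // Start syncing again with the new filters
  await DataStore.start();
}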

tslocke commented 4 years ago

@manueliglesias you mentioned the proposed solution allows for "dynamically filtering the sync by what the user does in the app", but the example code you posted doesn't show that.

Can you post some code that shows how you would express a filter for the situation where there are millions of records a given user could potentially view, and they only get synced to the client as the user browses around?

The API you posted doesn't seem to cater for that.

undefobj commented 4 years ago

@chernando Delta Sync uses a DynamoDB Query and the Base Sync uses a Scan. The Base is a periodic global catch-up process and doesn't run all the time (it's also configurable). We're working with the AppSync team to add an optimization for using a Query in the service but this would be transparent to the client API since it happens in the GraphQL resolver in the service.

chernando commented 4 years ago

@undefobj I really appreciate your quick response.

> @chernando Delta Sync uses a DynamoDB Query and the Base Sync uses a Scan.

Yeah, Delta Sync works as expected.

> The Base is a periodic global catch-up process and doesn't run all the time (it's also configurable).

But even one run from time to time can ruin everything if the table is big enough. The only way I see to keep it scalable is disabling it entirely, and that makes DataStore dull.

> We're working with the AppSync team to add an optimization for using a Query in the service but this would be transparent to the client API since it happens in the GraphQL resolver in the service.

I have no doubt you're working hard on this and other important issues, and don't get me wrong: Amplify & DataStore have great potential. And I fully understand you cannot commit to dates.

For me (and others, I presume) these issues are deal-breakers when considering DataStore for new projects. And what's worse, you only discover them once you're already committed.

The docs are awesome, in content and style, but I cannot find any of this in them. Please, in the meantime, document these issues. As @tslocke said in https://github.com/aws-amplify/amplify-js/issues/5119#issue-582983868.

dazweeja commented 4 years ago

After reading these comments, I can't say that I'm clear on the use case for DataStore. I couldn't find a doc listing the motivation behind it, so I naively thought it was intended as a convenient wrapper around AppSync that abstracted away dealing with queries, mutations and subscriptions, and incorporated a persistent cache. But it seems like it might be something else where basically the database - or a subset of it - is mirrored on the device and I'm not sure how practical this is for any non-trivial app.

For example, if I was building an app like Twitter with thousands of posts a day, I might have screens with my feed (based on those followed), my own tweets and my liked tweets. I'd probably expect to load down 50 or so initial tweets for each screen and as I scrolled, the app would load more. I wouldn't expect or desire to use my bandwidth or storage on loading any more than that but I would expect to be able to search all tweets, as well as being able to compose or "like" tweets while offline. This sort of use case seems pretty basic to me but I'm not sure how this is covered by the solution proposed by @manueliglesias. I'd need separate filters for the same type but on each screen and ideally I'd like them all to run in parallel on app boot/foregrounding. And also ideally not download any data beyond this unless requested.

A second example might be an Uber Eats style app for drivers, where there'd be filters for the driver's current and past orders, as well as a real-time geolocation filter for available nearby orders. The latter filter would obviously be quite dynamic (as the driver moved) and is possible in AppSync now but I'm not sure how it fits into the DataStore paradigm.

Is it intended that DataStore will be suitable for these sorts of use cases?

undefobj commented 4 years ago

@dazweeja DataStore architecture and motivation is here: https://docs.amplify.aws/lib/datastore/how-it-works/q/platform/js

As noted earlier we have outlined a design for filtering on the sync operations more dynamically which should account for your use cases, as well as more optimizations on the backend (independent of the client changes). You should be able to pass your filters in different lifecycle points of your app, and also use pagination options appropriately. While DynamoDB doesn't support geosearches, Elasticsearch does and you could use DataStore in conjunction with @searchable for geosearches via Elasticsearch and the API category.

With DataStore, not only will local operations be faster because the queries happen on device, eliminating the physical constraints of the network, but they'll also continue to work offline, and the programming interface is safer compared to manual cache updates.

For the most part, what we have outlined above on upcoming improvements will address your use cases, with the exception of the optimization around prefetched pagination (DataStore does an eager version with progressive fetching in the background). For that we have a separate initiative later this year to let you use the API category interfacing with DataStore's persistence layer, and it could do this extra optimization if needed. However, that's a separate concern from this thread, so if you'd like to lay out requirements for such a feature please open a separate issue.

dazweeja commented 4 years ago

@undefobj that sounds amazing. By motivation, I meant the decision to create an eagerly-fetched datastore rather than a lazy-fetched cache and the practicalities of each approach for a non-trivial app.

I'm sorry for the noob question but I'm still unclear about sync filtering vs. queries. Say I have a Twitter app with millions of tweets. First I want to see the 50 most recent tweets, then the 50 highest-rated tweets, then my own tweets in reverse chronological order, then back to the 50 most recent tweets. Would I be changing the sync filter each time and then running a query? And in that case, what happens to the results from the previous filter? Or would I have no filter (because no common filter is appropriate), in which case DataStore might be eagerly fetching a significant number of tweets that will never be displayed? And in that case, if I ran a query and none of my own tweets appeared in the set of eagerly fetched tweets (because they're spread throughout the millions of others), would DataStore run a query against the entire backing db instead and return them?

ltaljaard commented 4 years ago

@undefobj Could you give us a rough estimate of when beta support for multi-tenancy will be coming out? Also, when will it become a Project?

ltaljaard commented 4 years ago

@manueliglesias Hello, could you advise on the expected timelines for this development?

rpostulart commented 4 years ago

How are things going on this topic?

zhenyakovalyov commented 4 years ago

are there any updates on this please?

eripoll commented 4 years ago

Any news on the topic?

beeirl commented 3 years ago

@undefobj Could you please give us a brief status report?

mauerbac commented 3 years ago

Hi all - sorry for the lack of communication here. I just spoke to the team and this is currently being worked on. We are excited to roll this out.

SebSchwartz commented 3 years ago

@mauerbac Any updates on this?

ltaljaard commented 3 years ago

@mauerbac @undefobj Can we have a status update on this to help us plan our own project?

rpostulart commented 3 years ago

I have checked in the Community Builders group and it should be here in weeks, not months 😁 but maybe we can get a more concrete answer here.

undefobj commented 3 years ago

All - Wanted to provide an update. We have two PRs in review/draft that are pending based on the design discussions that have taken place here as well as with other customers. You can view them here: https://github.com/aws-amplify/amplify-js/pull/7001 https://github.com/aws-amplify/amplify-cli/pull/5586

Thank you for your patience and giving us feedback as we worked through this use case as well as the deployment into the AppSync service to support it.

rpostulart commented 3 years ago

Cool, looks promising!

So when you want to sync multiple models, I expect it works like this, correct?

DataStore.configure({
  syncExpressions: [
    syncExpression(ModelA, () => {
      return (c) => c.rating('gt', 5);
    }),
    syncExpression(ModelB, () => {
      return (c) => c.rating('gt', 5);
    })
  ]
});

undefobj commented 3 years ago

@rpostulart Yup!

iartemiev commented 3 years ago

Selective Sync has been released as part of aws-amplify@3.3.5

Please see this section of the docs on how to utilize it.

rpostulart commented 3 years ago

Wow, that is great. Thanks for all the effort, I will try it out. I don't understand the part of the docs about when a Query is triggered instead of a Scan, though. Is that because a field from a @key is used?

iartemiev commented 3 years ago

@rpostulart thank you! We'll be updating that section of the docs to make it clearer. But you are correct: in order for a Query operation to be performed, you need to specify a predicate that corresponds to the Hash Key, and optionally the sort key, of your GSI.

For example, if you have

type User @model
  @key(name: "byLastName", fields: ["lastName", "createdAt"]) {
  id: ID!
  firstName: String!
  lastName: String!
  createdAt: AWSDateTime!
}

The following predicates will result in a Query:

(c) => c.lastName('eq', 'Smith');
// OR
(c) => c.lastName('eq', 'Smith').createdAt('gt', '2020-10-20');

The Hash Key has to use the eq operator and the Sort Key can use any of the operators listed here, e.g., eq, gt, lt, le, etc.

If the specified predicate cannot be matched to a GSI, it will be applied as a filter to the Scan operation instead of a Query.
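
And for contrast, a predicate that cannot be matched to the byLastName GSI above (firstName is not an indexed field there), so per the rule just described it is applied as a filter on a Scan:

// firstName is not part of the byLastName GSI, so this cannot be matched to an index
// and is instead applied as a filter on the Scan operation:
(c) => c.firstName('eq', 'Jane');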

danrivett commented 3 years ago

This is great work, I love that optimization.

Even better would be if we could explicitly state that we only want a Query for the sync expression, and have it fail on DataStore.start() if the expression cannot be satisfied by a Query operation rather than failing over to a Scan, as this would avoid a potentially costly bill on large datasets.

Perhaps we can have an optional config object passed in at the end of the syncExpression function? Something like:

DataStore.configure({
  syncExpressions: [
    // Note: the callback needs to be async since it awaits
    syncExpression(User, async () => {
      const lastName = await getLastNameForSync();
      return (c) => c.lastName('eq', lastName).createdAt('gt', '2020-10-10');
    }, { useQueryOperationOnly: true })
  ]
});

rpostulart commented 3 years ago

With this

DataStore.configure({
  syncExpressions: [
    syncExpression(Quiz, () => {
      return (c) => c.id("eq", "e27c322b-7e92-4c8f-a064-9e5b9f376e7a");
    }),
  ],
});

DataStore.start();

DataStore is still syncing the other models too. I expected it to only sync Quiz; is that a correct assumption?

This is in my package.json:

{
  "name": "kwizz",
  "version": "0.1.0",
  "private": true,
  "homepage": ".",
  "dependencies": {
    "@aws-amplify/analytics": "^3.2.3",
    "@aws-amplify/api": "^3.2.9",
    "@aws-amplify/auth": "^3.3.1",
    "@aws-amplify/core": "^3.8.1",
    "@aws-amplify/datastore": "^2.7.1",
    "@aws-amplify/interactions": "^3.1.19",
    "@aws-amplify/predictions": "^3.1.19",
    "@aws-amplify/storage": "^3.2.9",
    "@aws-amplify/ui": "^2.0.2",
    "@aws-amplify/ui-react": "^0.2.11",
    "@aws-amplify/xr": "^2.1.19",
    "@testing-library/jest-dom": "^4.2.4",
    "@testing-library/react": "^9.3.2",
    "@testing-library/user-event": "^7.1.2",
    "array-move": "^2.2.2",
    "aws-amplify-react": "^4.2.10",
    "bootstrap": "^4.5.0",
    "depcheck": "^0.9.2",
    "history": "^4.10.1",
    "immutability-helper": "^3.1.1",
    "react": "^17.0.1",
    "react-bootstrap": "^1.2.2",
    "react-bootstrap-table-next": "^4.0.3",
    "react-bootstrap-table2-filter": "^1.3.3",
    "react-bootstrap-table2-paginator": "^2.1.2",
    "react-dnd": "^11.1.3",
    "react-dnd-html5-backend": "^11.1.3",
    "react-dom": "^16.13.1",
    "react-ga": "^2.7.0",
    "react-image-file-resizer": "^0.2.4",
    "react-router-dom": "^5.2.0",
    "react-scripts": "3.4.1",
    "react-select": "^3.1.0",
    "react-spring": "^8.0.27"
  },
  "scripts": {
    "start": "react-scripts start",
    "build": "react-scripts build",
    "test": "react-scripts test",
    "eject": "react-scripts eject"
  },
  "eslintConfig": {
    "extends": "react-app"
  },
  "browserslist": {
    "production": [
      ">0.2%",
      "not dead",
      "not op_mini all"
    ],
    "development": [
      "last 1 chrome version",
      "last 1 firefox version",
      "last 1 safari version"
    ]
  },
  "devDependencies": {
    "react-test-renderer": "^16.13.1"
  }
}

iartemiev commented 3 years ago

@rpostulart this feature allows you to apply a filter to the sync queries of the model(s) you specify, it doesn't affect the behavior of the other models though.

With your configuration, I would expect the Quiz model to only sync down a single record, but all the other models in your schema should keep syncing all the data.
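
In other words, to restrict what syncs for every model you need one syncExpression per model; a minimal sketch using the released API (the Question model and its quizID field are hypothetical additions for illustration):

import { DataStore, syncExpression } from '@aws-amplify/datastore';
import { Quiz, Question } from './models'; // Question is a hypothetical second model

const quizId = 'e27c322b-7e92-4c8f-a064-9e5b9f376e7a';

DataStore.configure({
  syncExpressions: [
    // Only sync the one quiz...
    syncExpression(Quiz, () => (c) => c.id('eq', quizId)),
    // ...and only the questions that belong to it (hypothetical quizID foreign key)
    syncExpression(Question, () => (c) => c.quizID('eq', quizId)),
  ],
});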

rpostulart commented 3 years ago

Cool! Then I understand the implementation and that works fine!

guy-a commented 3 years ago

This is a wonderful update, DataStore only using scans was a real issue for any reasonable app.

I have 3 questions/suggestions:

  1. Prevent scans entirely - This needs to be done in the backend, without having to edit the Velocity templates or similar. Leaving the API open to scan operations can lead to DoS attacks on our AWS bill.
  2. Redundant PK and SK - Is it possible to generate the DynamoDB table with my own partition key and sort key, without the need to add an extra GSI? (I.e. PK="lastName", SK="createdAt" instead of the default PK="id".)
  3. This is not sharing - As wonderful as this update is, I don't think it falls under the definition of sharing. Sharing would be something like: UserA is the owner of Quiz1 and gives complete/partial control over Quiz1 to UserB. Maybe something like Quiz1.owners = [{me…}, {id: 123, ops: ['read', 'update']}, {id: 145, ops: ['read']}]

Thanx :)

danrivett commented 3 years ago

So our team has been building an app that doesn't even require this new selective sync functionality, but we've run into a pretty large impediment to multi-user access: DataStore doesn't appear to work in conjunction with @auth rules that limit access based on the model containing a whitelist of users who have access, even though we implemented it the way the AWS docs outline. Unfortunately, the GraphQL subscriptions that DataStore generates to update the data after the initial sync do not work at all in that scenario.

As an example think of a chat room with multiple members - you want to make sure only those members see the messages as they come in. Our particular example was a timesheet that only the owner and certain approvers should be able to access.

I was thinking this may be a big impediment for others on this thread too, so I wanted to post here to see if any of you have suggestions, especially if you've already designed around this issue.

I wrote up my experience so far in a separate ticket, which I'll link to avoid cross-posting all the details: #7989

grant-d commented 3 years ago

@danrivett unfortunately we were burnt the same way. The marketing of DataStore didn't match the deliverable - we had moved from Firebase believing (due to the pitch) that we'd find at least a foundation of something similar on AWS. In the final hour we had to roll our own sync. The initial design that expects the mobile app to sync *.* was a surprising v1 decision. There's been some movement of late, but we haven't looked at it again.