aws-amplify / amplify-swift

A declarative library for application development using cloud services.
Apache License 2.0
456 stars 198 forks source link

User data saved in DataStore sometimes not synched to DynamoDB #1131

Closed jacobsapps closed 3 years ago

jacobsapps commented 3 years ago

Describe the bug I've been using Amplify to set up the backend for my startup. We launched a few weeks ago, but when I compared our analytics and Cognito to the Users table in DynamoDB a large number of users who created accounts were missing.

To Reproduce It's unclear how this bug could be reproduced, since it's only happening to a number of users in production. It seems like the 'black box' nature of the sync engine

Environment(please complete the following information):

Device Information (please complete the following information):

Additional context To re-iterate, I am seeing all users created correctly in the Cognito user pool and our analytics (Mixpanel), but many of these are missing from the DynamoDB database.

Our architecture for our iOS app involves a single User object in which all the app data for a user is contained, as per the recommendations given on the Amplify documentation about using document-based databases for app user data such as DynamoDB.

This user is created locally with and stored in the DataStore before they create an account (with the ID "local-user-id"), as the user completes some tasks before they are invited to create an account. Once an account is created, the User object is re-initialised using their Cognito ID as their ID and saved to DataStore, with the expectation that this should automatically sync with DynamoDB. The old user object is deleted.

Users with the ID “local-user-id” are blocked by a sync expression I set up when initialising Amplify and deleted when the new user object with the Cognito ID is made:

let models = AmplifyModels()
        let apiPlugin = AWSAPIPlugin(modelRegistration: models)
        let syncExpression = DataStoreSyncExpression.syncExpression(AUser.schema) {
            AUser.keys.id.ne("local-user-id")
        }
        let dataStorePlugin = AWSDataStorePlugin(modelRegistration: models, configuration: .custom(syncExpressions: [syncExpression]))

Data flow in my app works through a listener to the local DataStore Publisher, so the app can reactively respond to changes made to the stored User object through sync or local user actions. As a result, only one user object can be stored at once. When a user logs in, I make an API call to fetch the latest user data (through REST API + API Gateway + Lambda) as I ran into separate problems with the Amplify API library which aren't relevant. When a user logs out, the local DataStore is cleared.

To reiterate, it's not entirely broken - many users are being saved on the backend, but a significant proportion are not. This has proven hard to debug due to the ‘black-box’ nature of the sync - there don't seem to be any APIs to control it, or force it to attempt to trigger a sync, outside the sync expression when first setting it up.

My only other idea is; perhaps there could be a problem with the data model between the app and backend? During development if the app is behind then it fails to sync anything, but I don't see this being a problem that only affects some users.

ruiguoamz commented 3 years ago

Hi, @jacobsapps

Thanks for reaching out. Sorry to hear such a problem happens on your project.

It makes us difficult to help out without reproduction steps, or error logs.

So here are several questions:

  1. What does your schema look like?
  2. Is User Model Type protected by @auth directives? The reason I ask this is: is it possible the User model is deleted by accident?
  3. And our library provides DataStore Hub Events that once you save a model, DataStore sends back OutboxMutationEnqueued and OutboxMutationProcessed eventv to indicate whether the model has been sent to cloud or not. Hope this helps you debug easier.
  4. And I am a bit confused by your SyncExpression, your App is not receiving any User model which ID is "local-user-id" which means there is no User model with ID "local-user-ID" locally. How are you able to delete the user model with ID "local-user-ID"?
jacobsapps commented 3 years ago

Hi @ruiguoamz, thanks for getting back to me.

Our schema is one User @model object with every sub-object aggregated under it, there are no other @model objects or relational connections for now.

The User model object is protected with @auth:

@auth(
      rules: [
         { allow: owner, operations: [create, read, update] }
      ]
  ) 

I wasn't aware of DataStore hub events, this should give me another tool to try and debug it with. I'll think about how I can log these in production and look for errors.

The SyncExpression is merely to prevent the local User object from being synchronised with the DynamoDB backend - so we don't end up with every single user's data being saved with ID local-user-id separately to their real data. The local-user-id user is created locally, and the SyncExpression is intended to stop it from ever leaving the device.

Thanks again for your help

lawmicha commented 3 years ago

Hi @jacobsapps, so as I understand, you are creating a User object with the id "local-user-id", which gets updated in the local store, and you also prevent this object from being synced to the cloud using the sync expression. Once the user has been authenticated, you update the id in the user object, and save to DataStore. When you perform Amplify.DataStore.save(updatedUser), i believe it will create a new user in the local store, which leaves the old user object still in the local store. This sounds like it is fine to do since the user with "local-user-id" is never synced to cloud.

As @ruiguoamz mentioned, there are hub events which you can listen to handle sync failures for this critical user data, but how should you retry? what if the internet connection is down indefinitely until the app is launched later, how does the app retry in the future? I think the app needs two pieces of logic:

  1. On sync failure hub event, attempt to save the user to DataStore again, causing an update to the user in the local store, and sync engine will sync the user to the cloud again.
  2. On app start up, once the DataStore has finished syncing User models, and if there is an authenticated user, query by the authenticated user id. If the User object does not exist from the local store, you should try and perform the Amplify.DataStore.save(user) again. the user does not exist in local store because it may have failed syncing to the cloud before.

There's another option which bypasses DataStore completely which is to persist the data directly with Amplify.API, there are some caveats to this such as making sure the selection set of the GraphQL document matches what DataStore is expecting the format of data to come back as, since a successful mutation will trigger a sync from cloud to the local store for that user object, subsequently a DataStore publisher event. The trade-off here is the simplicity in handling failures when persisting critical data. You can retry immediately when Amplify.API.mutate(.create(user)) fails, and based on the failure to retry or not. This requires a bit of alignment to make sure it is interpolating with DataStore correctly. Let us know if you are able to get it working.

jacobsapps commented 3 years ago

@lawmicha thank you so much for the well thought-out reply.

For #1 - I really like the first idea about listening to sync failures and retrying. I'll build out this functionality soon. It was to my understanding though that one of the benefits of the DataStore sync engine was that it would handle any retries needed automatically - is this not actually the case, do all retries need to be manually sorted; and is re-saving the only way to trigger this?

For #2 - Right now I do try to fetch the authenticated user from a simple REST API call in order to ensure the latest user data is present (I avoided Amplify API for reasons explained below). Are you suggesting that if there is no remote user, I attempt to save the local user object again, with the hope that it would cause the sync engine to trigger and save the user object properly?

At the risk of getting sidetracked, I have use Amplify API before in this project but I ended up removing it because it seemed to cause big problems related to the DataStore publisher - I'm not sure if it was caching things there; but it looked like after fetching another user's data (e.g. for my "Friends List" functionality) it would start showing the other user's data where the logged-in user's data should be. But I might look back into this suggestion if I'm unable to get results from DataStore.

Something else I'm planning to try as well is, removing the syncExpression and local-user-id - this will prevent any user data being saved locally before an account is created; but building it felt like I was fighting against the system. I'm theorising it's likely to be the root cause of the problem.

It may take some time to get back with results as I've struggled to consistently replicate the issue.

jacobsapps commented 3 years ago

@ruiguoamz, thanks again for the earlier response. I've been looking at the DataStore Hub events and I'm trying to reason out the retry architecture, as there don't seem to be any syncFailure events.

Right now I'm setting it up so: outboxMutationEnqueued causes a retry event to be set on a timer outboxMutationProcessed cancels the timer so only events that fail or time out will be retried

Does this sound sensible to you?

jacobsapps commented 3 years ago

I think I've noticed something while working on this which might be very relevant:

The object fetched from my API or saved via datastore events has _version, _lastUpdatedAt and _deleted properties on it. The model object I have from the Amplify codegen does not contain any of these properties, and it will not let me add them separately.

When the datastore hub events trigger, I noticed that whenever the saved objects have nilled-out versions of these properties they are not synched correctly. When they do have these properties, the sync works fine. I think this may be the root cause of the issue I was having - sync breaks when the _version, _lastUpdatedAt and _deleted properties are not set correctly on the DataStore model object.

Are you able to give any insight as to why these might be set or not? I am using the code-generated Amplify objects as DTOs with local versions of each object in order to encapsulate my dependency on Amplify. But the code-generated version of the object does not have the properties I am looking for either so I'm unsure how they are set sometimes.

Let me know if you have any further insights on this - I seem to have found the root inconsistency that is causing trouble though thanks to the Hub events.

jacobsapps commented 3 years ago

@lawmicha @ruiguoamz Final update for today (I promise!) - I implemented the retry logic as you suggested based on the Hub events and the initial outcome seems a bit promising (though I've not tested fully) - when saving the main User object locally it seems to queue a sync event, but fails and has to retry a number of times (I added a 5 sec retry timer). After some time, it eventually succeeds.

I noticed the model object which gets stored successfully is the one which has a non-nil value for _version, _lastUpdatedAt and _deleted as I mentioned above. I'm still unsure though what's causing these properties to sometimes exist, sometimes not, especially since there should only be one copy of it on the local store. Unless it's also fetching stuff remotely and doing some kind of conflict resolution?

royjit commented 3 years ago

Thank you for the details, like you said, getting nil values for the meta fields might be the root cause. It can happen if the the User table was mutated out of Datastore. Few questions to clarify:

  1. Do you mutate your backend DynamoDB manually or outside Datastore (by using API maybe?).
  2. You mentioned that you had API before, are you seeing this issue for user apps when you migrated from API to Datastore?
  3. Does your DynamoDB User table contain items without the metadata fields?
  4. Does your DynamoDB has a user item with ID local-user-id?
github-actions[bot] commented 3 years ago

This issue is stale because it has been open for 14 days with no activity. Please, provide an update or it will be automatically closed in 7 days.

github-actions[bot] commented 3 years ago

This issue is being automatically closed due to inactivity. If you believe it was closed by mistake, provide an update and re-open it.

ruiguoamz commented 3 years ago

Re-opening this issue.

github-actions[bot] commented 3 years ago

This issue is stale because it has been open for 14 days with no activity. Please, provide an update or it will be automatically closed in 7 days.

lawmicha commented 3 years ago

Hi @jacobsapps, any updates on this? please see questions here https://github.com/aws-amplify/amplify-ios/issues/1131#issuecomment-819805695

palpatim commented 3 years ago

Closing due to inactivity. Please feel free to reopen and provide additional info as requested in https://github.com/aws-amplify/amplify-ios/issues/1131#issuecomment-819805695