aws-amplify / amplify-android

The fastest and easiest way to use AWS from your Android app.
https://docs.amplify.aws/lib/q/platform/android/
Apache License 2.0
249 stars 117 forks source link

Rapid update of a newly created record breaks Sync Process #1286

Closed sbaxter closed 3 years ago

sbaxter commented 3 years ago

I expect to be able to create a record in the DataStore, immediately update the record, and have AppSync publish the changes without fail; but I am presently experiencing an issue where quickly updated records break the AppSync process.

The Exception encountered reads:

DataStoreException{message=Wanted 1 metadata for item with id = eec4e87b-fd3c-47e0-9785-d1e38021feba, but had 0., cause=null, recoverySuggestion=This is likely a bug. please report to AWS.}

Once this issue is encountered in the application, all subsequent syncing fails and we lose the ability to maintain current data in the DynamoDB tables. There is no impact to the local Datastore, as records can continually be created and updated. There is no indication that this is happening, other than by viewing the logs.

It is possible that I am implementing the update using a Bad Pattern and am open to critique ;).

Observed on Amplify Android 1.17.1 and 1.17.3. Pixel 4a ADB connected device. Android Studio 4.1.2. (Update: Android 10, API 29.)

Update: I tested with both Auto merge and optimistic concurrency. I do not believe it is a back end issue, though. Also, it seems instituting a Thread.sleep of ≈50-60ms between create and update is the sweet spot to let the local transactions to play out without issue.

joekiller commented 3 years ago

This seems similar to https://github.com/aws-amplify/amplify-android/issues/1055 and https://github.com/aws-amplify/amplify-android/issues/812

Also the graphql backend for the above issue is configured to use Auto Merge.

joekiller commented 3 years ago

The logs indicate that our update to the object is being accepted by the Datastore itself but then rejected by the Syncengine because we have a mutation in the outbox but we expect a mutation in the outbox as we mutated it after saving it and getting the result back.

Support suggested we query the object and then mutate however the race condition still exists.

This really caught us off guard because it works fine offline.

We create an object, save it, use the copyOf builder, and save again and get an unrecoverable Datastore exception.

Our update is rejected by datastore/syncengine/Merger.java#L93-L97 while test PersistentMutationOutboxTest:updateEventPostedWhenExistingOutboxItemUpdate indicates this is a safe pattern while PersistentMutationOutboxTest:enqueueIsSynchronized indicates that two mutations of the same object cannot be enqueued.

Our operation appears to be saving the object synchronously prior to executing the second statement so the first test should apply but the second seems to be applying instead.

2021-04-15 18:41:29.184 32098-32301/com.example.datastore.test I/System.out: INFO/amplify:aws-datastore: Mutation outbox has pending mutation for eec4e87b-fd3c-47e0-9785-d1e38021feba, refusing to merge.

Is this code not okay? Because it causes an unrecoverable synchronization error for us predictably.

CompletableFuture<AmplifyBug> createFuture = new CompletableFuture<>();
            CompletableFuture<AmplifyBug> updateFuture = new CompletableFuture<>();

            AmplifyBug createRecord = AmplifyBug.builder().updated(Boolean.FALSE).build();
            Amplify.DataStore.save(
                createRecord,
                success -> createFuture.complete(success.item()),
                createFuture::completeExceptionally
            );

            AmplifyBug createRecordReturned = createFuture.get();

            AmplifyBug updateRecord = createRecordReturned.copyOfBuilder().updated(Boolean.TRUE).build();
            Amplify.DataStore.save(
                updateRecord,
                success -> updateFuture.complete(success.item()),
                updateFuture::completeExceptionally
            );
            updateFuture.get();
joekiller commented 3 years ago

Hello, as github has noted, I have created an example repository to demonstrate the error. Setup the env, push the amplify stuff in and run the test.

https://github.com/joekiller/ampliy-datastore-android-showstopper

changxu0306 commented 3 years ago

Hello, as github has noted, I have created an example repository to demonstrate the error. Setup the env, push the amplify stuff in and run the test.

https://github.com/joekiller/ampliy-datastore-android-showstopper

Hi @joekiller , Thanks for providing the code! I'll try to run it to see what happens there.

richardmcclellan commented 3 years ago

hey @sbaxter and @joekiller, thank you so much for the detailed code sample. I can reliably reproduce this issue on my side. You've identified a scenario that DataStore just doesn't handle correctly. That is:

  1. DataStore.save() a new item
  2. Process the item creation by sending a create mutation to AppSync.
  3. DataStore.save() an update to the same item created in step 1, before step 2 completes.
  4. Process the response from step 2 - this step actually fails because if there is an item in the mutation outbox, we don't replace it, we just silently drop the response, because we think the item in the outbox is more up to date.
  5. Process the item update by sending an update mutation to AppSync. This fails because updates require the _version, which we would normally get in the response for the create mutation, but that didn't happen as described in step 4.

As a short term workaround, you could consider listening for the outboxMutationProcessed hub event, and waiting to send your update until you are notified by that event that the create mutation was successful.

I will definitely look into a solution to try to resolve the actual issue though so that you won't need that workaround.

sbaxter commented 3 years ago

@richardmcclellan thanks for taking a look. I toyed around with the HubEvents as a workaround, but we ended up removing all of our updates in favor of an insert only approach for now. We'll take a deeper look and share more findings once our current development effort finishes.

joekiller commented 3 years ago

Thank you for digging into this @richardmcclellan . I'll review the change Monday.

jamesclement commented 3 years ago

Still seeing this exact issue on current latest release (1.18.0)

Any assistance would be great.

richardmcclellan commented 3 years ago

Still seeing this exact issue on current latest release (1.18.0)

Are you still seeing this issue after a fresh install with 1.18.0? If you updated an existing app that was using 1.17.1 to 1.18.0, then I imagine it might happen again, but I would not expect it to happen on a fresh install with 1.18.0.

shantanu-saha-europcar commented 2 years ago

I got this issue today on 1.36.5

DataStoreException{message=Failed to process java.lang.RuntimeException: DataStoreException{message=Wanted 1 metadata for item with id = fa08593f-17ff-4f10-95f4-147becdbf287, but had 0., cause=null, recoverySuggestion=This is likely a bug. please report to AWS.}, cause=null, recoverySuggestion=Check your internet connection.} at com.amplifyframework.datastore.syncengine.MutationProcessor.drainMutationOutbox(MutationProcessor.java:130) at com.amplifyframework.datastore.syncengine.MutationProcessor.lambda$startDrainingMutationOutbox$1$com-amplifyframework-datastore-syncengine-MutationProcessor(MutationProcessor.java:103) at com.amplifyframework.datastore.syncengine.MutationProcessor$$ExternalSyntheticLambda3.apply(Unknown Source:4) at io.reactivex.rxjava3.internal.operators.observable.ObservableFlatMapCompletableCompletable$FlatMapCompletableMainObserver.onNext(ObservableFlatMapCompletableCompletable.java:97) at io.reactivex.rxjava3.internal.operators.observable.ObservableObserveOn$ObserveOnObserver.drainNormal(ObservableObserveOn.java:201) at io.reactivex.rxjava3.internal.operators.observable.ObservableObserveOn$ObserveOnObserver.run(ObservableObserveOn.java:255) at io.reactivex.rxjava3.internal.schedulers.ScheduledRunnable.run(ScheduledRunnable.java:65) at io.reactivex.rxjava3.internal.schedulers.ScheduledRunnable.call(ScheduledRunnable.java:56) at java.util.concurrent.FutureTask.run(FutureTask.java:266) at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:301) at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1167) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:641) at java.lang.Thread.run(Thread.java:923) 2022-09-21 12:37:07.487 19595-19825/com.emobg.emobgops I/amplify:aws-datastor

shantanu-saha-europcar commented 2 years ago

Wanted 1 metadata for item with id

Item is available in both local table and server but no entry in ModelMetadata. _version: 1 For testing, I added the entry in the ModelMetadata table manually, sync is okay now.