aws-amplify / amplify-js

A declarative JavaScript library for application development using cloud services.
https://docs.amplify.aws/lib/q/platform/js
Apache License 2.0
9.38k stars 2.1k forks source link

DataStore upgrading from 2.2.6 to 2.2.7+ breaks realtime data #6580

Closed mdoesburg closed 3 years ago

mdoesburg commented 3 years ago

Describe the bug After upgrading the DataStore from version 2.2.6 to 2.2.7, the following error is printed in the console right after the DataStore is ready:

[DEBUG] 46:46.527 AWSAppSyncRealTimeProvider - closing WebSocket...

DataStore version 2.2.7 introduces some bug fix I need:

This issue persists even in the latest DataStore version (2.2.9), and while having all the other Amplify packages up-to-date.

I have these packages installed:

Downgrading the DataStore version to 2.2.6 fixes this issue, but leaves me without the bug fixes.

Expected behavior I expect realtime data to keep working after upgrading to the latest version of the DataStore.

iartemiev commented 3 years ago

Hey @mdoesburg which framework are you using, e.g., React, RN, Expo?

mdoesburg commented 3 years ago

@iartemiev React Native

iartemiev commented 3 years ago

@mdoesburg so you're using the RN CLI specifically, not Expo, correct?

mdoesburg commented 3 years ago

@iartemiev Yes, RN without Expo.

iartemiev commented 3 years ago

Got it, thank you. I will attempt to reproduce and will get back to you if I need more info

iartemiev commented 3 years ago

Wasn't able to reproduce. Tried multiple sample apps on React Native and syncing is working correctly for me on @2.2.9

Could you please share your schema?

mauerbac commented 3 years ago

ping @mdoesburg

mdoesburg commented 3 years ago

@iartemiev @mauerbac Sure! Is there a way for me to share it privately?

iartemiev commented 3 years ago

@mdoesburg one way would be to create a private repo with your project (or whatever part of it you're willing to share and is enough to reproduce this issue) and then add me as a collaborator. I will then be able to clone it, create a new cloud environment in my AWS account, and debug the problem you're experiencing.

mdoesburg commented 3 years ago

@iartemiev I've invited you to a private repo that contains my schema.

iartemiev commented 3 years ago

@mdoesburg got it. Thank you! Any chance you could also include some of the code so that I could just run it and reproduce the issue as quickly as possible?

mdoesburg commented 3 years ago

@iartemiev I know I am making it harder to debug this problem, but unfortunately, due to company policy, I am not allowed to share the codebase.

This problem pretty much occurs right away after the DataStore is ready. The first subscription that gets started when a user logs in is a subscription on the Contact model:

const subscription = DataStore.observe(Contact).subscribe(() => {
  // log some data
});
mdoesburg commented 3 years ago

@iartemiev If you can't reproduce this issue with the shared schema, and after subscribing to any model, then I guess I'll have to further investigate the code changes in DataStore 2.2.7 and see if any other code I have in my codebase could cause this websocket disconnect.

iartemiev commented 3 years ago

@mdoesburg no problem, I totally understand! I'll create my sync-enabled API from your schema and then will just try interacting with the Contact model as you're suggesting. Will report back

mdoesburg commented 3 years ago

@iartemiev Thank you!

iartemiev commented 3 years ago

@mdoesburg I was able to reproduce the issue you're describing. What's interesting, is that if I remove all of the @auth rules from the schema, I no longer get those errors. I'll be investigating the cause today and will be working towards a solution. WIll update this issue once I've made progress.

iartemiev commented 3 years ago

@mdoesburg I've had a chance to spend some time with your schema.

The sync issue is resulting from some of the generated GraphQL queries and mutations when defining specific operations for owner auth, e.g., { allow: owner, operations: [create] }, like you have in your Question model.

If I change the auth rules for Question to { allow: owner } everything seems to be working.

By the way, I'm using the latest unstable versions of Amplify, as we recently merged a PR that fixes a bug with static group auth.

We are currently prioritizing validating the various @auth rules and fixing any related bugs. I will let you know when the operations argument is fully functional with DataStore.

In the meantime, please let me know if changing the auth rule for Question and using the latest unstable versions of the library gets your project syncing.

Also, I also tested this issue: https://github.com/aws-amplify/amplify-js/issues/5290 with the latest aws-amplify@unstable and I was able to get it working when toggling the gender field on the Contact model. I.e., two browsers open, one logged in with the owner, the other with a different user in the Admin group. Toggling in either browser had the change reflected in both. I've not yet validated that in React Native specifically (was testing this in React) but will do so later today and update the issue with my findings.

mdoesburg commented 3 years ago

@iartemiev Thank you! Really appreciate all the effort you guys are putting in these issues. I'll let you know if it works.

mdoesburg commented 3 years ago

@iartemiev Just installed unstable versions for the following packages (which are the ones I use):

"@aws-amplify/analytics": "unstable"
"@aws-amplify/auth": "unstable"
"@aws-amplify/core": "unstable"
"@aws-amplify/datastore": "unstable"
"@aws-amplify/pushnotification": "unstable"
"amazon-cognito-identity-js": "unstable"

It already seems to be working without making the suggested schema change. So to be clear, it seems to already work without modifying the Question auth directive.

iartemiev commented 3 years ago

@mdoesburg that's great to hear! I would suggest keeping an eye on the logs and if you see errors from AWSAppSyncRealTimeProvider mentioning onDeleteQuestion, onCreateQuestion, etc. I would try adjusting the schema.

iartemiev commented 3 years ago

I'll go ahead and close this issue if you don't object.

If you happen to run into more problems with sync using that same (or a similar) schema, please open a new issue and reference this one. I'll hold on to my sample app with your schema, so I'll be able to debug any future issues quickly (should they arise).

mdoesburg commented 3 years ago

@iartemiev Alright, noted. Thank you so much for your help!

mdoesburg commented 3 years ago

@iartemiev Yeah go ahead and close it.

iartemiev commented 3 years ago

@mdoesburg one more thing I forgot to mention. When saving a new Contact without arrangements, I would use the following empty values:

arrangements: [null],
arrangementIds: [''],

As there's currently a bug with optional arrays. It will be fixed soon as well, and you'll be able to just omit them if they're optional in the schema.

mdoesburg commented 3 years ago

@iartemiev I thought everything was working, but running a mutation from the AWS console didn't show the changes for a user in the app, which it did before. This is after changing the schema as you suggested. Did you change the User model at all? That's the only one that still gives warnings after making the suggested changes.

mdoesburg commented 3 years ago

@iartemiev So to summarize:

mdoesburg commented 3 years ago

@iartemiev Also I am aware of the optional array bug, and do something very similar to what you posted. :p

iartemiev commented 3 years ago

@mdoesburg I did not make any changes to the User model. The only thing I changed in the final test of this was I removed the identityClaim arguments with the custom field (on all models), just to speed up the set up on my end. I don't think that would be causing this issue though.

Were you performing a mutation on the Contact model via the AWS Console?

iartemiev commented 3 years ago

@mdoesburg I will attempt to reproduce with an RN app specifically, checking to see if mutations executed via the Console are detected in the app's subscription.

mdoesburg commented 3 years ago

@iartemiev Sorry I am just walking through all the steps again to see if I missed anything. I'll post a reply in a couple minutes.

iartemiev commented 3 years ago

No worries! Take your time

mdoesburg commented 3 years ago

@iartemiev Ok, I have some bad news. I still get AWSAppSyncRealTimeProvider - closing WebSocket... even when I allow all operations for owners on the Question model. I forget to set the log level to debug so this was probably the case earlier too. 😂

iartemiev commented 3 years ago

@mdoesburg Are you seeing any specific errors in the logs? When I was testing this, a few times I'd see the closing Websocket message but without any other errors and if I performed a DataStore operation (e.g., creating a Contact), it would still sync fine

mdoesburg commented 3 years ago

@iartemiev Yeah I get Not Authorized to access onCreateUser on type Subscription for onCreateUser, onUpdateUser, and onDeleteUser, but I don't know if that might be the cause.

mdoesburg commented 3 years ago

And right after one of those: AWSAppSyncRealTimeProvider - Error while unsubscribing undefined

mdoesburg commented 3 years ago

@iartemiev Should I share the whole log in the repo I created?

mdoesburg commented 3 years ago

@iartemiev I went ahead and added the log to the private repo. Let me know if you need anything else

iartemiev commented 3 years ago

Are you logged in with a user pool user?

mdoesburg commented 3 years ago

@iartemiev That log reflects the following:

iartemiev commented 3 years ago

@mdoesburg if you log in and then reload the app, are you still seeing the Not Authorized to access onCreateUser on type Subscription error? Or any other errors?

mdoesburg commented 3 years ago

@iartemiev Yes, I still get those errors, and there is no other errors. (after reloading)

iartemiev commented 3 years ago

@mdoesburg I've done some more testing, this time specifically in RN.

I also ran into issues performing a createContact mutation via the AppSync Console, however, I believe it's unrelated to the issue at hand. I saw the following warning in my instance of React Native Debugger:

[WARN] 20:52.879 DataStore - Skipping incoming subscription. Messages: Cannot return null for non-nullable type: 'AWSTimestamp' within parent 'Contact' (/onCreateContact/_lastChangedAt)

It appears the subscription is sent before all of the necessary data has been populated. I'll dig into that some more.

However, when I tested by running 2 iOS simulators side by side, the sync was working as expected.

I've pushed some files to the repo you shared with me.

The mdoesburg6580RN directory contains a very rudimentary sample app I wrote to test this issue using your schema (as mentioned above, I removed the identityClaim arguments, but everything else is the same).

If you'd like to run it, you can create a new cloud environment for it by executing:

amplify init
? Do you want to use an existing environment? No

Followed by amplify push

The assets directory contains a screen recording I made where I'm running this sample app on 2 simulators, as well as screenshots of the users I have configured in my Cognito User Pool. You'll see that one of them (ivan2) belongs to the Admin group and the other (ivan) is not associated with any group.

In the demo, each simulator is using a different user. Static group auth works as expected - ivan2 receives items that he creates, as well as the ones created by ivan, whereas ivan can only see his items, as expected with owner auth.

I also toggle the value in one of the fields in the model, just like I mentioned in a previous comment. This change is reflected in both simulators.

Please feel free to make any changes you'd like to the sample app. If you can get it to where sync is failing in that sample, please push your changes and let me know. I should then be able to quickly reproduce and debug them on my end.

mdoesburg commented 3 years ago

@iartemiev Ran your sample app with the same exact setup (2 simulators, 2 users etc.), and I am getting the same results as you. So that seems to be working, which is great news! :)

I am going to try and update the packages again on my real app to see if I still get the websocket disconnect. The only difference I see right off the bat is that your sample app uses a newer RN version and newer @react-native-community/netinfo version.

Package Real App Sample App
react-native 0.62.0 0.63.2
@react-native-community/netinfo ^5.8.1 ^5.9.6

I don't see any changes in the RN changelog that could be related to this issue, but maybe a newer netinfo version might help?

I'll let you know if I am still getting the disconnect. The other issue can be closed though!

iartemiev commented 3 years ago

@mdoesburg glad to hear that! Hopefully, we're on track to getting this working for you 🙂 I'll go ahead and close #5290

mdoesburg commented 3 years ago

@iartemiev How come on the sample app we don't get the unauthorized error for the User model?

iartemiev commented 3 years ago

@mdoesburg Just tried it and if I attempt to create a User while signed in with my user that doesn't belong to the Admin group, I get the following error:

[WARN] 41:57.552 DataStore 
{
  localModel: Model {id: "0b0f00f1-2e41-41a9-a0ab-e1339cd0d908", token: "", _version: undefined, _lastChangedAt: 
  undefined, _deleted: undefined}
  message: "Not Authorized to access createUser on type User"
  operation: "Create"
  errorType: "Unauthorized"
  errorInfo: null
  remoteModel: null
}

When signed in with the other user (the one belonging to Admin), I'm able to create a User successfully, which is what I would expect given the @auth rules defined on the model.

Are you seeing a different result?

mdoesburg commented 3 years ago

@iartemiev Your right in your observation that the mutations are properly protected, but the issue is that I instantly get this when the DataStore starts and sets up all the subscriptions:

[WARN] 32:54.749 DataStore - subscriptionError Connection failed: {"errors":[{"errorType":"Unauthorized","message":"Not Authorized to access onDeleteUser on type Subscription"}]}

When I change the User model to this:

type User
  @model
  @auth (rules: [
    { allow: private },
    { allow: private, provider: iam },
    { allow: groups, groups: ["Admin"] }
  ])
...

After adding the { allow: private } rule as shown above, I no longer get the websocket disconnect and everything works.

These specific changes are causing this problem for me. Now I recognize that handleDisconnect was never actually called before :p but it did work before that change.

As shown in my log, it logs the following for the onCreateUser, onUpdateUser, and onDeleteUser subscription right off the bat:

Error while unsubscribing Error: Subscription never connected

Which happens on line 208. In the finally block the _removeSubscriptionObserver method gets called, which in turn calls _closeSocketIfRequired, and finally that method logs:

logger.debug('closing WebSocket...');

It seems that with these changes, if any subscription error occurs, the websocket gets closed?

You can see in my log that all the subscriptions return "type": "complete" right after all the subscriptions get started, instead of the regular keep alive type "type": "ka":

[DEBUG] 32:54.908 AWSAppSyncRealTimeProvider - subscription message from AWS AppSync RealTime: {"id":"a7d24d8e-3182-4ea6-bcce-8cdb905f2b39","type":"complete"}

Is it intended to cancel all subscriptions if one fails?

mdoesburg commented 3 years ago

@iartemiev To answer my own question, this is clearly not intended considering the _closeSocketIfRequired method starts with an if statement to check if there is any active subscriptions:

if (this.subscriptionObserverMap.size > 0) {
  // Active subscriptions on the WebSocket
  return;
}

So the question is why it actually thinks there is no more active subscriptions, and thus closes the websocket?

iartemiev commented 3 years ago

@mdoesburg

but the issue is that I instantly get this when the DataStore starts and sets up all the subscriptions:

Got it, sorry I had misunderstood originally. I'm confused about why those User subscriptions are erroring out in your app but are working fine in the sample...

These specific changes are causing this problem for me. Now I recognize that handleDisconnect was never actually called before :p but it did work before that change.

Exactly (: it was a false negative before.

So the question is why it actually thinks there is no more active subscriptions, and thus closes the websocket?

I'm going to look into this some more and get back to you.

mdoesburg commented 3 years ago

@iartemiev

Got it, sorry I had misunderstood originally. I'm confused about why those User subscriptions are erroring out in your app but are working fine in the sample...

This boggles my mind too... I get no User subscription errors in the sample app, and I am using the same package versions...😂

I've been adding some console.log statements to AWSAppSyncRealTimeProvider.js, but can't really seem to figure out what the issue is. I do see that _sendUnsubscriptionMessage is called for all subscriptions right after this subscriptionError is logged:

[WARN] 45:52.384 DataStore - subscriptionError Connection failed: {"errors":[{"errorType":"Unauthorized","message":"Not Authorized to access onCreateUser on type Subscription"}]}

Which explains why the websocket is closed. Basically all the subscriptions send the GQL_STOP message, and all get closed, thus the websocket closes since this.subscriptionObserverMap.size then equals 0.

I'm going to look into this some more and get back to you.

I really appreciate the help! Hopefully you can find out more then I did.