google / GoogleSignIn-iOS

Enables iOS and macOS apps to sign in with Google.
https://developers.google.com/identity/sign-in/ios
Apache License 2.0
472 stars 189 forks source link

`GIDSignIn.sharedInstance.signIn` does not return a callback sometimes #378

Closed doctorsaway closed 5 months ago

doctorsaway commented 5 months ago

Describe the bug

The issue is, the GIDSignIn.sharedInstance.signIn method does not return a callback sometimes. This bug is a very inconsistent to reproduce. But here are the steps:

To Reproduce

Steps to reproduce the behavior:

  1. Call the signIn method in swift code like so:
    GIDSignIn.sharedInstance.signIn(
    with: configuration,
    presenting: rootViewController,
    hint: nil,
    additionalScopes: ["https://www.googleapis.com/auth/gmail.readonly"]
    ) { result, error in
    // Inside the closure
    }
  2. Observe that closure (callback) never gets executed sometimes, but the google sign in sheet is dismissed.

Usually, this occurs the first time during an app-run and the second time, it works as expected. As a result, the implementation has been flaky in our application, leading to users having to sign-in twice each time (most of the times, since it occurs randomly, there are times when the first sign in attempt works as expected)

Expected behavior

The GIDSignIn.sharedInstance.signIn method to work properly and execute the callback consistently, even if there is some error.

Screenshots

N/A

Environment

Additional context

Seems like this has already been mentioned here: https://github.com/google/GoogleSignIn-iOS/issues/360 but without much context. I have added enough context, hoping to help debug this better.

mdmathias commented 5 months ago

Hi @doctorsaway this isn't reproducible on our end. Please take a look at our example apps to see how you might use this functionality. If there is something that is different, or something that you believe we can address, please reply with more exact reproduction steps (for example, providing context on how your call to -[GIDSignIn signIn...] is made would be helpful).

marchy commented 4 months ago

@mdmathias ⚠️ P1: Blocking users from logging in! We are experiencing the same thing, except that it is not intermittent has overnight stopped working altogether. We've had no change to the same code that was working yesterday, but suddenly today there is no callback firing anymore.

Neither the success nor Cancel scenarios after the Google Sign-in web view is shown fire the callback anymore, however if you click 'Cancel' on the initial alert that is shown prior to the web-view being shown, the callback IS fired (ie: it's definitely a Google SDK issue!)

Google auth dialog

NOTE #1: We tried both async and callback APIs and neither are continuing past the initial invocation. NOTE #2: We tried cleaning/rebuilding the project did not change anything NOTE #3: We ensured our client ID matches and resolves correctly at runtime via GIDSignIn.sharedInstance.configuration!.clientID

Invocation code:

func initiateGoogleAuth_async() async throws -> User {
    do {
        let result:GIDSignInResult = try await GIDSignIn.sharedInstance.signIn( withPresenting: self )

        // authenticate
        let user:User = try await self.accountService.authenticate( googleUser: result.user ) //  <--- never gets here

    } catch let error {
        print( "ERROR: \(error)" ) // <--- never gets here
        ...
    }
}       

func initiateGoogleAuth_callback( onSuccess:@escaping(User) -> (), onError:@escaping(any Error) -> () ){
    GIDSignIn.sharedInstance.signIn( withPresenting: self ){ (result:GIDSignInResult?, error:(any Error)?) in
        guard let result else {
            print( "ERROR: \(error)" ) // <--- never gets here
            ...
            return
        }

        // authenticate
        let user:User = try await self.accountService.authenticate( googleUser: result.user ) //  <--- never gets here
    }
}

Could this have anything to do with some security/configuration for the project not being correctly set?

Our OAuth consent screen config in the Developer Console is set to use the publishing status of Production, user types as External and not token grant rate limit has been set.

marchy commented 4 months ago

UPDATE: The callbacks started working again after over an hour of observed downtime. UPDATE: The callbacks stopped working again as of 6:45pm UTC.

This is quite concerning as it seems there might be Google back-end issues that the library is not handling. If there are errors they should be surfaced up through the error callback not swallowed and app left in an undetermined state (ie: neither succeeded or failed)

ASK: Could the team investigate if any back-end issues were experienced on Apr 11, 2014 prior to 3:30pm UTC (time of my prior post). By 5pm UTC the callbacks started resuming again.

Please keep us posted on this as it is a total blocker to user sign-up / acquisition if this happens intermittently, and we'd like to understand how pervasive this is (ie: % of SLA to expect)

marchy commented 4 months ago

Got down to the bottom of it!

PROBLEM:

Chased this further down and it seems that within GIDSignIn::addCompletionCallback(..) (line 859) the self->_currentOptions.completion line fails the nil check (in fact _currentOptions itself if nil), so it somehow loses reference to the callback passed into the initial SDK by the time the full auth sequence finishes.

GIDSignIn-CallbackBug 1

Earlier in the chain it seems _currentOptions is nilled out, as the flow goes into a "non-interactive" branch of the logic

GIDSignIn-CallbackBug 2

The line on 547 has nothing to invoke since options.completion is nil at that point.

CAUSE:

The non-interactive call is coming from a call to try await GIDSignIn.sharedInstance.restorePreviousSignIn() which is happening because when the Google Sign-In web view is showing, it re-invokes the app's SceneDelegate::sceneDidBecomeActive(..) method which is where we refresh data including user data (which itself attempts to restore the prior Google sign-in).

This in effect was clearing out the callback that was initially passed in at the start of the sign-in flow – and the intermittency of it came from the lag of the user picking their account VS the app sync process finishing earlier in the background.

QUESTION:

Is there any way to know when the scene is being displayed due to the Google sign-in web view so that we can opt out the restorePreviousSignIn() check in this scenario?

SOLUTION:

Architecturally, the library should prevent separate simultaneous operations from sharing state about their parameters and thus trampling on each other.

@mdmathias Could you re-open this ticket? It is a library mis-implementation to share parameters among multiple async operations. It's one thing for them to share user/auth state, but the parameters of one operation (such as callback continuation) should never be shared/impacted by a subsequent operation to either the same API or another one as in this case.

As it stands, we have to implement a ton of extra complexity (such as concurrent operation queues) to make sure we don't call GIDSignIn.sharedInstance.restorePreviousSignIn() while GIDSignIn.sharedInstance.signIn() is in progress and vice-versa.

mdmathias commented 4 months ago

Hi @marchy. Thanks for the posts and your analysis. Sorry that you're having trouble.

First, let me confirm that I understand your issue. From what I gather, it seems like you're experiencing a kind of race condition due to simultaneous calls to restorePreviousSignIn() and signIn(), where the restore call is setting _currentOptions to nil prior to the sign in call's use of _currentOptions.completion to add the completion callback. Therefore, the sign in flow in your app will occasionally not issue a callback due to this race condition. Seems like this line would not pass in this scenario, which would lead to no callback. Is that right?

If my understanding is correct, then it is unfortunately an error to simultaneously call signIn() and restorePreviousSignIn(). From the library's point of view, it doesn't make sense to restore and sign in at the same time.

Even if we made changes to GSI to protect against this error [1], apps would still have to implement some logic to choose which response to handle. For example, which response is the truth? The sign in or the restore? The completions will not have the same information.

The practical solution on your end is to make sure that you are signing in and restoring at different points in your app's lifecycle. Ultimately, it's up to apps to manage this. You can take a look at our Swift sample and ObjC sample for inspiration.

[1] I haven't fully thought through the feasibility of your request, but I believe it may require significant refactoring of the library. We'd have to (potentially re-) consider GIDSignIn's singleton status, manage multiple async callback queues on behalf of clients wishing to issue simultaneous and duplicative requests, etc. While I can see ways that this would be useful to callers like yourself (given the scenario you outline above), I have to admit that this sort of change is not feasible at the moment. Of course, since the repo is open source, if you believe you have a simple fix to resolve this, then please send us a PR and we would love to review it!