Closed DFurnes closed 7 years ago
It could be refusing to allow the user to log in if their state token isn't correct and they fail the access check. Maybe there's an issue where this user's session is getting lost during the redirect? Perhaps if they get routed to a different webhead?
I just tried manually using an authorization code from a recent 404ing URL, and Northstar says the code is revoked. That seems to indicate that either the returned code never worked or was already used by Phoenix by the time the page 404ed... π£
Just confirmed that this is specifically a Phoenix issue, Sarah was able to log into Aurora just fine. π§
Looking at the newly added logging when trying to log in to Sarah's account⦠nothing looks out of place on the Northstar end & no unexpected exceptions are triggered:
[2017-02-08 19:50:36] production.DEBUG: created new auth code {"code":"36acf8165470dd3d20084e7cec535e59d0ce674f5d070f7c42b3c8f47b013d73e5f014fd40fa4029","user_id":"584b130fa0bfad58c2231c02","client_id":"phoenix-oauth"}
[2017-02-08 19:50:36] production.DEBUG: checked auth code {"code":"36acf8165470dd3d20084e7cec535e59d0ce674f5d070f7c42b3c8f47b013d73e5f014fd40fa4029","exists":true}
[2017-02-08 19:50:36] production.DEBUG: revoked auth code {"code":"36acf8165470dd3d20084e7cec535e59d0ce674f5d070f7c42b3c8f47b013d73e5f014fd40fa4029"}
More context from the Trello card:
Sarah's account is fixed now, from changing the email on her old Phoenix UID and re-creating a new one via the normal OpenID Connect flow. I need to do a little more looking but I think that should help figure out the issue for the other affected accounts.
So I don't think it's related to a bad auth code⦠it seems like there was some sort of issue resolving the existing Drupal ID and saving Northstar fields to it (and hence why deleting the old user was the fix). Hopefully the updates in #7320 do the trick.
This is unfortunately still happening. I just did a audit of places where we'd end up at 404 page:
?code
is not provided in URLdrupal_goto('user')
which would just bring you to the "login" page...It still seems like the most reasonable explanation is that the state token doesn't match, but I'm having a hard time figuring out how that'd ever happen to a well-intentioned user.
I've spot checked each of the accounts that Hannah had flagged with this issue in Zendesk. All of them have active accounts & correctly synced data for email
, northstar_id
, and drupal_id
in both Northstar and Phoenix.
A little interesting: one account shows user_registration_source
as ` (empty!) and [another](https://www.dosomething.org/us/user/9033150) shows
quasar-etl-node`... whaaaaaat? Super weird, but not sure it's relevant since the rest is okay.
I merged in a patch to the OpenID Connect module in #7341 which will allow us to see whether the issue is the state token (my current hunch!) β it'll log the expected/provided values & users will be given a flash message so we can identify the problem in support requests.
Some issues were reported with the "We couldn't verify this login request" message being displayed unexpectedly (when changing an image on a campaign node). This is suuuuper strange since that should only be running on the access check for the authentication page.
Noting here, but gonna assume this is a π» for now...
The error message was also displayed at least once during a Ghost Inspector login test. Neither me nor @ngjo were able to reproduce, and the same issue hasn't occurred since...
@DFurnes For the issues that were reported when changing the image on the campaign node, it continued to happen for certain folks but not me. I don't know if it's at all related, but here's the profiles:
Anyways, Chris's profile (he got the "we couldn't verify this login request"): https://www.dosomething.org/us/user/3984224/edit
Calvin (also got the message): https://www.dosomething.org/us/user/571235/edit
Mine (I didn't get that message and was able to change the image): https://www.dosomething.org/us/user/3912425/edit
@DFurnes We got another person who said they can't log into their account: UID - 1249515
This is what they're seeing after they try to login
@DFurnes same user UID - 1249515 is trying to login and this is the URL they're getting redirected to:
https://www.dosomething.org/us/openid-connect/northstar?code=U91hsBpSrlLVvr4afvUl1pJu98K8xhUwSrun3Jz5tj4GhW2ESODPR6Pnp2skIBeixd0QRwv8aTAWPIEn8a2hGD12wvE%2FqJdUENdhr1L%2B6Mbwhjb7HNepHSMrFmrsA6ISt5fHwMIK7hcjVrS42ZiNZOVarpO1g9Lz7ObZOfNLKnAOfsi2kbvXtUNJYKXXOZQr95g9%2B%2BfIpg9grJz9tCB9P8M60FwF30tWU2cav45H22gtQzg0zWh2%2FCBvtYY5W56FdExgvgyl3u%2B7aGsMKq2FlBxpIPs2mQcZ%2F0%2FGfbbNwR4wDR5Yv%2Bu09pc4TOvl02HLbmSFrOuADGhMscGEqHVrAFhCe8RBbu2Iew77t4RPRu7QDzgXDTzLQj%2F%2BtBIxc5ha8mSDnx656VarUImuvwmWjDrXZH%2FTWIB9YA6CwcZy3pHKgiIjVgF%2B94TElNdserc0SzUTOt%2BrweBH4oQeHM%2Faj8hrFyDLBh0SjY2q%2FiEu%2FkfeQE9YMgxhDsU%2Bb4ZKgTveXxnOkzawhWGyOtN5%2BDdVtHuSBBD5Z6KwbD3E472JtgrPaF%2Bd%2F2laboxUbfrRyVTWEx0LRQGxAjn%2BZjxJyk2%2BX%2B1r5OdjshbIG5ycAkZJwDQFZ5Ghnn1RQqBenxu%2Bso8vo0vwOYWdfiaDLy4yUgkOEnLtbSRve8LDS2MZmeVa76s%3D&state=43d93f14583ad0620df7524cb09d00c0
with this error message in purple "We couldn't verify this login request. Please try again!"
The member went on to say that they couldn't even get to the login screen, they click the "login" button and then they immediately get that error message.
And separately, documenting the issue w/ Gladiator login that I mentioned at sprint planning this morning.
Last week sometime, I was trying to log into Gladiator - I've previously been able to do this - and I couldn't login. Shae checked out my privileges in Aurora and it seemed like I wasn't an admin anymore. Not sure how/when this happened.
A separate issue that came up for Hillary as she was trying to log into Gladiator was that she had to log out of northstar and then log back in (Shae provided these instructions for Hillary: Hit https://northstar.dosomething.org/logout and then reload gladiator.dosomething.org and login again)
Shae also added that she usually run into it when logging in and out of different apps that use northstar login
Some stream-of-consciousness debugging for this issue:
same user UID - 1249515 is trying to login and this is the URL they're getting redirected to:
Interesting, looking at the log message that we added for the state error):
The state token did not match: , 43d93f14583ad0620df7524cb09d00c0
A couple interesting things to note:
openid_connect_redirect_access
check like hypothesized above.@expected
value from $_SESSION
when we show the "404 page" to the user was blank.[2017-04-24 18:20:48] production.DEBUG: created new auth code {"code":"dbd60d66ee7bd61b8b5c14586458ed4e651492250e63eaa1ae4344e45a763e5c33539077f9e5f1ce","user_id":"55a171c1469c64ef3d8b9c28","client_id":"phoenix-oauth"}
[2017-04-24 18:20:48] production.DEBUG: checked auth code {"code":"dbd60d66ee7bd61b8b5c14586458ed4e651492250e63eaa1ae4344e45a763e5c33539077f9e5f1ce","exists":true}
[2017-04-24 18:20:48] production.DEBUG: revoked auth code {"code":"dbd60d66ee7bd61b8b5c14586458ed4e651492250e63eaa1ae4344e45a763e5c33539077f9e5f1ce"}
That last part is interesting since it seems like we shouldn't ever get to the "revoke auth code" step if we've already failed in the openid_connect_redirect_access
access check for that page.
This makes me think we're somehow double-redirecting to this page... the first time we go through openid_connect_redirect_page
and unset the state and check/revoke the code, but then we somehow just redirect to the same page without the state value in the session.
Also not sure I'm reading this right, but there's a surprising number of entries for that code in Logstash&_a=(columns:!(_source),index:'filebeat-',interval:auto,query:(query_string:(analyze_wildcard:!t,query:%22%3Fcode%3DU91hsBpSrlLVvr4afvUl1pJu98K8xhUwSrun3Jz5tj4GhW2ESODPR6Pnp2skIBeixd0QRwv8aTAWPIEn8a2hGD12wvE%252FqJdUENdhr1L%252B6Mbwhjb7HNepHSMrFmrsA6ISt5fHwMIK7hcjVrS42ZiNZOVarpO1g9Lz7ObZOfNLKnAOfsi2kbvXtUNJYKXXOZQr95g9%252B%252BfIpg9grJz9tCB9P8M60FwF30tWU2cav45H22gtQzg0zWh2%252FCBvtYY5W56FdExgvgyl3u%252B7aGsMKq2FlBxpIPs2mQcZ%252F0%252FGfbbNwR4wDR5Yv%252Bu09pc4TOvl02HLbmSFrOuADGhMscGEqHVrAFhCe8RBbu2Iew77t4RPRu7QDzgXDTzLQj%252F%252BtBIxc5ha8mSDnx656VarUImuvwmWjDrXZH%252FTWIB9YA6CwcZy3pHKgiIjVgF%252B94TElNdserc0SzUTOt%252BrweBH4oQeHM%252Faj8hrFyDLBh0SjY2q%252FiEu%252FkfeQE9YMgxhDsU%252Bb4ZKgTveXxnOkzawhWGyOtN5%252BDdVtHuSBBD5Z6KwbD3E472JtgrPaF%252Bd%252F2laboxUbfrRyVTWEx0LRQGxAjn%252BZjxJyk2%252BX%252B1r5OdjshbIG5ycAkZJwDQFZ5Ghnn1RQqBenxu%252Bso8vo0vwOYWdfiaDLy4yUgkOEnLtbSRve8LDS2MZmeVa76s%22)),sort:!('@timestamp',desc))), whereas a test login I just made only shows 2 entries&_a=(columns:!(_source),index:'filebeat-',interval:auto,query:(query_string:(analyze_wildcard:!t,query:%22U91hsBpSrlLVvr4afvUl1pJu98K8xhUwSrun3Jz5tj4GhW2ESODPR6Pnp2skIBeixd0QRwv8aTAWPIEn8a2hGD12wvE%252FqJdUENdhr1L%252B6Mbwhjb7HNepHSMrFmrsA6ISt5fHwMIK7hcjVrS42ZiNZOVarpO1g9Lz7ObZOfNLKnBI3sgF9MhJ2imK8FNCk2buhum2ScntBwojHJUb24UiFquN8LB9IfhNWiTi1caxBdT7WF3FaxaLG2If4t8jqhfhMIjcmOLGIHqb8l0f6IqjGib%252FEIX1NdIF3IBpJPaJkRNBwAlQvIDha8DYDW%252B6yD%252B0YfyPMmbjAC%252FgeIvXxbM1BXE3er4OL9DhUq%252FayIDzB2k381bRVZOLtdwXLHOe2dGBskvZHeeAeJsddhqIR4ZVheN1WBqZFziW5gSTr0vI5PpyYs53XrHLaw22FtsAbltIESchHzPy6VQpOb9tx0zsRi3P9iiI6Wm%252FNaK2ufaYiOjFSZgob%252FHC%252FBaGTWXHDmw9XxnOkzawhWGyOtN5%252BDdVtHuSBBD5Z6KwbD3E472JtgrPaF%252Bd%252F2laboxUbfrRyVTWEx0LRQGxAjn%252BZjxJyk2%252BX%252B1r5OdjshbIG5ycAkZJwDQFZ5Ghnn1RQqBenxu%252Bso8vo0vwOYWdfiaDLy4yUgkOEnLtbSRve8LDS2MZmeVa76s%253D%22)),sort:!('@timestamp',desc))) (one of type nginx-access
and one syslog
).
Putting this down for now, but feels like we're getting closer...π΅οΈ
A ha! There's an error for that same user when updating their local profile fields: Data too long for column 'field_address_country' at row 1
. It turns out this user has the full string "United States"
saved in their profile, but the column in Drupal is only 2 characters long.
A few things we can do here:
I've checked out each of other accounts that submitted help tickets (1, 2, 3, 4, 5, 6) and all of them (except one!?) have entered longer text into their country field than that column would allow in Phoenix.
@hbghidey Both the Phoenix & Northstar code fixes have been deployed, so this should no longer be happening to anyone! Let me know if you receive any more reports. π
@DFurnes A beautiful day! Thanks for all of your work on this. I'll let the team know.
BUG
Current Behavior
Some (hopefully small) subset of users are seeing a 404 page when Northstar redirects them back to Phoenix as part of the authorization code flow. Here's an example of a URL that 404s:
For comparison, a working callback URL from my last login looks like this:
Desired Behavior
Users should be able to log in successfully, or a more useful error should be shown.
Steps to Replicate
@jessleenyc has been able to reliably replicate the problem when logging in to this intern's account on the production website: 584b130fa0bfad58c2231c02.
Why This Matters
Users can't log in, and are not provided with a useful error message.
Countries Affected (optional)
β¦
OS/Browser (optional)
β¦
Relevant Screenshots + Links