Seneca-CDOT / telescope

A tool for tracking blogs in orbit around Seneca's open source involvement
https://telescope.cdot.systems
BSD 2-Clause "Simplified" License
94 stars 187 forks source link

Improve error handling of auth callback #3567

Closed humphd closed 1 year ago

humphd commented 2 years ago

Sometimes when I try to sign in on production, I get this:

Screen Shot 2022-04-27 at 8 16 55 AM

I suspect my token has expired.

There's no value in giving a 400 here. We should redirect to do the full sign-in flow instead.

STR:

  1. Sign-up for your account on Telescope (prod)
  2. Sign-in
  3. Leave your browser tab open over night so that the token can expire (I'm not sure how long this will take)
  4. Come back to the tab and refresh, you should be logged out
  5. Click sign-in
humphd commented 2 years ago

The error happens here in the sso service: https://github.com/Seneca-CDOT/telescope/blob/adf299da6e6e9c3f20208ee6ab61d205a9f70eab/src/api/sso/src/routes/logout.js#L22

AmasiaNalbandian commented 2 years ago

Leave your browser tab open over night so that the token can expire (I'm not sure how long this will take)

I can confirm I finished the sign-up flow and immediately had an error. I saw my profile image pop up top right, and for a brief second I looked as if I was logged in, and then it signed me out. I thought this was because we require sign in after sign up, but when I went to login I reach the same error.

humphd commented 2 years ago

Looking at this a bit more:

We're failing here: https://github.com/Seneca-CDOT/telescope/blob/adf299da6e6e9c3f20208ee6ab61d205a9f70eab/src/api/sso/src/routes/login.js#L33-L36

Which somehow means we're not getting what we expect on the query string here: https://github.com/Seneca-CDOT/telescope/blob/e40693018b5471bd582c78d9889fa3261943d4b6/src/api/sso/src/middleware.js#L74-L77

That's weird, because we have middleware to validate it here: https://github.com/Seneca-CDOT/telescope/blob/e40693018b5471bd582c78d9889fa3261943d4b6/src/api/sso/src/middleware.js#L27-L50

So maybe we're losing the session?

We should be sending the redirect_uri from the front-end here: https://github.com/Seneca-CDOT/telescope/blob/d5b23e42523521f49fc77d1b76e3f408dd9a46d1/src/web/app/src/components/AuthProvider.tsx#L81

So I'm not clear which part of this is breaking down, but whoever wants to debug a bit, that's where I'd start.

humphd commented 1 year ago

cc @sfrunza13, this is what you were hitting too.

sfrunza13 commented 1 year ago

cc @sfrunza13, this is what you were hitting too.

It is, how would I go about debugging this?

humphd commented 1 year ago

The answer to "how do I debug this" always begins with, "how do I reproduce this?" If you can figure out a set of steps that reliably cause it, then we can start looking at why.

sfrunza13 commented 1 year ago

From the documentation of passport-SAML on NPM they claim that there should be a bodyparser middleware called before passport.authenticate('saml');

bodyParser.urlencoded({ extended: false }) which uses a different query string parser I think, maybe this makes a difference? It defaults to true and I think that is deprecated. I think I will try this changing this as well in my testing which I will try to do tomorrow.

sfrunza13 commented 1 year ago

I can not seem to replicate the issue with my local docker container setup using simplesaml storing session in either Redis or Memory.

humphd commented 1 year ago

We do this before we add the SAML stuff:

https://github.com/Seneca-CDOT/telescope/blob/master/src/satellite/src/app.js#L31-L34

sfrunza13 commented 1 year ago

We do this before we add the SAML stuff:

https://github.com/Seneca-CDOT/telescope/blob/master/src/satellite/src/app.js#L31-L34

Not sure if changing this could make a difference in prod or not, but there is an explicit point made of it in the docs to set extended to false. I don't think I have a way of testing it. Doesn't make a difference local in any case.

sfrunza13 commented 1 year ago

From the passport-saml docs

Provide the authentication callback You need to provide a route corresponding to the path configuration parameter given to the strategy:

The authentication callback must be invoked after the body-parser middlerware.

From the body-parser docs

extended The extended option allows to choose between parsing the URL-encoded data with the querystring library (when false) or the qs library (when true). The "extended" syntax allows for rich objects and arrays to be encoded into the URL-encoded format, allowing for a JSON-like experience with URL-encoded. For more information, please see the qs library.

Defaults to true, but using the default has been deprecated. Please research into the difference between qs and querystring and choose the appropriate setting.

Really unsure if it makes a difference 😅

humphd commented 1 year ago

Yeah, I'm just reading the same docs, and also looking at https://github.com/node-saml/passport-saml/blob/a5719e9f72a4f6154f3e2aef17c36b4a4efceaec/README.md?plain=1#L211. It's really about which parser it uses, which I suspect is not the cause of this bug.

sfrunza13 commented 1 year ago

It's definitely not likely but it's one of the only things I could think of, now not even being able to re-create the problem anymore I really am lost.

humphd commented 1 year ago

You've done well. I'm lost on it, too. Let's keep this open and ponder. As we build things in Starchart, we'll learn more and hopefully get to the bottom of things.

sfrunza13 commented 1 year ago

connect-sid missing in cookies on the req header of the callback when we get the 400