zwave-js / zwave-js-ui

Full featured Z-Wave Control Panel UI and MQTT gateway. Built using Nodejs, and Vue/Vuetify
https://zwave-js.github.io/zwave-js-ui
MIT License

ZWaveJS fails to load when running with external auth due to service worker #3427

Closed ajacques closed 6 months ago

ajacques commented 8 months ago

Deploy method

Docker

Z-Wave JS UI version

9.3.2

ZwaveJS version

12.3.0

Describe the bug

I'm using an external auth provider (nginx + vouch proxy) to protect all of my services with one SSO solution. I chose to do this instead of having separate usernames and passwords for different services. Effectively, it checks for a cookie on all requests to ZWaveJS and redirects the browser to the login page when the cookie is missing or no longer valid.

The problem is that ZWaveJS now appears to use a service worker and works as a PWA (I suspect this came in with this commit d409051b9e96939fa0510399e45693b331a34cf1) which intercepts the call to / and causes this to fail. When the authorization expires, I have to hard refresh the page to load it.

What happens is the / page loads from the service worker, then the GET /api/auth-enabled call gets a 302 Found redirecting to the login page. The app doesn't know what to do with that response and fails to load. The / request does not hit the backend app (I've checked my logs), so it can't know that / is failing. The only way to bypass this is to hard refresh the page, but that's not easy on mobile phones.

To Reproduce

  1. Install ZWaveJS behind some kind of reverse proxy such as nginx
  2. On the reverse proxy, enable an auth proxy that intercepts requests, such as vouch proxy
  3. Try to load ZWaveJS and see that it shows "Network Error. Retry" and the page fails to load
  4. Hard refresh the page (ctrl-shift-r) and notice the app loads because it bypasses the service worker for GET / and the app will work until the auth token expires.

Expected behavior

I expect the ZWaveJS application to handle a separate auth provider and redirect to the login page. ZWaveJS does not have to specifically support these providers, just handle the redirects.

I see two options here:

  1. When the service worker receives the GET / request, it revalidates with the service to see if the response includes a redirect or not. This can also be used to refresh the cache in the service worker. (basically cache-bust the / call because the app can't work without internet anyway)
  2. When the app sees a 302 redirect for GET /api/auth-enabled, it follows the redirect in a full frame redirect (i.e. browser tab navigates to the new page.) This appears to be the first request that actually gets forwarded to the server so would have to be the first point the app knows there's external auth.

It seems like the service worker comes from Vite, which possibly internalizes the cache handling logic. I'm not familiar enough with its knobs to know whether option 1 would work, but option 2 should be doable on the app side. I tried changing the Cache-Control on GET / to must-revalidate, but the service worker does not respect it.

Additional context

No response

robertsLando commented 8 months ago

It seems like the service worker comes from Vite, which possibly internalizes the cache handling logic.

Yeah it does, maybe there is an option in vite-pwa docs for that? Could you try to play with that? It's hard for me to test this as I don't have your setup

ajacques commented 8 months ago

I'll keep investigating to see what I can figure out. I wanted to cut this issue just in case you were more familiar with how the PWA/service-worker itself worked and had some ideas. So far I've tried changing the Cache-Control header on /, but no success yet. I'll have to dig in the code. Somebody else was asking about this style of auth in this issue.

robertsLando commented 8 months ago

I wanted to cut this issue just in case you were more familiar with how the PWA/service-worker itself worked and had some ideas

Unfortunately I'm not. I've never needed to dig into such things; I only know that service workers act as a proxy to cache requests, make your page load faster, and allow handling background tasks like push notifications (at least that is what I've used them for so far).

Some useful readings can be found here:

IMO what should be done here is to switch to injectManifest in order to be able to provide custom service worker code where you can intercept requests, like here: https://medium.com/@alekswebnet/setup-token-based-authentication-for-media-files-with-service-workers-and-workbox-e8674fa621f
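
As a rough illustration of what a custom service worker could do here, the sketch below handles a request network-first, so the reverse proxy sees it and can answer with its login redirect. It is not taken from the project or from the linked article: `networkFirst`, `fetchImpl`, and `readCache` are hypothetical names, and the network and cache lookups are passed in as functions so the logic can be exercised outside a browser.

```javascript
// Hypothetical helper: try the network first and fall back to a cached copy
// only when the network request itself fails (for example, when offline).
async function networkFirst(request, fetchImpl, readCache) {
  try {
    // Whatever the server or auth proxy answers (200, 302, 401) is returned
    // as-is, so the browser can act on a login redirect.
    return await fetchImpl(request);
  } catch (networkError) {
    const cached = await readCache(request);
    if (cached !== undefined) {
      return cached;
    }
    throw networkError;
  }
}

// Inside a real service worker this could be wired up roughly as follows
// (assumption: only page navigations are handled this way):
//   self.addEventListener('fetch', (event) => {
//     if (event.request.mode === 'navigate') {
//       event.respondWith(
//         networkFirst(event.request, (req) => fetch(req), (req) => caches.match(req))
//       );
//     }
//   });
```

The trade-off is the one raised in the issue: the page shell is no longer served instantly from cache, but static assets could still be cached as before.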

krissfr commented 8 months ago

I have the same issue with another reverse proxy (Caddy V2). Unfortunately I can't help, because I don't have the skills or knowledge required.

ajacques commented 8 months ago

I haven't found the time to dig too much into this yet due to the holidays and traveling. Hoping to find the time to get a dev workspace setup and try out a fix. My current short-term workaround is to just intercept /registerSW.js and return a 404. This prevents the service worker from registering, though it would break push notifications if those are being used.

robertsLando commented 8 months ago

This prevents the service worker from registering, though it would break push notifications if those are being used.

They are not used; the service worker is just used to cache requests and resources.

krissfr commented 7 months ago

After deleting all cookies related to the external URL, everything works fine for me with ZwaveJS UI 9.5.1.

ajacques commented 7 months ago

Deleting the cookies will temporarily fix the issue until the session expires and you're back in the same place. The lifetime will differ depending on your auth provider.

robertsLando commented 6 months ago

I have just created a possible fix for this. Is anyone here able to give #3519 a try?

To make it easier for you to test, I have created https://github.com/zwave-js/zwave-js-ui/actions/runs/7529788437

Once that ends you will be able to test it using the test tag on Docker.

Daniel-dev22 commented 6 months ago

Will test this out and update you!

Daniel-dev22 commented 6 months ago

Unfortunately even with that tag and clearing cache I am still getting 401. It's not redirecting.

robertsLando commented 6 months ago

@Daniel-dev22 Please click on the top right info icon and paste the version here so I'm sure you are using the correct one.

Also I suggest others to try this out as the issue may be different from the one from @Daniel-dev22

robertsLando commented 6 months ago

Oh, I just noticed that my glob was wrong. I have triggered another build: https://github.com/zwave-js/zwave-js-ui/actions/runs/7530153608 (usually takes 7 minutes).

Please pull the new version when ready and let me know

Daniel-dev22 commented 6 months ago

zwave-js-ui: 9.7.0.58f2a2a
zwave-js: 12.4.1
home id: id
home hex: hex

robertsLando commented 6 months ago

@Daniel-dev22 Ok, that's correct. Just wait a moment and test the new one (ensure that after the update the version is 9.7.0.9551b45; the last 7 chars are the short commit sha).

robertsLando commented 6 months ago

I also suggest doing a hard refresh of the page just to be sure the service worker has been updated (Ctrl+F5 on Chrome)

Daniel-dev22 commented 6 months ago

I also suggest doing a hard refresh of the page just to be sure the service worker has been updated (Ctrl+F5 on Chrome)

Once I cleared cookies it's working both in the browser and with the "install app" pwa on Android.

robertsLando commented 6 months ago

@Daniel-dev22 ok see if the issue appears again in next days or if it's solved now

Daniel-dev22 commented 6 months ago

@Daniel-dev22 ok see if the issue appears again in next days or if it's solved now

Will report back on Wednesday.

ajacques commented 6 months ago

Thanks for trying to fix this. I did a bit of digging into the docs, but still haven't figured out the right direction. I deployed the :test branch, but it did not fix the problem for me.

The problem is that the GET / request does hit the service worker, which returns the cached HTML page. The JS/CSS all load from SW cache too. Then the JS calls GET /api/auth-enabled, which returns the following response. It doesn't matter if the SW itself processes the GET /api/auth-enabled or not.

HTTP/2 302 
date: Mon, 15 Jan 2024 18:10:44 GMT
content-type: text/html
content-length: 138
location: https://login.internal.example.com/login?url=https://zwave.internal.example.com/api/auth-enabled&vouch-failcount=&X-Vouch-Token=&error=&rd=https://zwave.internal.example.com%2Fapi%2Fauth-enabled
X-Firefox-Spdy: h2

This causes the JS to automatically try to follow the redirect, which triggers an OPTIONS request and fails CORS. I tried enabling CORS on this origin (login.internal.example.com). The JS was able to follow the redirect, but then my provider redirected to my SSO provider, github.com, and failed CORS there.

I tried experimenting with the following to see if it was even possible to follow the redirects to get the cookies set ignoring the HTML framing problem, but this caused the request to my provider to be sent without cookies. I suspect it's because of the ongoing browser changes to block/isolate third-party cookies.

fetch('/auth-enabled', { mode: 'no-cors', credentials: 'include', redirect: 'follow' })

I work on an auth provider integration at my day job, and the two options we used were redirecting the entire page or opening up a pop-up window with the auth provider and closing it later.

This leads me to believe one of three things needs to happen:

  1. The GET / request must get proxied through to the upstream. This would reduce some of the benefits of the service worker caching, but it would be the easiest
  2. The client code here should catch the exception, see there's a redirect, trigger a full page navigation. Unfortunately Axios doesn't really give a great error message to differentiate, just Network Error.
  3. Disabling the service worker. This is my current temporary strategy. I created an ingress that returned a 404 for the GET /registerSW.js request.

Just curious, what is the goal of the service worker? I assume you added it for a reason, so I won't recommend doing (3) in the actual code base, but I would presume most users are accessing it locally.

@Daniel-dev22 which external auth provider are you using and how is it hooked into your reverse proxy?

Daniel-dev22 commented 6 months ago

@Daniel-dev22 which external auth provider are you using and how is it hooked into your reverse proxy?

@ajacques I'm using traefik with authelia as the auth provider.

Traefik handles it with https://doc.traefik.io/traefik/middlewares/http/forwardauth/

ajacques commented 6 months ago

Is Authelia delegating to another IdP or does it internally have usernames and passwords?

What happens if you do the following:

  1. Open ZWave JS once and confirm it loads correctly
  2. Open the browser dev tools
  3. Delete any cookies set for this origin, but don't clear the cache
  4. Switch to the Network tab
  5. Reload the page
  6. In the Network tab there should be a GET request for zwave.example.com/, which is returned from the service worker, and the /auth-enabled request. What do the response headers look like? Is there a redirect or something that causes Authelia to authenticate you?

Daniel-dev22 commented 6 months ago

Authelia has the username and passwords.

It sets a session cookie and continuously checks if it's still present and valid.

There is a redirect.

http://autheliaip/api/verify?rd=https://authelia.mydomain.net

/api/verify returns 401 when unauthorized which can happen when the session times out.

From traefik forward auth documentation

If the service answers with a 2XX code, access is granted, and the original request is performed. Otherwise, the response from the authentication server is returned.

ajacques commented 6 months ago

I also suggest doing a hard refresh of the page just to be sure the service worker has been updated (Ctrl+F5 on Chrome)

Once I cleared cookies it's working both in the browser and with the "install app" pwa on Android.

I was asking because it sounds like you'd have the same problem as me, but this response implies that the above PR appears to fix your problem. That redirect should involve an OPTIONS request which requires CORS. Now maybe Authelia handled CORS correctly, but if you had to type in a password because your Authelia session expired then that redirect would presumably fail.

Daniel-dev22 commented 6 months ago

I am able to authenticate from the redirect to authelia and then be redirected back to zwave.

What's interesting is that once I'm authenticated within zwave.mydomain.net, if I go to the authelia page, sign out, and go back to zwave.mydomain.net, I get a 401 and no redirect happens. If I clear cookies then the redirect happens.

So it only works to authenticate when cookies are cleared, but not to re-authenticate? I'm not on desktop currently to look at the network tab in developer tools when this happens, but I sometimes noticed this while testing on mobile.

ajacques commented 6 months ago

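So this fix will only work if:

  • Your auth provider(s) enable CORS
  • Your browser allows third-party cookies (without origin isolation), though Chrome is (supposedly) deprecating them
  • You don't need to re-authenticate to your IdP

The reason it doesn't work after you log out of Authelia is that you need to show some HTML to type in your password. Thus the only way this will work is if window.location changes to your IdP.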

Daniel-dev22 commented 6 months ago

So this fix will only work if:

  • Your Auth provider(s) enables CORS
  • Your browser allows third party cookies (without origin isolation) in your browser, but Chrome is deprecating them (supposedly)
  • and you don't need to re-authenticate to your IDP

The reason it doesn't work after you logout of Authelia is because you need to show some HTML to type in your password. Thus the only way this will work is if window.location changes to your IdP.

Traefik does cors.

        accesscontrolalloworiginlist: 
          - https://domain.net
          - https://authelia.domain.net
          - https://traefik.domain.net

Should I zwave.domain.net?

ajacques commented 6 months ago

Should I [enable CORS on] zwave.domain.net?

No, there are no cross origin requests to zwave.domain.net. The cross origin request is to authelia.domain.net which is why it looks like it's partially working right now.

EDIT: actually, I may be confused. I don't think adding CORS will fix the problem because you still have to deal with the third-party cookie problem and wouldn't be able to handle reauthentications if you sign out of Authelia.

The only way to fully fix this issue is to do one of these:

  • The GET / request must get proxied through to the upstream. This would reduce some of the benefits of the service worker caching, but it would be the easiest

  • The client code here should catch the exception, see there's a redirect, trigger a full page navigation. Unfortunately Axios doesn't really give a great error message to differentiate, just Network Error. This is additionally tricky because the redirect URL will be for /api/auth-enabled which will break.

  • Disabling the service worker. This is my current temporary strategy. I created an ingress that returned a 404 for the GET /registerSW.js request.

But thus far I haven't figured out how to actually make that happen.

robertsLando commented 6 months ago

Firstly, thanks @ajacques for all your detailed information, much appreciated 🙏🏼

  1. The GET / request must get proxied through to the upstream. This would reduce some of the benefits of the service worker caching, but it would be the easiest
  2. The client code here should catch the exception, see there's a redirect, trigger a full page navigation. Unfortunately Axios doesn't really give a great error message to differentiate, just Network Error.

I could try point 2 and simply trigger a reload in case of an error on that call? At least I could try this approach in that PR to see if it fixes the problem. I'm not sure why point 1 is needed.

About why I enabled PWA, the reason is to make it load faster on mobile phones (thanks to the cache), also it's not hard to maintain as everything is managed internally by the VitePWA plugin. In future that would also allow me to send push notifications if we ever find a reason to add this feature

robertsLando commented 6 months ago

I have just implemented point 2. and now the app should reload the page when it detects a redirect in response code. Please give the test tag a new try and let me know

Daniel-dev22 commented 6 months ago

I have just implemented point 2. and now the app should reload the page when it detects a redirect in response code. Please give the test tag a new try and let me know

If this is the right version, I'm still having an issue: it works only after clearing cookies, but once the session expires it doesn't reload properly and just returns 401.

zwave-js-ui: 9.7.1.bdca841
zwave-js: 12.4.1
home id: id
home hex: hex

robertsLando commented 6 months ago

Ok, let me add the 401 code then.

robertsLando commented 6 months ago

New build triggered: https://github.com/zwave-js/zwave-js-ui/actions/runs/7572008099

Daniel-dev22 commented 6 months ago

Ok let me add the 401 code so

Weird, that triggers what appears to be a reload loop.

https://github.com/zwave-js/zwave-js-ui/assets/47092714/1b88d790-1cee-4a29-89fb-5abdc72276b0

robertsLando commented 6 months ago

I'm out of ideas so... :( dunno if the response has some headers I could use to redirect to a new page. Btw I cannot clear cookies programmatically AFAIK

Daniel-dev22 commented 6 months ago

I'm out of ideas so... :( dunno if the response has some headers I could use to redirect to a new page. Btw I cannot clear cookies programmatically AFAIK

Going to check later today to see what's happening in the network tab/headers.

ajacques commented 6 months ago

Ok, I think I know what's going on. When you do location.reload() it reloads the page, but the GET / still hits the service worker cache, so the reload is pointless. Additionally, I don't see the reload loop because I get a CORS failure, which doesn't surface as a 302 or a 401. Axios has a default redirect count limit of 5, but we don't want to follow any redirects at all; we should set the limit to 0 to identify the first redirection. Weirdly, Axios claims it's not possible to ignore redirects due to browser limitations, yet the fetch API does provide a redirect property.

But we still need to bust the cache for /. One idea would be to look at the Location header, then set window.location = response.headers['Location']. That'll trigger the auth login, but when we end up back at ZWave JS, we'll be looking at /api/auth-enabled. Presumably that'll fail to load. That endpoint could look for Accept: text/html and redirect, but that's very messy.

Another option is to catch that 302 redirect, then do location.search = `?auth=${Math.random()}` (a template literal, so it needs backticks). That busts the SW cache and gets us back into ZWaveJS UI successfully. This has worked in my current testing and it should handle all types of auth providers including delegating ones such as mine. I don't have a working change because I need to figure out how to handle this Axios issue, but this seems to be the least invasive approach I've found.
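
A minimal sketch of the cache-busting idea, for illustration only. `cacheBustSearch` is a hypothetical helper, not code from the project; it keeps any existing query parameters and only replaces the `auth` value. Whether a changed query string really bypasses the cached page depends on how the service worker matches navigation requests; the thread only reports that it did in testing.

```javascript
// Hypothetical helper: build a query string that differs from the current one
// by a fresh `auth` value while preserving any other parameters.
function cacheBustSearch(currentSearch, token = Date.now().toString(36)) {
  const params = new URLSearchParams(currentSearch);
  params.set('auth', token);
  return `?${params.toString()}`;
}

// In the browser, once the auth redirect has been detected:
//   window.location.search = cacheBustSearch(window.location.search);
```

Assigning to `window.location.search` triggers a full page navigation, which is what gives the reverse proxy a chance to send the browser to the login page.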

robertsLando commented 6 months ago

Another option is to catch that 302 redirect, then do location.search = `?auth=${Math.random()}`. That busts the SW cache and gets us back into ZWaveJS UI successfully

Let me try this

robertsLando commented 6 months ago

Wait for this and let me know: https://github.com/zwave-js/zwave-js-ui/actions/runs/7581339064

Daniel-dev22 commented 6 months ago

Wait for this and let me know: https://github.com/zwave-js/zwave-js-ui/actions/runs/7581339064

That's working for me now! Awesome.

robertsLando commented 6 months ago

@Daniel-dev22 Thanks for your feedback!

@ajacques what about you?

ajacques commented 6 months ago

Almost. I get a CORS failure instead of a 302, which doesn't trigger the error handler, so it never gets a chance to trigger the reload. Any auth provider that delegates to another identity provider like Google or GitHub will hit this issue. I can enable CORS on my provider, but can't enable it on Google or GitHub. The browser doesn't seem to give a good error message when it encounters this case. It doesn't give a 301/401 as the code is currently checking for. I'd really like to identify the 301/302 on the first redirect to know for sure it's a redirect. That would mean setting maxRedirects: 0 and checking whether the first response code is 302 instead of following redirects, but this doesn't work in Axios because it uses the XHR API, which can't block redirects.

This is what I came up with: https://github.com/ajacques/zwave-js-ui/commit/9f5edb1ca2c29f37ced56ed4a8d48ea8d7728c37 I don't like having this inconsistent usage of fetch and axios though.
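
For readers who don't want to open the commit, here is a hedged sketch of the general technique: using fetch with redirect: 'manual' so the first redirect can be detected instead of followed. It is not the code from the linked commit. `needsExternalLogin` and the injected `fetchImpl` parameter are hypothetical names, and treating a 401 as "login needed" is an assumption based on the Authelia behaviour described earlier in the thread.

```javascript
// Hypothetical helper: returns true when the auth check was answered with a
// redirect or a 401 instead of the application's own response.
async function needsExternalLogin(fetchImpl, url = '/api/auth-enabled') {
  const response = await fetchImpl(url, {
    redirect: 'manual', // do not follow redirects to the login page
    credentials: 'same-origin',
  });
  // Browsers report an unfollowed redirect as a filtered response with
  // type 'opaqueredirect' and status 0; other runtimes may expose the 3xx.
  const redirected =
    response.type === 'opaqueredirect' ||
    (response.status >= 300 && response.status < 400);
  return redirected || response.status === 401;
}

// In the browser:
//   if (await needsExternalLogin((url, init) => fetch(url, init))) {
//     /* trigger a full page navigation */
//   }
```

Because the redirect is never followed, no cross-origin request is made to the identity provider, so the CORS failure described above should not occur on this call.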

robertsLando commented 6 months ago

@ajacques I see there is a work in progress but no clue when/if it will ever be merged: https://github.com/axios/axios/pull/5146

Anyway I checked your solution and yeah I agree it's not the best to use a mix of axios and fetch but I think it's worth to accept that as it fixes your issue, maybe I could also completely drop axios

krissfr commented 6 months ago

it's ok for me with my reverse proxy (Caddy V2)

robertsLando commented 6 months ago

Try with this once it ends: https://github.com/zwave-js/zwave-js-ui/actions/runs/7610315144 to see if it works even for you @krissfr and @ajacques

Once confirmed I will merge it

raman325 commented 6 months ago

@robertsLando I am using Caddy as a reverse proxy with Authentik for forward auth. I tried the image tagged test and I am still getting a CORS error: /#/:1 Access to XMLHttpRequest at 'https://auth.example.com/application/o/authorize/?client_id=<CLIENT_ID>&redirect_uri=https%3A%2F%2Fzjsui.example.com%2Foutpost.goauthentik.io%2Fcallback%3FX-authentik-auth-callback%3Dtrue&response_type=code&scope=profile+email+openid+ak_proxy&state=<STATE>' (redirected from 'https://zjsui.example.com/api/auth-enabled') from origin 'https://zjsui.example.com' has been blocked by CORS policy: No 'Access-Control-Allow-Origin' header is present on the requested resource.

ajacques commented 6 months ago

@robertsLando zwave-js-ui: 9.7.1.31e37ce works for me

@raman325 Are you running 9.7.1.31e37ce? What browser is this? Did you hard-refresh to load the latest version of the code before testing? Can you screenshot the network tab or share the request/responses (minus any sensitive info)? The latest commit should not be following any redirects, so there shouldn't be a CORS error.

raman325 commented 6 months ago

I must not have done a hard refresh; it appears to be working now. Versions as follows:

zwave-js-ui: 9.7.1.31e37ce
zwave-js: 12.4.1
Google Chrome Version 120.0.6099.234 (Official Build) (arm64)

robertsLando commented 6 months ago

Ok, seems we are done now 🎉 Thanks so much @ajacques for the tips and @raman325 @krissfr for the feedback!

ajacques commented 6 months ago

Awesome! Thanks for fixing this issue!