smallstep / cli

🧰 A zero trust swiss army knife for working with X509, OAuth, JWT, OATH OTP, etc.
https://smallstep.com/cli
Apache License 2.0

[Bug]: Windows Chrome version 123.0.6312.59 (Official Build) (64-bit) incompatibility #1131

Closed Nogal closed 6 months ago

Nogal commented 6 months ago

Steps to Reproduce

Run step ssh login with an OIDC provisioner and log in through your SSO provider. The login itself succeeds, but the command then times out.

Everything works until the callback to the client (in our case on port 10000), where it hangs and the client produces the error below.

Your Environment

Expected Behavior

nogal@madison:~$ step ssh login --provisioner=My-OIDC-Provider-Name-Was-Here 
✔ Provisioner: My-OIDC-Provider-Name-Was-Here (OIDC) [client: ABCD1234]
Your default web browser has been opened to visit:

https://my-regular-super-secret-sso-stuff-here

✔ CA: https://step.example.com
✔ SSH Agent: yes
nogal@madison:~$ 

Actual Behavior

✔ Provisioner: My-OIDC-Provider-Name-Was-Here (OIDC) [client: ABCD1234]
Your default web browser has been opened to visit:

https://my-regular-super-secret-sso-stuff-here

2024/03/21 08:41:12 httptest.Server blocked in Close after 5 seconds, waiting for connections:
  *net.TCPConn 0xc00007c960 127.0.0.1:50112 in state active

Additional Context

This appears to be isolated to the latest Chrome release. With Edge or Firefox the flow still works properly. One user reported the same behavior with the Brave browser, which is also Chromium-based, but we're unsure of the details.

We were not able to reproduce this issue on Linux.


hslatman commented 6 months ago

Hey @Nogal,

I have not been able to reproduce this on Windows yet, although I have to add that I'm using a slightly different binary compared to yours. It's the same version, but it reports as a 32-bit emulated build, so it might not have the exact same behavior.

It's possible that Chrome made changes to the behavior for interacting on/with private network segments. I've seen an issue report in a different project for a similar use case, specifically on Chrome (and Edge) v123. They suggested setting chrome://flags/#block-insecure-private-network-requests to disabled as a workaround. It might be worthwhile trying that if that's an option for you.
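For context, the flag name refers to requests that cross from a more public address space into a more private one. The Go sketch below only illustrates that kind of classification; the helper `addressSpace` and its labels are made up for this example, and it is not how Chrome implements the check.

```go
package main

import (
	"fmt"
	"net/netip"
)

// addressSpace is a hypothetical helper that sorts an IP address into a
// rough "loopback", "private", or "public" bucket. The labels are
// illustrative and do not come from Chrome or from the step CLI.
func addressSpace(s string) (string, error) {
	addr, err := netip.ParseAddr(s)
	if err != nil {
		return "", err
	}
	addr = addr.Unmap() // treat IPv4-mapped IPv6 addresses as IPv4
	switch {
	case addr.IsLoopback():
		return "loopback", nil // e.g. 127.0.0.1, where the CLI callback listens
	case addr.IsPrivate(), addr.IsLinkLocalUnicast():
		return "private", nil // e.g. RFC 1918 ranges
	default:
		return "public", nil
	}
}

func main() {
	for _, s := range []string{"127.0.0.1", "10.1.2.3", "8.8.8.8"} {
		space, err := addressSpace(s)
		if err != nil {
			fmt.Println(s, "error:", err)
			continue
		}
		fmt.Println(s, "->", space)
	}
}
```

Under this rough model, a page loaded from a public SSO host that redirects the browser to a loopback address is the kind of request such a browser policy could treat differently.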

The error that is reported seems to indicate there's an active connection, and thus closing the local server hosting the callback endpoint failed. It's likely that this is the connection from Chrome, but it's interesting that it doesn't seem to be closed on v123, whereas it is on older versions and different platforms, based on your remark that it seems isolated to this version and Windows.

Can you try running the command with GODEBUG set to http2debug=2 as an environment variable? And if it again stops in this state, and the program does not exit by itself, can you try sending it a SIGQUIT signal? Not sure if the latter will help at this time, but the debugging output should help with assessing if/what requests succeed and which do not.

dchen496 commented 6 months ago

Hi @hslatman

I'm having the same issue on Chrome v123 on macOS. Setting chrome://flags/#block-insecure-private-network-requests to disabled as you suggested fixed the problem for me for now. I'd try to send more logs, but oddly when I change the setting to enabled it still works.

Nogal commented 6 months ago

@hslatman we can also report that a few of our Windows users have tried the setting change you suggested and reported that it worked around the issue.

Most users have switched to Firefox and it would be... an effort... to say the least to get them to switch back.... but we reached out to our more technical users and it does appear to work around the issue for them.

This all being said, "insecure" private network requests seems a little "off" to me. Would that indicate that something's not 100% correct in the CA's certificate trust being passed into the Schannel used by Chrome on Windows?

Looking into mine, I see it's providing a full chain without any issues, so it doesn't look like an issue with the HTTP config itself... and apart from our endpoint being on a standard RFC 1918 local IP address range, there's really nothing special marking this connection as "local"; it's going through our regular BIND servers.

hslatman commented 6 months ago

@Nogal it's (most likely) not the connection to the CA that is failing, but it's the redirect (callback) for the OAuth2 flow that the browser is instructed to perform towards http://127.0.0.1:10000 (hosted by the CLI) that is not (fully) completed.

It's possible that the flow does in fact complete, but the CLI then fails to stop the HTTP server it serves on port 10000 because the connection from the browser is still open, and thus fails completely. Haven't been able to reproduce yet, but I think I can try some more sometime later today.

hslatman commented 6 months ago

@dchen496 wrote: "I'd try to send more logs, but oddly when I change the setting to enabled it still works."

If it still works as expected after changing the flag back to enabled, it sounds like some state w.r.t. "safe connections" is being kept around. I suspect incognito mode might retrigger the behavior.

hslatman commented 6 months ago

@Nogal, @dchen496,

I have a potential fix in the works: https://github.com/smallstep/cli/pull/1136. One of my colleagues was able to reproduce, and a build from that branch seems to fix the issue. It would be great if you could verify.

hslatman commented 6 months ago

@Nogal, @dchen496

I've tagged an RC: https://github.com/smallstep/cli/releases/tag/v0.25.3-rc3 and the builds are available now.

We'll likely release our v0.26.0 soon. This should help in the meantime.

Nogal commented 6 months ago

@hslatman I can confirm my users are reporting that the rc3 build does appear to fix the issue.

hslatman commented 6 months ago

v0.26.0 with the fix is out: https://github.com/smallstep/cli/releases/tag/v0.26.0