bjowes / cypress-ntlm-auth

Windows authentication plugin for Cypress
MIT License

When used behind a corporate proxy, cypress-ntlm-auth prevents the upload of the Test Replay archive to the Cypress Dashboard #244

Closed gabi-dobritescu closed 5 months ago

gabi-dobritescu commented 10 months ago

Issue: When used behind a corporate proxy, cypress-ntlm-auth prevents the upload of the Test Replay archive to the Cypress Dashboard.

Details: When recording results to the Cypress Dashboard (https://cloud.cypress.io/), the cypress-ntlm runner prevents the upload of the Test Replay archive, while the normal cypress runner works just fine.

Running the command: npx cypress-ntlm run --record --key <project_key> results in errors while uploading the test replay archive.

Running the command: npx cypress run --record --key <project_key> works fine, successfully uploading the test replay archive to the dashboard.

Important to note that uploading screenshots or the video recording of the test run succeeds when using the cypress-ntlm runner. It's just the upload of the test replay archive that fails.

Also, uploading the test replay archive succeeds when using the cypress-ntlm runner without a corporate proxy.

Further details: I've tried a number of different setups.

A) No additional configuration

When doing a test run with the cypress-ntlm runner without any additional configuration, the error message looks like this:

- Test Replay - Failed Uploading 2/2 - request to https://capture.cypress.io/upload/a9dsjw/70d6724a-2383-47a8-9ff7-83bbe415e6a2/849f735b-2a51-4c44-b472-36c0a5f6918e.tar?AWSAccessKeyId=AKIAIGH7VO3KJJU4LBGQ&Content-Type=application%2Fx-tar&Expires=1700144017&Signature=5N2n%2BJ82hay0g5FXfmIMKGIAlEU%3D failed, reason: Client network socket disconnected before secure TLS connection was established

This debug message is also logged to the console: cypress-ntlm-auth: Certificate validation failed for "capture.cypress.io". ECONNRESET

B) With NODE_TLS_REJECT_UNAUTHORIZED=0

Doing a run with the environment variable NODE_TLS_REJECT_UNAUTHORIZED=0 set results in the same error, but the warning message about the certificate is no longer shown.

- Test Replay - Failed Uploading 2/2 - request to https://capture.cypress.io/upload/a9dsjw/70d6724a-2383-47a8-9ff7-83bbe415e6a2/849f735b-2a51-4c44-b472-36c0a5f6918e.tar?AWSAccessKeyId=AKIAIGH7VO3KJJU4LBGQ&Content-Type=application%2Fx-tar&Expires=1700144017&Signature=5N2n%2BJ82hay0g5FXfmIMKGIAlEU%3D failed, reason: Client network socket disconnected before secure TLS connection was established

C) With HTTP_PROXY

If, in addition to the NODE_TLS_REJECT_UNAUTHORIZED environment variable, I also set the HTTP_PROXY environment variable to the URL of our corporate proxy, then the test run fails with the following error messages:

We encountered an unexpected error communicating with our servers.
RequestError: Error: unable to verify the first certificate

We will retry 3 more times in 30 seconds...

We encountered an unexpected error communicating with our servers.
RequestError: Error: unable to verify the first certificate

We will retry 2 more times in 1 minute...

We encountered an unexpected error communicating with our servers.
RequestError: Error: unable to verify the first certificate

We will retry 1 more time in 2 minutes...

We encountered an unexpected error communicating with our servers.
RequestError: Error: unable to verify the first certificate

Cypress could not execute tests

Could not find Cypress test run results

I've tried exporting the cypress.io certificate and setting the NODE_EXTRA_CA_CERTS environment variable, but that did not change the results. That said, I'm not confident that I've done everything correctly.
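For anyone double-checking a NODE_EXTRA_CA_CERTS setup like the one described here, a small Node script can at least confirm that the exported file parses and show what it contains. This is only a sketch: the function names and the file path are hypothetical, and it assumes Node 15.6 or later for crypto.X509Certificate. Node reads NODE_EXTRA_CA_CERTS only when the process starts. Behind a TLS-intercepting proxy, the certificate that usually needs to be trusted is the proxy's issuing CA rather than the cypress.io leaf certificate, but that is an assumption about this network, not something the logs confirm.

```javascript
const fs = require("fs");
const { X509Certificate } = require("crypto");

// Split a PEM bundle into its individual certificate blocks.
function splitPemCertificates(pemText) {
  const re = /-----BEGIN CERTIFICATE-----[\s\S]*?-----END CERTIFICATE-----/g;
  return pemText.match(re) || [];
}

// Parse every certificate in the file and report the fields that matter
// when deciding whether it is a usable CA certificate.
// Throws if the file is unreadable or a block is not a valid certificate.
function describeExtraCaFile(path) {
  const blocks = splitPemCertificates(fs.readFileSync(path, "utf8"));
  return blocks.map((pem) => {
    const cert = new X509Certificate(pem);
    return {
      subject: cert.subject,
      issuer: cert.issuer,
      isCa: cert.ca, // true only for CA certificates
      validTo: cert.validTo,
    };
  });
}

// Hypothetical usage (the path is a placeholder):
// console.log(describeExtraCaFile("C:\\certs\\corporate-root.pem"));
```

If the file yields zero blocks, or the only entry has isCa set to false, the exported file is probably not the certificate Node needs in order to trust the chain.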

Important to note that in all 3 scenarios using the cypress runner successfully uploads the test replay archive to the Cypress Dashboard.

Environment:
OS: Win10
Node: 16.13.0
Latest version for cypress and cypress-ntlm-auth at the time of this writing.

Steps to reproduce:
1) Clone the test repo here: https://github.com/gabi-dobritescu/cypress-test-replay
2) npm install
3) Link the project to the Cypress Dashboard (You can get a free account for the dashboard. Let me know if you need help with this step)
4) npx cypress-ntlm --record --key

I've attached the debug output for scenario A). It is the output from running with set DEBUG=cypress:*. Let me know if you'd like to narrow the logs down (they are quite noisy). cypress-test-replay-debug-logs.txt

Happy to provide any additional details or help with the investigation.

Many thanks.

bjowes commented 10 months ago

Hi @gabi-dobritescu, First, thanks for the very detailed report!

I haven't tried running this locally yet, some questions first:

gabi-dobritescu commented 10 months ago

I take your point about the HTTP_PROXY/HTTPS_PROXY environment variables being necessary.

I've done a run with the normal Cypress runner and from the logs, it looks like Cypress is picking up the proxy URLs from the npm config files:

cypress:server:util:proxy found proxy environment variables { npm_config_proxy: '<redacted>', npm_config_https_proxy: '<redacted>', npm_config_noproxy: '' } +0ms
cypress:server:util:proxy using npm's npm_config_proxy as HTTP_PROXY +1ms
cypress:server:util:proxy using npm's npm_config_https_proxy as HTTPS_PROXY +0ms
cypress:server:util:proxy setting default NO_PROXY of `` +0ms
cypress:server:util:proxy <-loopback> not found, adding localhost to NO_PROXY +0ms
cypress:server:util:proxy normalized proxy environment variables { NO_PROXY: '127.0.0.1,::1,localhost', HTTP_PROXY: '<redacted>', HTTPS_PROXY: '<redacted>' } +0ms

This would explain how the cypress runner successfully connects to cypress.io.
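The mapping those log lines describe can be sketched roughly as follows. This is not Cypress's actual code: normalizeProxyEnv is a hypothetical name, the precedence between the variables is an assumption, and the handling of an existing <-loopback> entry is a guess. It only reproduces what the log output shows: npm's settings are used when HTTP_PROXY/HTTPS_PROXY are absent, and loopback hosts are appended to NO_PROXY.

```javascript
// Rough sketch of the normalization visible in the debug log above.
function normalizeProxyEnv(env) {
  const out = {};

  // Fall back to npm's configuration when the standard variables are unset.
  out.HTTP_PROXY = env.HTTP_PROXY || env.http_proxy || env.npm_config_proxy;
  out.HTTPS_PROXY = env.HTTPS_PROXY || env.https_proxy || env.npm_config_https_proxy;

  const raw = env.NO_PROXY || env.no_proxy || env.npm_config_noproxy || "";
  const entries = raw.split(",").map((s) => s.trim()).filter(Boolean);

  // Assumption: unless the "<-loopback>" marker is present, local
  // addresses are excluded from proxying.
  if (!entries.includes("<-loopback>")) {
    for (const host of ["127.0.0.1", "::1", "localhost"]) {
      if (!entries.includes(host)) entries.push(host);
    }
  }
  out.NO_PROXY = entries.join(",");
  return out;
}

// Hypothetical usage:
// console.log(normalizeProxyEnv(process.env));
```

A runner that does not perform this fallback sees no proxy at all unless HTTP_PROXY and HTTPS_PROXY are set explicitly.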

Still, I'm not sure how the cypress-ntlm runner manages to connect to cypress.io without the HTTP_PROXY environment variables. But it surely does.

gabi-dobritescu commented 10 months ago

I've also done another run setting all the proxy related environment variables:

set NO_PROXY=127.0.0.1,localhost
set HTTP_PROXY=<redacted>
set HTTPS_PROXY=<redacted>
set NODE_TLS_REJECT_UNAUTHORIZED=0

The results of the run are identical to scenario C) from my initial report. I've attached the logs from this run as well: debug-logs-with-proxy-settings.txt

Again, executing the tests with the normal cypress runner completes successfully.

gabi-dobritescu commented 10 months ago

So for me the best results I get are with the setup described in scenarios A) and B). In those setups the tests complete successfully and at least the video recording is being uploaded to the Cypress Dashboard.

When I set the HTTP_PROXY environment variables the tests don't execute at all.

If you look through the log file I attached to the initial report you'll see this error message:

cypress:server:record failed to upload artifact {
  file: undefined,
  url: 'https://capture.cypress.io/upload/<redacted>/bd4ecfae-c746-459c-94e7-52f7616fa485/26ed5892-676f-43fd-b5f3-83c5acbfa335.tar?AWSAccessKeyId=AKIAIGH7VO3KJJU4LBGQ&Content-Type=application%2Fx-tar&Expires=1700149947&Signature=Jzji0XsUPwAzSsuxoLF6bKdjScc%3D',
  stack: 'AggregateError: request to https://capture.cypress.io/upload/<redacted>/bd4ecfae-c746-459c-94e7-52f7616fa485/26ed5892-676f-43fd-b5f3-83c5acbfa335.tar?AWSAccessKeyId=AKIAIGH7VO3KJJU4LBGQ&Content-Type=application%2Fx-tar&Expires=1700149947&Signature=Jzji0XsUPwAzSsuxoLF6bKdjScc%3D failed, reason: Client network socket disconnected before secure TLS connection was established\n' +
    ' at r (<embedded>:4436:56560)\n' +
    ' at process.processTicksAndRejections (node:internal/process/task_queues:95:5)\n' +
    ' at async r (<embedded>:4436:56515)\n' +
    ' at async r (<embedded>:4436:56515)\n' +
    ' at async r (<embedded>:4436:56515)\n' +
    ' at async r (<embedded>:4436:56515)\n' +
    ' at async r (<embedded>:4436:56515)\n' +
    ' at async r (<embedded>:4436:56515)\n' +
    ' at async r (<embedded>:4436:56515)\n' +
    ' at async A.uploadCaptureArtifact (<embedded>:4436:56605)\n' +
    ' at async <embedded>:4526:96292'
} +1m

The error message seems to reference this Cypress method: uploadCaptureArtifact. From what I can tell, this is the method responsible for handling the upload of the test replay archive.

The error message "Client network socket disconnected before secure TLS connection was established" seems to be common when interacting with AWS Lambdas. The presence of the "AWSAccessKeyId" query param in the URL makes me think that Cypress might be using AWS Lambdas to store these archives in the cloud.

If that is the case, it seems this error might be caused by an unhandled promise. See this comment on an AWS Lambda issue: unhandled promise

I'm not sure how the cypress-ntlm-auth library interfaces with the uploadCaptureArtifact method, but from my debugging and investigation this seems like a likely place for where the issue is coming from.

I hope I'm not setting you on a wild goose chase 😄.

Let me know if I can provide any additional information.

bjowes commented 10 months ago

I can't really get my head around how this all fits together. The simple fact that ntlm-proxy is able to reach example.cypress.io and api.cypress.io without any proper HTTP_PROXY/HTTPS_PROXY settings indicates that your corporate proxy isn't strictly needed to access the internet - otherwise these requests would just go nowhere.

Could you please investigate your proxy environment using curl? I think it is available by default with Windows nowadays. Test these:

There are some differences in the behaviour of ntlm-proxy when a corporate proxy is configured. When no such proxy is configured, all https requests to domains where no NTLM auth has been set up will be tunneled directly to the target for better performance. In the corporate proxy scenario this is not possible, since the traffic must first tunnel through the corporate proxy. My best guess so far is that capture.cypress.io does not play nicely with the kind of tunnel that ntlm-proxy uses.
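For readers unfamiliar with what "tunnel through the corporate proxy" means in practice, here is a minimal sketch of a generic HTTP CONNECT tunnel using Node's http and tls modules. It is not the ntlm-proxy implementation: the function names are hypothetical, it assumes a plain-HTTP proxy endpoint without authentication, and the proxy URL in the usage comment is a placeholder.

```javascript
const http = require("http");
const tls = require("tls");

// Build the request options for an HTTP CONNECT to the proxy.
// Assumes the proxy itself is reached over plain HTTP.
function buildConnectOptions(proxyUrl, targetHost, targetPort) {
  const proxy = new URL(proxyUrl);
  const authority = `${targetHost}:${targetPort}`;
  return {
    host: proxy.hostname,
    port: Number(proxy.port) || 80,
    method: "CONNECT",
    path: authority, // CONNECT uses host:port as the request target
    headers: { Host: authority },
  };
}

// Ask the proxy for a raw tunnel, then run the TLS handshake with the
// real target through that tunnel. Resolves with the secured socket.
function tunnelThroughProxy(proxyUrl, targetHost, targetPort = 443) {
  return new Promise((resolve, reject) => {
    const req = http.request(buildConnectOptions(proxyUrl, targetHost, targetPort));
    req.once("error", reject);
    req.once("connect", (res, socket) => {
      if (res.statusCode !== 200) {
        socket.destroy();
        reject(new Error(`proxy refused CONNECT: ${res.statusCode}`));
        return;
      }
      // Certificate validation happens here, against targetHost.
      const secure = tls.connect({ socket, servername: targetHost }, () => resolve(secure));
      secure.once("error", reject);
    });
    req.end();
  });
}

// Hypothetical usage (placeholder proxy URL):
// tunnelThroughProxy("http://proxy.example:8080", "capture.cypress.io")
//   .then((s) => { console.log("authorized:", s.authorized); s.end(); })
//   .catch((err) => console.error(err.message));
```

If the proxy closes the socket after accepting the CONNECT, or answers the handshake with something other than TLS, the failure surfaces in the tls.connect step. That would be consistent with the "socket disconnected before secure TLS connection was established" message, though the thread does not confirm this is what happens.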

Also, did you see this issue? I would guess that this is another case where the user is behind a corporate proxy, and capture.cypress.io won't play. If that is the case, cypress would then be using a similar tunnelling strategy as ntlm-proxy, which may be why it fails. Sadly no activity on that issue.

HTTP_PROXY and HTTPS_PROXY - since you redact the values I need to ask, are they set to the same thing? I saw in your logs that there were npm environment variables for both http and https, were they also the same? Just wanted to make sure that these are correctly mapped.

Something I am missing to ease troubleshooting these corporate proxy issues is logging to show what proxy configuration the ntlm-proxy is actually using - just to double check that it follows the expected config. You can get those logs yourself: locate cypress-ntlm-auth in your node_modules folder, modify /dist/proxy/main.js, and add some additional debug printouts after

                this._debug.log("Startup done!");
                this._debug.log(ports);

to log the contents of httpProxy, httpsProxy and noProxy.

bjowes commented 10 months ago

I just tested running curl to capture.cypress.io by passing it through ntlm-proxy. For me, it gives the very same result as without proxying it. Sure, it's not the same request as when actually trying to upload the test replay, but the scenario for validating the certificate of capture.cypress.io is actually identical since that is performed with only the hostname:port. And apparently the cert validation succeeds here.
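A certificate check that needs only hostname:port can also be reproduced from Node, which is closer to what ntlm-proxy and Cypress run on than curl is. The sketch below uses tls.connect with rejectUnauthorized disabled so that the handshake completes and the verdict can be inspected instead of thrown. The function names are hypothetical and this is not how ntlm-proxy performs its validation.

```javascript
const tls = require("tls");

// Connect to host:port, complete the TLS handshake, and report whether
// Node would have trusted the presented certificate.
function probeCertificate(host, port = 443, timeoutMs = 10000) {
  return new Promise((resolve, reject) => {
    const socket = tls.connect(
      { host, port, servername: host, rejectUnauthorized: false },
      () => {
        const cert = socket.getPeerCertificate();
        resolve({
          authorized: socket.authorized,
          error: socket.authorizationError ? String(socket.authorizationError) : null,
          subject: cert && cert.subject ? cert.subject.CN : undefined,
          issuer: cert && cert.issuer ? cert.issuer.CN : undefined,
        });
        socket.end();
      }
    );
    socket.setTimeout(timeoutMs, () => socket.destroy(new Error("TLS probe timed out")));
    socket.once("error", reject);
  });
}

// Turn a probe result into a one-line summary.
function summarizeProbe(host, result) {
  return result.authorized
    ? `${host}: certificate OK (issuer: ${result.issuer})`
    : `${host}: certificate NOT trusted (${result.error}; issuer: ${result.issuer})`;
}

// Hypothetical usage:
// probeCertificate("capture.cypress.io")
//   .then((r) => console.log(summarizeProbe("capture.cypress.io", r)))
//   .catch((err) => console.error("handshake failed:", err.message));
```

An issuer that belongs to the company rather than a public CA would suggest a TLS-intercepting proxy. A rejected promise means the handshake itself never completed.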

gabi-dobritescu commented 10 months ago

I've done the additional investigations. And it looks like I can access urls on cypress.io without passing through the proxy (which is very surprising).

Here are the curl command outputs:

curl -v https://api.cypress.io curl-no-proxy-api.txt

curl -v https://capture.cypress.io curl-no-proxy-capture.txt

curl -v https://example.cypress.io curl-no-proxy-example.txt

curl -v -x [insert your https proxy url here] https://api.cypress.io curl-with-proxy-api.txt

curl -v -x [insert your https proxy url here] https://api.cypress.io curl-with-proxy-capture.txt

I've also added the extra debug logs as you suggested:

                this._debug.log("Startup done!");
                this._debug.log("httpProxy: " + httpProxy);
                this._debug.log("httpsProxy: " + httpsProxy);
                this._debug.log("noProxy: " + noProxy);
                this._debug.log(ports);

When no HTTP_PROXY environment variables are set the internal variables inside cypress-ntlm-auth are set to undefined. You can see that in this log file (after the Startup done! log entry): no-proxy-debug-logs.txt

When the HTTP_PROXY, HTTPS_PROXY and NO_PROXY environment variables are set the internal variables have the same values as the environment variables: proxy-set-debug-logs.txt.

Let me know if this brings more clarity into what's going on.

bjowes commented 10 months ago

Thanks for the logs. I see some rather unexpected behaviour in them.

It seems that in your network, some corporate proxy is active regardless of whether we try to proxy the traffic. Likely everything is routed through it anyway. This explains why it works without any proxy settings. However, I can also see that there is no HTTPS traffic at all. I presume the corporate proxy takes care of the HTTPS traffic to the cypress servers, and terminates the HTTPS at the proxy. Usually these proxies then provide a faked internal certificate for the target and serve HTTPS internally with this cert. This means that for the client, it appears as if it is communicating with the actual server. But in your case, no TLS negotiation takes place and no certificate is sent. If you browse to example.cypress.io, do you get a padlock in the web browser, indicating a certificate?

As a reference, here is my curl call to api.cypress.io (I added the --http1.1 flag to mimic your case). As you can see, there is quite a lot of traffic regarding certificates.

curl -v --http1.1 https://api.cypress.io
*   Trying 104.22.10.239:443...
* Connected to api.cypress.io (104.22.10.239) port 443 (#0)
* ALPN: offers http/1.1
* (304) (OUT), TLS handshake, Client hello (1):
*  CAfile: /etc/ssl/cert.pem
*  CApath: none
* (304) (IN), TLS handshake, Server hello (2):
* (304) (IN), TLS handshake, Unknown (8):
* (304) (IN), TLS handshake, Certificate (11):
* (304) (IN), TLS handshake, CERT verify (15):
* (304) (IN), TLS handshake, Finished (20):
* (304) (OUT), TLS handshake, Finished (20):
* SSL connection using TLSv1.3 / AEAD-AES256-GCM-SHA384
* ALPN: server accepted http/1.1
* Server certificate:
*  subject: C=US; ST=California; L=San Francisco; O=Cloudflare, Inc.; CN=cypress.io
*  start date: Apr  5 00:00:00 2023 GMT
*  expire date: Apr  4 23:59:59 2024 GMT
*  subjectAltName: host "api.cypress.io" matched cert's "*.cypress.io"
*  issuer: C=US; O=Cloudflare, Inc.; CN=Cloudflare Inc ECC CA-3
*  SSL certificate verify ok.
* using HTTP/1.1
> GET / HTTP/1.1
> Host: api.cypress.io
> User-Agent: curl/8.1.2
> Accept: */*
> 
< HTTP/1.1 404 Not Found

If my speculation is correct, that no cert is sent, it is not surprising that the cert validation fails. I don't see any trivial way to handle this; the ntlm-proxy is written to presume that https connections shall indeed use https. It is interesting that some connections still work like that (api.cypress.io and example.cypress.io), I would not really expect them to. Please investigate on your side whether your corporate proxy really disables https internally, and whether there is some way to have it enabled for your traffic. One theory is that there might be a specific endpoint on your corporate proxy for https traffic and you need to set HTTPS_PROXY to that, but this is just a hunch.

bjowes commented 8 months ago

Any update on this?

gabi-dobritescu commented 8 months ago

This is still an issue for me. I haven't had time to further investigate what's going on, but I would like to get this working sometime soon.

My next step will be to engage the network team at my company to help me make sense of the https traffic.

bjowes commented 5 months ago

Closing, will reopen if there is more input

bjowes commented 5 months ago

@gabi-dobritescu
Please retry this with Cypress 13.9.0. I managed to get a PR approved that could help with Cypress cloud connections.