cypress-io / cypress

Fast, easy and reliable testing for anything that runs in a browser.
https://cypress.io
MIT License

Detect and Recover when Browser Hangs/Crashes/Dies #22631

Open emilyrohrbough opened 2 years ago

emilyrohrbough commented 2 years ago

Current behavior

Cypress does not handle browser tab crashes, hanging browsers, or browsers that die unexpectedly. This causes Cypress to hang indefinitely until the process is manually stopped or CI times out.

Desired behavior

Cypress should handle tab crashes and time out on browser hangs.

The quick(er) fix will be to fail the current test and pick up the next test, so that reporting is still provided for the tests that were able to run. The ideal solution would be to re-attempt the test that experienced the crash, to reduce test flake and CI costs for users and/or to help identify memory issues within the code under test.
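
As a rough illustration of the two options above (fail and move on, or re-attempt after a crash), here is a minimal sketch. Every name in it (`runAll`, `runTest`, `relaunchBrowser`, `BrowserCrashError`, `crashRetries`) is hypothetical and not part of Cypress; it only assumes that a crash can be surfaced as a distinct error type.

```javascript
// Hypothetical sketch; none of these names exist in Cypress.
class BrowserCrashError extends Error {}

// runTest(test): async, throws BrowserCrashError if the browser dies mid-test.
// relaunchBrowser(): async, starts a fresh browser/tab after a crash.
async function runAll(tests, { runTest, relaunchBrowser, crashRetries = 1 }) {
  const results = [];
  for (const test of tests) {
    let attempt = 0;
    for (;;) {
      try {
        await runTest(test);
        results.push({ test, status: 'passed', attempts: attempt + 1 });
        break;
      } catch (err) {
        if (!(err instanceof BrowserCrashError)) {
          // Ordinary test failure: report it and move on.
          results.push({ test, status: 'failed', attempts: attempt + 1 });
          break;
        }
        // The browser is gone, so a new one is needed for whatever runs next.
        await relaunchBrowser();
        if (attempt >= crashRetries) {
          // Quick fix path: fail this test, continue with the next one.
          results.push({ test, status: 'failed', attempts: attempt + 1, crashed: true });
          break;
        }
        attempt += 1; // Ideal path: re-attempt the crashed test.
      }
    }
  }
  return results;
}
```

With `crashRetries: 0` this reduces to the quick fix (fail and continue); a positive value gives the re-attempt behavior.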

Considerations to Keep in Mind

When the browser tab and/or instance is killed and re-launched, ensure we release the Node resources initially used, so that JS memory does not grow with each launch.

It would be great if there were a way to capture the crash reason to give users better information (e.g. the need to increase memory with shm_size, suggested as a solution for #6695).

Test code to reproduce (chrome)

Can manually reproduce in Chrome in https://github.com/cypress-io/cypress-test-tiny/tree/issue-22506

  1. Run npm run cypress:run-hang (enables browser debug logs with headed Chrome).
  2. The first spec runs. When cy.pause() starts, enter chrome://crash or chrome://hang in the URL bar to view the behavior.

If you run DEBUG=cypress* npm run cypress:run --browser chrome --headed, you can see the full log output, with process_profiling logging continuously while Cypress hangs.

Cypress Version

Happening since v4.2. Current Version 10.3.0

Existing Issues Around This Behavior:

Issues to Do This Work:

Bug Reports:

emilyrohrbough commented 2 years ago

Chrome Investigation

It appears that launcher/lib/browser logs the browser instance error but does nothing to make it available to the server/lib/browsers instance. That instance is the one that connects through the browser-cri-client to the chrome-remote-interface to listen to events, and that handles opening the browser, launching tabs, and standardizing how the browser instance is exited/killed across electron/firefox/chrome/edge.

The server/lib/browsers/chrome instance does not appear to listen for crash/hang messages, either to close the tab and reopen it or to restart the browser instance and continue the tests. Instead, Cypress hangs and keeps using resources (I have a running Cypress instance plus a crashed Chrome instance that has been up for 20 hours now). Because this is outside the scope of the mocha runner and we have no logic to time out when Cypress itself hangs, Cypress never times out. In CI it seems people either kill the process manually or the CI instance times out due to inactivity.
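
To make the missing piece concrete, here is a minimal sketch of surfacing a crash to whatever is awaiting the test. It assumes only that the CRI client is an event emitter that emits `'Inspector.targetCrashed'` (the CRI message that appears in the crash logs in this comment). `rejectOnCrash` is a hypothetical helper, not an existing Cypress function, and a plain Node `EventEmitter` stands in for the real client.

```javascript
const { EventEmitter } = require('node:events');

// Hypothetical helper: settles like `work`, unless the client reports a
// target crash first, in which case the returned promise rejects.
function rejectOnCrash(client, work) {
  return new Promise((resolve, reject) => {
    const onCrash = () => reject(new Error('browser target crashed'));
    const cleanup = () => client.off('Inspector.targetCrashed', onCrash);
    client.once('Inspector.targetCrashed', onCrash);
    work.then(
      (value) => { cleanup(); resolve(value); },
      (err) => { cleanup(); reject(err); },
    );
  });
}

// Usage with a stand-in client (an assumption: a real client would emit
// the event itself when the tab crashes).
const fakeClient = new EventEmitter();
rejectOnCrash(fakeClient, new Promise(() => {})) // work that never finishes
  .catch((err) => console.log(err.message));
fakeClient.emit('Inspector.targetCrashed', {});
```

The point of the sketch is that the crash becomes a rejection the runner can act on, instead of an event that is only logged.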

I have not tried to reproduce this on Firefox, but I suspect we have a similar issue. Total shot in the dark, but it may be related to the frequently observed "Firefox is unable to connect" issue: maybe Firefox is hanging and we aren't capturing the message to properly kill and restart the instance. Possible resource: https://github.com/bsmedberg/crashfirefox-intentionally

Puppeteer handles this by throwing a page crash error.

How to crash the Chrome browser

cypress:launcher:browsers:chrome stderr: [79726:259:0629/122233.586969:ERROR:chrome_debug_urls.cc(173)] Intentionally crashing (with null pointer dereference) because user navigated to chrome://crash/
cypress-verbose:server:browsers:cri-client:recv:[<--] received CRI message { method: 'Inspector.targetCrashed', params: {} }
cypress:server:browsers:chrome stderr: [32066:259:0630/090145.853211:ERROR:chrome_debug_urls.cc(199)] Intentionally hanging ourselves with sleep infinite loop because user navigated to chrome://hang/
no CRI message for hang
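
Since a hang produces no CRI message, it can only be detected by the absence of activity. Below is a minimal watchdog sketch under that assumption. `createHangWatchdog`, `beat`, and `stop` are hypothetical names, and what counts as a heartbeat (for example, any message received from the browser) is left open, because a legitimately long-running command could also be quiet for a while.

```javascript
// Hypothetical watchdog: calls onHang once if beat() is not called
// again within timeoutMs. Not an existing Cypress API.
function createHangWatchdog(timeoutMs, onHang) {
  let timer;
  let stopped = false;

  const arm = () => {
    if (stopped) return;
    clearTimeout(timer);
    timer = setTimeout(() => {
      stopped = true; // fire at most once
      onHang();
    }, timeoutMs);
  };

  arm();

  return {
    beat: arm, // call on every sign of life from the browser
    stop() {   // call when the run finishes normally
      stopped = true;
      clearTimeout(timer);
    },
  };
}
```

A caller would invoke `beat()` on each sign of life and treat `onHang` the same way as a crash: kill the browser, then fail or re-attempt the current test.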

Resources:

Chrome errors:

robrich7 commented 2 years ago

Hi @emilyrohrbough, thank you so much for checking out this issue! It has been with us for months and is very frustrating.

What I don't understand is that it works locally on my laptop with npx cypress run, but as soon as Cypress runs via a Docker image in a pipeline, these crashes occur. Can you please explain this to me?

robrich7 commented 2 years ago

@jennifer-shehane Hi Jennifer, can you please tell us if and when the problem will be fixed?

abezzubets commented 2 years ago

If you experience the issue with hanging tests please try disabling the Command Log: https://docs.cypress.io/guides/references/troubleshooting#Disable-the-Command-Log

It helped me solve the issue with hanging tests.

cosmith commented 2 years ago

> If you experience the issue with hanging tests please try disabling the Command Log: https://docs.cypress.io/guides/references/troubleshooting#Disable-the-Command-Log
>
> It helped me solve the issue with hanging tests.

Unfortunately, it didn't help for us.

pkalyan264 commented 1 year ago

Hey team, any updates or workarounds here?

SIGSTACKFAULT commented 1 year ago

I have the same problem, but in my case it's caused by some sort of nasty memory leak, which I have contrived a test to intentionally reproduce.

rasis2 commented 1 year ago

Hi, just checking if there's any progress on this issue?

pat-convex commented 1 year ago

Any news about this crashing, or any workaround?