Open airhorns opened 1 month ago
A different comment suggested that using the cypress/included Docker image would improve things -- I tried switching to it here but was still able to generate the same flake: https://github.com/gadget-inc/js-clients/actions/runs/9945955229/job/27475392300?pr=510 . I also think that's a bit of a red herring, as the browsers are successfully detected and tests pass on the base GitHub Actions runner for me as well. It just isn't reliable, which suggests some sort of non-deterministic issue.
We're experiencing the same issue. At first we were running e2e against chrome, firefox and edge. The edge job failed to connect, and after removing it from our matrix, firefox started to fail.
│ Cypress: 13.13.0
│ Browser: Firefox 128 (headless)
│ Node Version: v20.13.1 (/home/runner/runners/2.317.0/externals/node20/bin/node)
@airhorns
Thanks for your repo and tests! There is a bug hiding in there somewhere!
It's good that you also tried out the Cypress Docker image and were able to reproduce. That means the bug is independent of Linux variant (Ubuntu 22.04 / Debian 12.6) and independent of Node.js version (v18.x / v20.x).
As a data point, it's been relatively stable for us now that we're only using chrome. Wonder if some flakiness is introduced by running multiple GHA jobs with different browsers concurrently?
@airhorns were you running multiple browsers or just chrome?
Could it be a race condition related to the cache? 🤔
No multi-browser stuff, just using chrome for now. It also seems to happen if I retry the exact same job a few times, which I believe means the same cache would be used as input each time.
Hmm, so maybe a caching issue...
I'm not sure it's a caching issue -- the browser discovery seems to have no problem at all finding the browsers in both the working and failing cases. The only thing that stands out in the logs as strange to me is this:
cypress:launcher:browsers chrome stderr: [0715/191459.100182:ERROR:file_io_posix.cc(145)] open /sys/devices/system/cpu/cpu0/cpufreq/scaling_cur_freq: No such file or directory (2)
[0715/191459.100278:ERROR:file_io_posix.cc(145)] open /sys/devices/system/cpu/cpu0/cpufreq/scaling_max_freq: No such file or directory (2) +36ms
which suggests chrome is struggling to check that file. Maybe the flakiness is introduced by certain underlying hardware running the GHA docker container, where some CPU platforms work and some don't? This chrome bug tracker thread has some details: https://issues.chromium.org/issues/40189632
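To test the hardware theory, one option would be to record whether those sysfs files exist on each runner before Cypress starts, and then compare passing and failing runs. This is only a rough sketch: the function name `check_cpufreq_files` is made up for illustration, and the two paths are simply the ones from the stderr output above. It does not prove the missing files cause the launch failure; it just gives a data point per run.

```python
from pathlib import Path

# Paths copied from the Chrome stderr output quoted above.
CPUFREQ_FILES = (
    "/sys/devices/system/cpu/cpu0/cpufreq/scaling_cur_freq",
    "/sys/devices/system/cpu/cpu0/cpufreq/scaling_max_freq",
)


def check_cpufreq_files(paths=CPUFREQ_FILES):
    """Return a mapping of path -> True if the file exists on this machine.

    Hypothetical helper for illustration; not part of Cypress or Chrome.
    """
    return {path: Path(path).exists() for path in paths}


if __name__ == "__main__":
    # Print one line per file so the result shows up in the CI job log.
    for path, present in check_cpufreq_files().items():
        status = "present" if present else "MISSING"
        print(f"{status}  {path}")
```

If the files turn out to be missing on both passing and failing runs, that would suggest the stderr message is noise rather than the cause.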
Current behavior
Most of the time, Cypress is able to boot Chrome within a GitHub Actions runner and run tests. However, sometimes the run fails to boot Chrome and errors. This can happen with the exact same code running the exact same GitHub Actions YAML -- I just have to retry a few times to trigger this failure.
A failure looks like this:
Desired behavior
Cypress should always boot chrome, or if it can't for some reason, indicate why.
Test code to reproduce
I can't find what causes this to reliably reproduce -- it is very flaky. But I was able to capture an instance where it failed with DEBUG=cypress:* set. See that log for this OSS project here: https://github.com/gadget-inc/js-clients/actions/runs/9941229702/job/27473079273
Cypress Version
13.13.0
Node version
v18.19.1
Operating System
Ubuntu 22.04 within GitHub Actions
Debug Logs
Debug logs from run linked above: https://gist.github.com/airhorns/70757c597cd0f7fb540e06600a8fb4bc
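The debug output is long, so a small filter can help when comparing a passing run against a failing one. The sketch below pulls out only the Chromium-style ERROR lines, in the format seen in the chrome stderr excerpt from this thread. The regex and the function name `chromium_errors` are assumptions based on those two sample lines, not a documented log format, so it may miss lines shaped differently.

```python
import re

# Matches Chromium log prefixes such as
# "[0715/191459.100182:ERROR:file_io_posix.cc(145)] message".
# Pattern inferred from the sample lines in this thread (an assumption).
CHROMIUM_ERROR = re.compile(
    r"\[(?P<stamp>\d{4}/\d{6}\.\d+):ERROR:(?P<source>[^\]]+)\]\s*(?P<message>.*)"
)
# The debug logger appends a time delta like " +36ms" to some lines.
TRAILING_DELTA = re.compile(r"\s\+\d+ms$")


def chromium_errors(log_text):
    """Return (source, message) tuples for each Chromium ERROR line found."""
    errors = []
    for line in log_text.splitlines():
        match = CHROMIUM_ERROR.search(line)
        if match:
            message = TRAILING_DELTA.sub("", match.group("message")).strip()
            errors.append((match.group("source"), message))
    return errors


if __name__ == "__main__":
    import sys

    # Reads a saved debug log from stdin and prints the filtered errors.
    for source, message in chromium_errors(sys.stdin.read()):
        print(f"{source}: {message}")
```

Running this over the logs of a passing and a failing job and diffing the output would show whether any Chromium error appears only in the failing case.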
Other
For the exact same project on the exact same config, there was no flakiness when starting Electron, but there is now when starting Chrome.