microsoft / playwright

Playwright is a framework for Web Testing and Automation. It allows testing Chromium, Firefox and WebKit with a single API.
https://playwright.dev
Apache License 2.0

[BUG] page.goto: Timeout XXXms exceeded. No load or domcontentloaded event fired #12182

Open soloyal opened 2 years ago

soloyal commented 2 years ago

Context:

const { firefox } = require('playwright');

(async () => {
  const browser = await firefox.launch({
    headless: false,
    args: [
      '--width=1280',
      '--height=720',
    ],
  });
  const context = await browser.newContext();
  const page = await context.newPage();
  page.setDefaultTimeout(60000);
  // fire async navigation
  await page.goto('https://www.nike.com/', {
    waitUntil: 'load',
  });
  console.log('Done');
  await page.close();
  await browser.close();
})();

The script opens the nike.com URL but gets stuck on goto. There are no additional requests in the devtools. Visually the page stops loading and the website looks correct (with all CSS/JS), but goto never finishes.

DEBUG=pw:api node index.js
 pw:api => browserType.launch started +0ms
  pw:api <= browserType.launch succeeded +4s
  pw:api => browser.newContext started +1ms
  pw:api <= browser.newContext succeeded +23ms
  pw:api => browserContext.newPage started +1ms
  pw:api   navigated to "about:blank" +619ms
  pw:api   navigated to "about:blank" +38ms
  pw:api <= browserContext.newPage succeeded +6ms
  pw:api => page.goto started +1ms
  pw:api navigating to "https://www.nike.com/", waiting until "load" +6ms
  pw:api   "commit" event fired +862ms
  pw:api   navigated to "https://www.nike.com/il/" +0ms
  pw:api   "commit" event fired +2s
  pw:api   navigated to "https://unite.nike.com/session.html?appVersion=912&experienceVersion=912" +0ms
  pw:api   "commit" event fired +107ms
  pw:api   navigated to "https://api.nike.com/149e9513-01fa-4fb0-aad4-566afd725d1b/2d206a39-8ed7-437e-a3be-862e0f06eea3/fp" +0ms
  pw:api   "commit" event fired +23ms
  pw:api   navigated to "https://unite.nike.com/149e9513-01fa-4fb0-aad4-566afd725d1b/2d206a39-8ed7-437e-a3be-862e0f06eea3/fp" +1ms
  pw:api   "commit" event fired +5s
  pw:api   navigated to "https://www.googletagmanager.com/ns.html?id=GTM-NTF2X45" +0ms
  pw:api   "commit" event fired +828ms
  pw:api   navigated to "about:blank" +0ms

  pw:api   "commit" event fired +17ms
  pw:api   navigated to "about:blank" +1ms

  pw:api   "commit" event fired +51ms
  pw:api   navigated to "https://bid.g.doubleclick.net/xbbe/pixel?d=KAE" +0ms
  pw:api   "commit" event fired +29ms
  pw:api   navigated to "https://tr.snapchat.com/cm/i?pid=20634084-7fa1-4691-8aec-fefe00263e00" +0ms
  pw:api   "commit" event fired +406ms
  pw:api   navigated to "https://www.pinterest.com/ct.html" +0ms
  pw:api <= page.goto failed +49s
node:internal/process/promises:246
          triggerUncaughtException(err, true /* fromPromise */);
          ^

page.goto: Timeout 60000ms exceeded.
=========================== logs ===========================
navigating to "https://www.nike.com/", waiting until "load"
============================================================
    at /Users/navin/Downloads/goto/index.js:19:14 {
  name: 'TimeoutError'
}

Same behavior with waitUntil: 'domcontentloaded' or any other value.
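One way to narrow down what a hung navigation is waiting on is to record which requests have started but never finished. The following is only a sketch: `trackPendingRequests` and `gotoWithDiagnostics` are hypothetical helper names, not Playwright APIs, and they assume the page's `request`, `requestfinished` and `requestfailed` events are enough to describe what is still in flight.

```javascript
// Hypothetical helper: remember requests that started but have not
// finished or failed yet, using the page's request lifecycle events.
function trackPendingRequests(page) {
  const pending = new Set();
  page.on('request', (request) => pending.add(request));
  page.on('requestfinished', (request) => pending.delete(request));
  page.on('requestfailed', (request) => pending.delete(request));
  // Returns the URLs that are still in flight at the time of the call.
  return () => [...pending].map((request) => request.url());
}

// Hypothetical wrapper: navigate as usual, and if page.goto rejects,
// print the in-flight URLs before rethrowing the original error.
async function gotoWithDiagnostics(page, url, options = {}) {
  const pendingUrls = trackPendingRequests(page);
  try {
    return await page.goto(url, options);
  } catch (error) {
    console.error('page.goto failed; requests still in flight:', pendingUrls());
    throw error;
  }
}
```

In the script above this would be called as `await gotoWithDiagnostics(page, 'https://www.nike.com/', { waitUntil: 'load' })`. If the printed list is empty, the hang is probably not caused by a pending network request.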

dgozman commented 2 years ago

I can repro, and only in Firefox.

soloyal commented 2 years ago

Ok, but how to solve it?

krutoo commented 2 years ago

same problem

pepehandsjpg commented 2 years ago

i'm also having the same problem (firefox)

dgattey commented 2 years ago

I'm also getting this on Webkit + Firefox. Here's a run that exhibited this problem: https://github.com/shopcanal/e2e-tests/runs/5682749027?check_suite_focus=true

  pw:api <= browserContext.clearCookies succeeded +1ms
  pw:api => page.goto started +1ms
  pw:api navigating to "https://develop.shopcanal.com/login", waiting until "load" +1ms
  pw:api   "commit" event fired +219ms
  pw:api   navigated to "https://develop.shopcanal.com/login" +0ms
  pw:api   navigated to "https://develop.shopcanal.com/login" +329ms
  pw:api   navigated to "https://develop.shopcanal.com/login" +193ms
  pw:api   "domcontentloaded" event fired +21ms
  pw:api => browserContext.close started +59s
  pw:api <= page.goto failed +8ms
  pw:api <= browserContext.close succeeded +1ms

tillschander commented 2 years ago

I have the same problem with Chromium on Ubuntu, so it's not just a Firefox-only issue.

Every time I run my tests, one or two of them might fail randomly with the same 'waiting until "load"' error:

page.waitForNavigation: Navigation failed because page was closed!
=========================== logs ===========================
waiting for navigation until "load"
============================================================
page.goto: Navigation failed because page was closed!
=========================== logs ===========================
navigating to "http://localhost:3000/some/url", waiting until "load"
============================================================

The corresponding code is nothing fancy. Here are some examples:

await Promise.all([
    page.waitForNavigation({ url: 'login' }),
    page.goto('login')
]);
await page.goto('search');
await expect(page.locator('h1')).toContainText('Search');
await page.goto('product/123456');
await expect(page.locator('h1')).toContainText('Product Title');

liuxingbaoyu commented 2 years ago

Fixed in #13340

zhanglei4333 commented 2 years ago

Hello, is this problem solved now? Can I download the latest package to solve it, or should I modify my case according to liuxingbaoyu's method?

jatin-karla11 commented 2 years ago

@tillschander we are also facing this issue in our Jenkins pipeline jobs, and again it is totally random: any test can hit the error at any time. Out of 50 test cases, 2-3 fail due to this (with 2 workers), and the count keeps increasing as I gradually increase the number of workers (in all browsers). When we run the tests locally, everything works fine. Is there any solution for this issue? It is impacting our daily automation build results. @zhanglei4333 @liuxingbaoyu @aslushnikov

ajayEngProd commented 2 years ago

Hello, we are also facing the same issue, with timeouts happening randomly. We are running close to 60k tests across multiple machines in 20 minutes. The problem happens when tests run in a pipeline, not when they run locally.

OS: Windows; Playwright version: 1.13.0; .NET version: 5 and 6

Any help in this regard would be very useful. @zhanglei4333 @liuxingbaoyu @tillschander

daddyman commented 2 years ago

We are getting the same behavior of random failures. We have ~1500 tests that run on a single worker. We are running under Jenkins using the Docker image.

What we see is one or two tests on almost every test run fail with a message similar to:

Timeout of 60000ms exceeded.

page.waitForNavigation: Navigation failed because page was closed!
=========================== logs ===========================
waiting for navigation until "load"
  navigated to "http://172.24.142.157/designer/"
============================================================

It is at the point where the test navigates to a page and does waitForNavigation.

I don't have a repro because of the random nature of the failures and it doesn't happen on the same page each time.

sahil1610 commented 2 years ago

Hello,

I am also getting a similar issue with https://app.vwo.com on Chromium, Firefox and WebKit. After login, the page is not redirected to the dashboard, and in the network panel there is no call after the login POST call, which returns 200.

Steps:

  1. Land on https://app.vwo.com
  2. Enter the username and password and wait for redirection to the dashboard page. (For a username and password, you can simply create a free trial account on https://vwo.com.)

Environment:

  1. MacOS
  2. Playwright Version 1.22
  3. Browser - Chromium, Firefox and Webkit
  4. Language binding - Javascript+Typescript

On Chromium, Firefox, and Safari the issue is intermittent, while on the local Selenium 4 grid it's consistent.

daddyman commented 2 years ago

I turned on DEBUG=pw:api and the failure happens much less often, but it did happen a couple of times. Both times, a silent refresh of the access token occurred while a button was being clicked to navigate to a new page (using Promise.all with page.waitForNavigation). Could this have something to do with the failure?

The attached file has the log output from the test that failed. Near the bottom is a comment "FAILURE STARTS HERE". The page.waitForNavigation doesn't complete. In another run where this test succeeds, without the token refresh in the middle, there is an extra log line mentioned in the file that says the page.waitForNavigation succeeded.

silent-refresh.txt

daddyman commented 2 years ago

I changed my tests to use DEBUG=pw:channel* and caught a test that succeeded once and failed once. In that test there is a page already loaded with a list page when the test starts. During the test:

In the failure case while the breadcrumb text is being verified the refresh token timer fires and the token renewal process starts (same as the previous test I posted about). This seems to cause problems in the last step of navigating back to the list page.

These two files are a successful run of the test and the failed run of the test. I commented the progress of the test. mappingdata-fail.txt mappingdata-success.txt

Edit: This is on Playwright 1.22.2 running headless under Docker with Jenkins, using Chromium.

daddyman commented 2 years ago

I opened a new issue specific to my case #15084 (a token refresh occurring during a test would add and remove an iframe which caused problems in playwright).

There is a pull request #15812 that fixed my problem when I manually applied it.

KayakinKoder commented 2 years ago

We were experiencing the same problem, only in our CI pipelines (tests working locally, failing randomly in pipelines). The following in our playwright config has seemed to fix the problem for us:

// limit the number of workers running. start with 1, and if your tests pass reliably, consider increasing to
// speed up runs. the new % option may be a good choice, e.g.  workers: '50%'
workers: 1,

use: {
    // set this to a large number, to account for pages occasionally loading more slowly in your pipeline
    // than they do locally
    actionTimeout: 12000,
},

Based on the comments above and our experience, my thought is that the vm that's running our tests in ci pipeline just gets a bit overwhelmed in terms of resources. Tons of workers spawning new tests, all running at once, and one just hangs up a bit because ram and cpu are scarce.

An additional source of this problem could be the fact locally, we're running tests on localhost; in our pipeline, we're running tests on a real staging website. Websites are fickle, and pages can sometimes load slowly. Increasing the actionTimeout may help with that.

If you're still getting failures, you could consider allowing retries. In an ideal world, tests that should pass would pass 100% of the time. In the real world, even if you're running your webapp within a container in your ci pipeline and testing against that (rather than running your tests against a real website), sometimes a page just doesn't load correctly.

retries: 1

ChristianUlbrich commented 1 year ago

Just to chime in with some insight into why your tests might fail on CI and seem flaky (or why you might be blaming Playwright, for that matter): we use pure ESM modules. If all of your dependencies are native ESM and optimized for tree-shaking (meaning every tiny function is exposed as an ESM module) and you do not bundle anything at all, this easily adds up to a huge number of requests.

In our scenario, where we "just" use 5 components from our UI library, this amounts to 150+ requests. Browsers limit the number of parallel HTTP requests, and if HTTPS is involved, this is not going to get any faster... So what I am saying is that even my beefy (sorry veggies!) MacBook takes roughly a second to load the page. We limited the Playwright timeouts to 4s, and it seems that hosted Azure DevOps macOS agents have pretty bad I/O performance (read: slower by a factor of 5-10). For us this explains the timeouts: the load event is not triggered in time, because ESM modules need to be "resolved" before actual DOM processing can happen.

We might think about using bundles for our tests; but for the time being, we solved this by having a platform-dependent global timeout set for Playwright.
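A platform-dependent timeout could look roughly like the sketch below. The helper name `scaledTimeout` and every number in it are assumptions chosen for illustration (the factor of 5 for CI macOS agents only echoes the "5-10 times slower" estimate above). `timeout`, `workers`, `retries`, `use.actionTimeout` and `use.navigationTimeout` are regular Playwright Test configuration options.

```javascript
// Assumption: the CI system sets the CI environment variable.
const isCI = Boolean(process.env.CI);

// Hypothetical helper: stretch a base timeout on slower environments.
// The factors are illustrative guesses, not measured values.
function scaledTimeout(baseMs, platform, ci) {
  if (!ci) return baseMs;
  const factor = platform === 'darwin' ? 5 : 2;
  return baseMs * factor;
}

const config = {
  // Per-test timeout, stretched on CI.
  timeout: scaledTimeout(30000, process.platform, isCI),
  // One worker and one retry on CI; Playwright defaults locally.
  workers: isCI ? 1 : undefined,
  retries: isCI ? 1 : 0,
  use: {
    actionTimeout: 12000,
    // Applies to page.goto and other navigation waits.
    navigationTimeout: scaledTimeout(15000, process.platform, isCI),
  },
};

// In playwright.config.js this object would be exported,
// for example with `module.exports = config;`.
```

Keeping the scaling in one function makes it easy to tune per agent type once real load times have been measured.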

lavcraft commented 1 year ago

Same issue in chromium with fresh playwright version 1.31.2 :( Also, it's a little bit flaky

swoopgrandchamp commented 1 year ago

Same issue with 1.32.3: out of 7 tests, one random test always has this problem.

auxsvr commented 1 year ago

This may be caused by https://github.com/jfhbrook/pyee/issues/120. I've seen many cases of lost events with playwright and fixed all of them by patching pyee to keep a reference to the futures corresponding to the events, otherwise they get garbage-collected.

rob4629 commented 1 year ago

Similar to others, I'm seeing this issue on the latest (1.33.x) version. It's only happening in CircleCI runs, but it consistently fails during the page.goto step for the Safari (WebKit) browser, and inconsistently on Chrome. We have a timeout of 60s to try to combat performance issues.

Also, like others have seen, the screenshots show the page as loaded... so it's just the load event that it fails to find.

Again, no issues running locally.

saranya-krish commented 1 year ago

Facing similar issue.

Intermittently tests fail only in pipeline while calling page.goto

Using Playwright version 1.31.0.

How to resolve this?

jalexakos commented 1 year ago

Facing this issue as well - the tests timeout for some users on their local machine, but not others. I'm curious to see what's going on. It seems like the Playwright trace viewer no longer displays how long network calls took - that would be a great feature to bring back.

EugeneMac commented 1 year ago

Similar issue when running on a TeamCity agent with Playwright 1.31.1. It happens mostly in the page.WaitForURLAsync() method after clicking the Login button and redirecting to the homepage (timeout exception: waiting for navigation to "***" until "Load"). It seems like the redirect mechanism somehow affects the current page. As a result, the target page is loaded, but the WaitForURLAsync method fails...

fayasfb commented 1 year ago

same issue

Error: page.waitForLoadState: Navigation failed because page was closed!
=========================== logs ===========================
"domcontentloaded" event fired
"networkidle" event fired

Please give a solution. It works locally, but I get this error in GitHub Actions.

methaqualon commented 1 year ago

From my logs I noticed that on my VPS the page fires the 'load' event only after a long delay. Until there is a fix, I'm using this hack: in node_modules/playwright-core/lib/client/frame.js, change the default value of waitUntil in the goto function to one of the other accepted values ('commit' in my case), and follow the navigation with waitForTimeout and a fixed value.

In my case there is a client-side redirect. The first page fires the 'load' event quickly, but the page I am redirected to is heavy, with some lazy-loaded content, so I get the "Timeout exceeded" error in 80-85% of cases. This is not a fix at all, but it can help people avoid wasting money while the team is working on it. I hope the team can fix it quickly and easily. Update: things like page.waitForURL() can also help in my situation.
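The same idea can be tried without editing files in node_modules, since waitUntil can be passed per call. This is a sketch under assumptions: `gotoThroughRedirect` is a hypothetical helper name, and it assumes the final URL after the client-side redirect is known in advance (as a string, glob or RegExp accepted by page.waitForURL).

```javascript
// Hypothetical helper: return from goto as soon as the navigation is
// committed, then wait until the redirect has landed on the final URL.
async function gotoThroughRedirect(page, startUrl, finalUrl, timeoutMs = 30000) {
  // 'commit' resolves once the response is received and the document starts loading.
  await page.goto(startUrl, { waitUntil: 'commit', timeout: timeoutMs });
  // Wait for the redirected page, but only until its DOM is parsed,
  // so lazy-loaded resources do not hold the test up.
  await page.waitForURL(finalUrl, { waitUntil: 'domcontentloaded', timeout: timeoutMs });
  return page.url();
}
```

Unlike a fixed waitForTimeout, this returns as soon as the target URL is reached and still fails with a timeout if the redirect never happens.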

4ayushsinghal commented 1 year ago

I am also facing the same issue when I run on Linux with Chromium. I am trying to use page.setContent with a new page in a browser context. Here is the corresponding code:

context = await browser.newContext()
const renderedPage = await context.newPage()
const frame = await (await this.page.$('iframe')).contentFrame()
await renderedPage.setContent(await frame.content())
await expect(renderedPage).toHaveScreenshot('pa-template', { fullPage: true })

I get the below error:

Error: page.setContent: Navigation failed because page was closed!
=========================== logs ===========================
setting frame content, waiting until "load"
============================================================

 129 |       expect(await paEmail.subject).toBe(actualSubject)
 130 |       const renderedPage = await context.newPage()
> 131 |       await renderedPage.setContent(await paEmail.body)
    |                          ^
 132 |       await expect(renderedPage).toHaveScreenshot('pa-email', { fullPage: true })
 133 |     })
 134 |   })

   at /home/runner/work/test-automation/test-automation/tests/nextboard/communications.spec.js:131:26
   at /home/runner/work/test-automation/test-automation/tests/nextboard/communications.spec.js:120:5

jfuinsure commented 1 year ago

Also intermittently seeing NS_ERROR_NET_TIMEOUT with Firefox and WebKit (Desktop Safari) when trying to navigate to my test site:

await this.page.goto(this.site.testUrl);

tomdelahaba commented 1 year ago

I have very little to add to this topic, but MAYBE (hopefully?) it will help someone.

We faced very similar issues just now. I was pretty clueless about why we got such timeouts: everything worked locally, but in our environment we hit random timeouts as well. Sometimes it worked, sometimes it did not. It was very weird.

Well, TL;DR: most providers (mainly cloud ones) have some kind of rate-limiting rules on the number of queries. If Playwright hits this limit (let's say 100 queries per minute), the server will usually serve you a 403 or some other page (to avoid DDoS). In that situation, Playwright will be stuck on goto('') or on any assertion, because that assertion won't be truthy. In our particular use case, Playwright was not rendering anything because of our 403 page...

So keep in mind that such server infrastructure / logic can (potentially) be the problem as well, especially if you do not face these problems locally.
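To make a rate-limit page show up as an explicit failure rather than a later timeout, the status of the main response can be checked right after navigation. This is a sketch only: `gotoExpectingOk` is a hypothetical helper name, and it relies on page.goto resolving (not throwing) when the server answers with a valid HTTP status such as 403.

```javascript
// Hypothetical helper: navigate and fail fast on a non-2xx main response.
async function gotoExpectingOk(page, url, options = {}) {
  const response = await page.goto(url, options);
  // page.goto resolves to null for about:blank and same-URL hash navigations,
  // so only check the status when a response object is present.
  if (response !== null && !response.ok()) {
    throw new Error(`Navigation to ${url} returned HTTP ${response.status()}`);
  }
  return response;
}
```

With this in place, a 403 from the hosting provider produces an error message containing the status code, which is easier to tell apart from a genuine load timeout.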

tgrisley commented 10 months ago

I have also been having this issue in CircleCI: tests intermittently get stuck on goto() and then the test times out. Upping the test timeout does not help. If I check the screenshots / trace view, I can see that the page has loaded correctly.

Using provided MS docker image but have reduced the workers from 2 to 1, still no difference.

    docker:
      - image: mcr.microsoft.com/playwright:v1.40.0-jammy

I can put forward a simple repro if that would help as this still seems to be an issue

benskz commented 9 months ago

Not a fix by any means but if anyone needs an emergency hacky workaround this might help:

try {
  await page.goto(url, {
    timeout: 5000,
  });
} catch {
  // sometimes goto times out, so try again
  console.log('page.goto timed out, trying again');
  await page.goto(url);
}
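
The same workaround can be written as a small reusable function. This is a sketch, not a fix: `gotoWithRetry` is a hypothetical helper name, the defaults of 2 attempts and 5000 ms are arbitrary, and it assumes the rejection carries `name: 'TimeoutError'` as in the log at the top of this issue. Unlike the snippet above, every attempt uses the same short timeout.

```javascript
// Hypothetical helper: retry page.goto a limited number of times,
// but only when the failure is a navigation timeout.
async function gotoWithRetry(page, url, { attempts = 2, timeout = 5000 } = {}) {
  let lastError;
  for (let attempt = 1; attempt <= attempts; attempt += 1) {
    try {
      return await page.goto(url, { timeout });
    } catch (error) {
      // Anything other than a timeout (DNS failure, closed page) is rethrown at once.
      if (error.name !== 'TimeoutError') throw error;
      lastError = error;
      console.log(`page.goto timed out (attempt ${attempt} of ${attempts})`);
    }
  }
  // All attempts timed out: surface the last timeout error.
  throw lastError;
}
```

A call such as `await gotoWithRetry(page, url, { attempts: 3 })` keeps the retry policy in one place instead of repeating try/catch blocks in every test.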

gantispam commented 9 months ago

Same issue

GerasimovDG commented 8 months ago

UP! We have the same issue. Everything works well locally, but we get 'navigating to [url] waiting until "load"' in our CI

fredericoo commented 8 months ago

Having this issue too, out of nowhere. Fixed by adding { waitUntil: 'domcontentloaded' } to the params of the goto call. It seems it was just waiting forever. I will look into it more deeply, but for now I just needed something working.
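Because 'domcontentloaded' can fire before the page is actually usable, it may help to pair it with a wait for an element that signals readiness. A minimal sketch, assuming such an element exists: `gotoAndWaitFor` and the `readySelector` parameter are hypothetical names, while page.locator and locator.waitFor are regular Playwright calls.

```javascript
// Hypothetical helper: stop waiting for the load event and wait for a
// specific element to become visible instead.
async function gotoAndWaitFor(page, url, readySelector, timeoutMs = 30000) {
  await page.goto(url, { waitUntil: 'domcontentloaded', timeout: timeoutMs });
  // Resolves once the element matching readySelector is visible, or rejects on timeout.
  await page.locator(readySelector).waitFor({ state: 'visible', timeout: timeoutMs });
}
```

For example, `await gotoAndWaitFor(page, 'search', 'h1')` would wait for the heading that the tests earlier in this thread assert on.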

margarytaD commented 8 months ago

I also see this issue from time to time in all browsers.

tokyo-watcher commented 7 months ago

In the case of Firefox, isn't there a problem where it doesn't accept test instructions correctly if the window is not in the forefront?

oscarm081297 commented 7 months ago

This is also happening to me on GitLab CI. Tests were working last week, and suddenly I am getting timeouts with page.goto(). Locally everything works fine. I've upgraded and also downgraded the Playwright version, but the issue persists. I've enabled debug mode, and this is what I am getting:

navigating to ".......", waiting until "load" +2ms
  pw:api   "commit" event fired +1s
  pw:api   navigated to "..........." +0ms
  pw:api   "domcontentloaded" event fired +1s
  pw:api   navigated to "about:blank" +1s
  pw:api   navigated to "about:blank" +2ms
  pw:api   "commit" event fired +305ms
  pw:api   navigated to ".........." +0ms
  pw:api   "domcontentloaded" event fired +696ms
  pw:api   navigated to "about:blank" +1s
  pw:api   navigated to "about:blank" +34ms
  pw:api <= page.goto failed +2s

I can see in the video recording that the test was able to open the page (the page loads properly), but for some reason it times out when the test is executed on CI. I've also tried placing page.goto() in different places (beforeEach, test level, etc.) and it always times out. I checked the traces, but everything looks good. This is becoming a blocker since we cannot execute any tests on CI: every single test crashes because of the timeout. I tried increasing it, but that did not work either.

Any insights?

StaRenn commented 7 months ago

This also happens when you run tests in a Docker container with the dev server running on the host machine, using --network="host" and host.docker.internal as the URL. I had to set the timeout to 300s, which is not good. Browser: Chromium.

xiaoxiaocaiiao commented 6 months ago

Playwright 1.33.0 with Chrome still has this issue.

dtokarczyk commented 6 months ago

I noticed that it happens only in tests run in a docker container and only if I run a lot of tests. Interestingly Linux users do not report issues, only mac users.

jameslporter commented 5 months ago

Can confirm this issue affects Linux users. I was setting up a Docker container and trying Playwright for the first time. Insanely frustrating. I finally figured out that I'm not a total moron and that this is simply a significant regression in Playwright itself. I know npx is all the rage these days, but it leads to users running the latest version. I was able to get my tests working by using npx playwright@1.43.1 install and then npx playwright@1.43.1 test, after also pinning the package.json dependency to version 1.43.1 specifically.

kuba-orlik commented 5 months ago

In my case, updating Playwright to the latest version solved the issue, FYI.

LuvDeluxe commented 4 months ago

This is still present when executed in Bitbucket CI/CD... Locally I am unable to reproduce...

dtokarczyk commented 4 months ago

I noticed that docker system prune --volumes sometimes solves the problem.

JorensM commented 3 weeks ago

I think this happens because of redirects. Does anyone experiencing this issue have redirects by any chance?

lucien-theron commented 2 weeks ago

I also have this issue using Chrome and Ubuntu (also using official container). The video shows the page freeze or hang a second after opening. This only happens on our Ubuntu env tests.

vladide commented 2 weeks ago

I have performed some tests on local vs local docker vs k8s pod

Hence I recommend that those who see random timeouts (Test timeout of 30000ms exceeded) check their resource usage.

JorensM commented 2 weeks ago

This happens for me locally on Windows, by the way, though less often in UI mode. It also happens in GitLab CI with Linux.

lucien-theron commented 1 week ago

I now also get these locally on Windows but a lot more rarely