grafana / xk6-browser

k6 extension that adds support for browser automation and end-to-end web testing via the Chrome Devtools Protocol
https://grafana.com/docs/k6/latest/javascript-api/k6-experimental/browser/
GNU Affero General Public License v3.0
325 stars 41 forks source link

Dispose panic after clicking on an element inside another frame #1089

Open inancgumus opened 8 months ago

inancgumus commented 8 months ago

Brief summary

We receive a panic after clicking on an element inside another frame.

Early investigation result

The issue seems to be happening because we dispose of the main document after closing the browser with Runtime.releaseObject.

xk6-browser version

v1.1.0

Steps to reproduce the problem

  1. Turn the headful browser option on.
  2. Run the following script.
export const options = {
  scenarios: {
    ui: {
      // mandatory fields; executor, options-->browser-->type
      executor: "shared-iterations",
      vus: 1,
      iterations: 1,
      options: {
        browser: {
          type: "chromium",
        },
      },
    },
  },
};

export default async function () {
  const page = browser.newPage();

  try {
    await page.goto("https://www.directline.com/my-dashboard#/sign-in", {
      waitUntil: "networkidle",
    });

    await Promise.all([
      page.locator('button[id="onetrust-accept-btn-handler"]').click(),
    ]);

    let frames = await page.frames();

    for (const frame of frames) {
      if (frame.title() === "reCAPTCHA" && frame.name().startsWith("a")) {
        const spanElement = frame.locator("#recaptcha-anchor");
        spanElement.focus();
        spanElement.dispatchEvent("click");
      }
    }
  } finally {
    page.close();
  }
}

Expected behaviour

No panic.

Actual behaviour

Panics with:

panic: GoError: dispose: canceled

goroutine 100 [running]:
go.k6.io/k6/js/common.Throw(...)
    go.k6.io/k6@v0.45.1-0.20230804150043-5b5c5ccbb869/js/common/util.go:20
github.com/grafana/xk6-browser/k6ext.Panic.func1(0x101441de0?, {0x14002b0a9a0?, 0x101426601?, 0x14001719300?})
    github.com/grafana/xk6-browser@v0.10.0/k6ext/panic.go:35 +0x7c
github.com/grafana/xk6-browser/k6ext.sharedPanic({0x10167dbf8, 0x14001109540}, 0x14002663d30, {0x14002b0a9a0?, 0x1, 0x1})
    github.com/grafana/xk6-browser@v0.10.0/k6ext/panic.go:64 +0x1fc
github.com/grafana/xk6-browser/k6ext.Panic({0x10167dbf8?, 0x14001109540?}, {0x1010419e6?, 0x100957dc8?}, {0x14002b0a9a0?, 0x100958f60?, 0x14002362068?})
    github.com/grafana/xk6-browser@v0.10.0/k6ext/panic.go:37 +0x58
github.com/grafana/xk6-browser/common.(*BaseJSHandle).Dispose(0x14000713f40)
    github.com/grafana/xk6-browser@v0.10.0/common/js_handle.go:70 +0xa0
github.com/grafana/xk6-browser/common.(*Frame).detach(0x140018701c0)
    github.com/grafana/xk6-browser@v0.10.0/common/frame.go:196 +0x74
github.com/grafana/xk6-browser/common.(*FrameManager).removeFramesRecursively(0x14002106000, 0x14001666080?)
    github.com/grafana/xk6-browser@v0.10.0/common/frame_manager.go:415 +0xd8
github.com/grafana/xk6-browser/common.(*FrameManager).frameDetached(0x14002106000, {0x14001666080?, 0x3?}, {0x101039af3, 0x6})
    github.com/grafana/xk6-browser@v0.10.0/common/frame_manager.go:186 +0x8c
github.com/grafana/xk6-browser/common.(*FrameSession).onFrameDetached(0x1400197e160, {0x14001666080, 0x20}, {0x101039af3, 0x6})
    github.com/grafana/xk6-browser@v0.10.0/common/frame_session.go:710 +0x5c
github.com/grafana/xk6-browser/common.(*FrameSession).initEvents.func1()
    github.com/grafana/xk6-browser@v0.10.0/common/frame_session.go:237 +0x4a4
created by github.com/grafana/xk6-browser/common.(*FrameSession).initEvents in goroutine 42
    github.com/grafana/xk6-browser@v0.10.0/common/frame_session.go:211 +0x104
### Tasks
- [ ] https://github.com/grafana/xk6-browser/pull/1261
- [ ] https://github.com/grafana/k6/pull/3709
ankur22 commented 2 months ago

I have a good idea what is happening, and I believe iframes play a part.

While testing the fix for https://github.com/grafana/xk6-browser/issues/1252 with https://github.com/grafana/xk6-browser/pull/1254 I noticed an increase in panic: GoError: dispose: canceled errors, which is the same error that we get with the test script that is attached to this current issue.

In my investigation I've sprinkled some fmt.Println commands as well as converting debug commands to info to help debug the issue. You can find the changes in https://github.com/grafana/xk6-browser/pull/1261.

When running the script in this issue with the branch provided at the commit before the fix (2f5ea59fa1003837f661aad9cdb1cd820ddb2c3c), you will see the following logs (redacted most of the logs and only showcasing the ones that are valuable to demonstrate the issue):

>>>>>>: iteration has started
INFO[0000] sid:B0A623E49AD83DDF962ABA3DB5B0F985 tid:9D9E1DBAB974AFC59F5242441C6D9217 method:"Network.enable"  category="Session:Execute" elapsed="0 ms" source=browser
...
// Redacted logs which don't help explain the issue
...
>>>>>>: page has closed
>>>>>>: ctx.Done closed VUs, 0 complete and 0 interrupted iterations
>>>>>>: iteration has ended-------------------] 1 VUs  00m03.1s/10m0s  0/1 shared iters
INFO[0003] sid:B0A623E49AD83DDF962ABA3DB5B0F985 tid:9D9E1DBAB974AFC59F5242441C6D9217 fid:4413B90E08F5D5F0F3363F0CF406C867 reason:remove  category="FrameSession:onFrameDetached" elapsed="2 ms" source=browser
INFO[0003] sid:B0A623E49AD83DDF962ABA3DB5B0F985 tid:9D9E1DBAB974AFC59F5242441C6D9217 fid:74F0A96ABF60DA5402A2F4FF05F721FE reason:remove  category="FrameSession:onFrameDetached" elapsed="1 ms" source=browser
>>>>>>: BaseJSHandler Dispose
INFO[0003] sid:D3FF5021B54A30A306FBB0C6377CE673 tid:74F0A96ABF60DA5402A2F4FF05F721FE method:"Runtime.releaseObject"  category="Session:Execute" elapsed="0 ms" source=browser
INFO[0003] sid:D3FF5021B54A30A306FBB0C6377CE673 tid:74F0A96ABF60DA5402A2F4FF05F721FE method:"Runtime.releaseObject"  category="Session:Execute:s.conn.send" elapsed="0 ms" source=browser
INFO[0003] sid:D3FF5021B54A30A306FBB0C6377CE673 tid: wsURL:"ws://127.0.0.1:60490/devtools/browser/8186b7f3-d051-4cef-b414-36c1237a2a3f" err:context canceled  category="Connection:send:<-ctx.Done()" elapsed="0 ms" source=browser
>>>>>>: BaseJSHandler Dispose error disposing element with ID -6869963927014979433.1.1: context canceled
panic: GoError: dispose: canceledmplete and 0 interrupted iterations
ui     [--------------------------------------] 1 VUs  00m03.1s/10m0s  0/1 shared iters
goroutine 119 [running]:
go.k6.io/k6/js/common.Throw(...)
        go.k6.io/k6@v0.49.1-0.20240213133040-b5a6febd5638/js/common/util.go:20
github.com/grafana/xk6-browser/k6ext.Panic.func1(0x1400028d408, {0x140023915c0?, 0x1036e0301?, 0x14001888100?})
        github.com/grafana/xk6-browser@v1.3.0/k6ext/panic.go:35 +0x7c
github.com/grafana/xk6-browser/k6ext.sharedPanic({0x1039df188, 0x14000506230}, 0x140003a9a10, {0x140023915c0, 0x1, 0x1})
        github.com/grafana/xk6-browser@v1.3.0/k6ext/panic.go:64 +0x20c
github.com/grafana/xk6-browser/k6ext.Panic({0x1039df188?, 0x14000506230?}, {0x10326ff46?, 0x2?}, {0x140023915c0?, 0x102ccd7e0?, 0x140000ff568?})
        github.com/grafana/xk6-browser@v1.3.0/k6ext/panic.go:37 +0x58
github.com/grafana/xk6-browser/common.(*BaseJSHandle).Dispose(0x1400096a7d0)

The key points are:

  1. The iteration starts;
  2. Things happen that are expected for the test;
  3. Test completes and the page is closed (>>>>>>: page has closed);
  4. k6 ends the iteration since the test has come to the end (>>>>>>: iteration has ended);
  5. Chrome is still sending messages back to the browser over the open session (FrameSession:onFrameDetached, >>>>>>: BaseJSHandler Dispose);
  6. The browser module tries to dispose of the objects via CDP ("Runtime.releaseObject");
  7. Fails to release the object since the context has canceled (due to the iteration ending) (>>>>>>: BaseJSHandler Dispose error disposing element with ID -6869963927014979433.1.1: context canceled);
  8. panic occurs.

Looking at these steps it feels to me that it is clear that the FrameSession:onFrameDetached is coming in too late for this particular case, and it would be better if that came in before the iteration ended.

The two options we have to resolve this issue are:

  1. Somehow delay the iteration ending to prevent the error/panic.
  2. Since the iteration has ended, the page, browserContext and (for now) the Chrome instance are all closed, and cleaned up. So there's no need to try and release the object, or at least there's no point in panicking for this case.

I'm not entirely sure of the iframes lifecycle, but there is enough of a delay in closing the page and receiving the detached frame messages from Chrome.

My preference would be to check for the context closed error in dispose (such as in beba74e3e7abe4a737295deac64cde85e38ec7a7) and ignore those types of errors. When this fix is in place, the test ends with no errors or panics.

Another question is why does the module even perform a releaseObject CDP action when a frame is detached? Looking at the Playwright code, I don't see the releaseObject being called when a frame is detached onFrameDetached.

inancgumus commented 2 months ago

Thanks for the investigation, @ankur22 đŸ™‡ I think the first solution is better since the iteration shouldn't end before all the resources are cleaned up (the context canceled error shouldn't reach to the disposal stage (ideally!). However, the second solution (the current one) is good enough, as there is no reason to panic after the context is canceled. So, I'm happy with the second approach.

inancgumus commented 2 months ago

@ankur22

Another question is why does the module even perform a releaseObject CDP action when a frame is detached? Looking at the Playwright code, I don't see the releaseObject being called when a frame is detached onFrameDetached.

Playwright has an involved handling of handle disposition. It might give us a better idea if we could run a Playwright script and watch the CDP messages sent by Playwright to see if Playwright sends a ReleaseObject CDP message to the browser.

ankur22 commented 2 months ago

Playwright has an involved handling of handle disposition. It might give us a better idea if we could run a Playwright script and watch the CDP messages sent by Playwright to see if Playwright sends a ReleaseObject CDP message to the browser.

I've looked at the code, and can't see it happen, i can double check with some logs of the CDP commands though.

Any idea why we need this in the first place?

inancgumus commented 2 months ago

Any idea why we need this in the first place?

Yep. That's weird because when a frame navigates, the execution context drops, and the browser should automatically release the handles within the execution context. We could create a web page that creates millions of objects and then navigates (detaches the frame) to see if it impacts the browser memory used. Let me know if I can help.