Open imiric opened 3 years ago
I tried to add a test for the 1001 (going away)
status. Let's keep it here only for reference.
func TestConnectionGoingAway(t *testing.T) {
const wantErr = "websocket: close 1001" // going away
done := make(chan struct{})
server := testutils.NewWSTestServer(t,
"/going-away",
http.HandlerFunc(func(w http.ResponseWriter, req *http.Request) {
conn, err := (&websocket.Upgrader{}).Upgrade(w, req, w.Header())
require.NoError(t, err)
conn.WriteControl(
websocket.CloseMessage,
websocket.FormatCloseMessage(websocket.CloseGoingAway, ""),
time.Now().Add(time.Second),
)
<-done
defer func() {
done <- struct{}{}
}()
var any interface{}
err = conn.ReadJSON(any)
require.ErrorContains(t, err, wantErr)
}))
defer server.ServerHTTP.Close()
url, _ := url.Parse(server.ServerHTTP.URL)
wsURL := fmt.Sprintf("ws://%s/going-away", url.Host)
ctx := context.Background()
conn, err := NewConnection(
ctx,
wsURL,
NewLogger(ctx, NullLogger(), false, nil),
)
require.NoError(t, err)
// close the connection before the client has a chance to read
// from it. this will result in a "going away" error.
conn.Close()
done <- struct{}{}
action := target.SetDiscoverTargets(true)
err = action.Do(cdp.WithExecutor(ctx, conn))
require.ErrorContains(t, err, wantErr)
<-done
}
This is what I saw after running the test for half an hour on my Mac:
--- FAIL: TestConnectionClosureAbnormal (0.10s)
--- FAIL: TestConnectionClosureAbnormal/closure_abnormal (0.07s)
connection_test.go:67:
Error Trace: connection_test.go:67
Error: Error message not equal:
expected: "websocket: close 1006 (abnormal closure): unexpected EOF"
actual : "websocket: close 1001 (going away)"
Test: TestConnectionClosureAbnormal/closure_abnormal
--- FAIL: TestConnectionClosureAbnormal (1.04s)
--- FAIL: TestConnectionClosureAbnormal/closure_abnormal (1.01s)
connection_test.go:67:
Error Trace: connection_test.go:67
Error: Error message not equal:
expected: "websocket: close 1006 (abnormal closure): unexpected EOF"
actual : "websocket: close 1001 (going away)"
Test: TestConnectionClosureAbnormal/closure_abnormal
--- FAIL: TestConnectionClosureAbnormal (2.00s)
--- FAIL: TestConnectionClosureAbnormal/closure_abnormal (2.00s)
connection_test.go:67:
Error Trace: connection_test.go:67
Error: Error message not equal:
expected: "websocket: close 1006 (abnormal closure): unexpected EOF"
actual : "websocket: close 1001 (going away)"
Test: TestConnectionClosureAbnormal/closure_abnormal
panic: test timed out after 10m0s
It happens while sending to or receiving from CDP. It's probably an I/O timeout issue. So we might want to give it a timeout somewhere in the code (probably in the connection.go
):
goroutine 451 [select, 9 minutes]:
github.com/grafana/xk6-browser/common.(*Connection).sendLoop(0xc000ce41c0)
/Users/inanc/xk6-browser/common/connection.go:351 +0x1d9
created by github.com/grafana/xk6-browser/common.NewConnection
/Users/inanc/xk6-browser/common/connection.go:148 +0x4d5
goroutine 74 [IO wait, 10 minutes]:
internal/poll.runtime_pollWait(0x6e00bf0, 0x72)
/usr/local/Cellar/go/1.17.1/libexec/src/runtime/netpoll.go:229 +0x89
internal/poll.(*pollDesc).wait(0xc000138498, 0xc0001fe000, 0x0)
/usr/local/Cellar/go/1.17.1/libexec/src/internal/poll/fd_poll_runtime.go:84 +0xbd
internal/poll.(*pollDesc).waitRead(...)
/usr/local/Cellar/go/1.17.1/libexec/src/internal/poll/fd_poll_runtime.go:89
internal/poll.(*FD).Read(0xc000138480, {0xc0001fe000, 0x1000, 0x1000})
/usr/local/Cellar/go/1.17.1/libexec/src/internal/poll/fd_unix.go:167 +0x419
net.(*netFD).Read(0xc000138480, {0xc0001fe000, 0x1000, 0x1000})
/usr/local/Cellar/go/1.17.1/libexec/src/net/fd_posix.go:56 +0x51
net.(*conn).Read(0xc00011e000, {0xc0001fe000, 0x1000, 0x1000})
/usr/local/Cellar/go/1.17.1/libexec/src/net/net.go:183 +0xb1
bufio.(*Reader).fill(0xc0001981e0)
/usr/local/Cellar/go/1.17.1/libexec/src/bufio/bufio.go:101 +0x294
bufio.(*Reader).Peek(0xc0001981e0, 0x2)
/usr/local/Cellar/go/1.17.1/libexec/src/bufio/bufio.go:139 +0xcc
github.com/gorilla/websocket.(*Conn).read(0xc000268420, 0x63a2c20)
/Users/inanc/go/pkg/mod/github.com/gorilla/websocket@v1.4.2/conn.go:370 +0x5b
github.com/gorilla/websocket.(*Conn).advanceFrame(0xc000268420)
/Users/inanc/go/pkg/mod/github.com/gorilla/websocket@v1.4.2/conn.go:798 +0x125
github.com/gorilla/websocket.(*Conn).NextReader(0xc000268420)
/Users/inanc/go/pkg/mod/github.com/gorilla/websocket@v1.4.2/conn.go:980 +0x165
github.com/gorilla/websocket.(*Conn).ReadMessage(0xc000491ec8)
/Users/inanc/go/pkg/mod/github.com/gorilla/websocket@v1.4.2/conn.go:1064 +0x31
github.com/grafana/xk6-browser/testutils.CDPReadMsg(0xc00019c001)
/Users/inanc/xk6-browser/testutils/cdp.go:151 +0x45
github.com/grafana/xk6-browser/testutils.getWebsocketHandlerCDP.func1.1()
/Users/inanc/xk6-browser/testutils/cdp.go:79 +0xa5
created by github.com/grafana/xk6-browser/testutils.getWebsocketHandlerCDP.func1
/Users/inanc/xk6-browser/testutils/cdp.go:71 +0x2a5
I've removed the low prio
tag from this because it's such an issue that happens almost every time I commit, and I need to rerun the CI jobs over and over 🤷
I am also facing this issue in my local and it is happening almost 99% of the time even with one VU .
ERRO[0043] communicating with browser: websocket: close 1006 (abnormal closure): unexpected EOF category=cdp elapsed="0 ms" goroutine=5988 ERRO[0043] process with PID 28665 unexpectedly ended: signal: killed category=browser elapsed="1 ms" goroutine=5985
Below is my test script `/* init code - Load local files, import modules, declare lifecycle functions
/* init context: define custom function
Hey, @elimatripathy, could you try this again with the k6 version 0.43.1 and let us know if this error still happens?
Could you also let me know about your environment, like CPUs, memory size, OS, etc.?
thank you @inancgumus . Yes this issue is reproducible with the latest k6 version 0.43.1 as well and I tried many things to make sure that this is not happening due to my system limitations like running tests in both local and docker containers plus also running performance tests on the facebook.com site. Below are my findings
Script for facebook login:
import { check, sleep } from 'k6';
import { chromium } from 'k6/x/browser';
export const options = {
thresholds: {
checks: ["rate==1.0"]
},
ext: {
loadimpact: {
distribution: {
private: {
loadZone: 'amazon:ca:montreal',
percent: 50,
},
},
projectID:$enterTheProjectID,
},
},
scenarios: {
login: {
executor: 'per-vu-iterations',
vus: 35,
iterations: 1,
maxDuration: '1m'
}
},
}
export default async function() {
const browser = chromium.launch({
headless: __ENV.XK6_HEADLESS ? false : true,
});
const context = browser.newContext();
const page = context.newPage();
try {
// Goto front page, find login link and click it
await page.goto('https://www.facebook.com/', { waitUntil: 'networkidle' });
// page.waitForSelector('img[class="background15"]');
// Enter login credentials and login
page.locator('input[name="email"]').type('$');
page.locator('input[name="pass"]').type('$');
// We expect the form submission to trigger a navigation, so to prevent a
// race condition, setup a waiter concurrently while waiting for the click
// to resolve.
await Promise.all([
page.waitForNavigation(),
page.locator('button[type="submit"]').click(),
]);
sleep(1);
check(page, {
'header': find(page, '#facebook').isVisible()
});
} finally {
page.screenshot({ path: 'screenshot.png' });
page.close();
browser.close();
}
}
// Custom method to find web element
export function find(page,locator,waitTime) {
waitTime = waitTime || 10000;
let flag=false;
let iter=0;
let element;
while(flag==false){
try {
element = page.$(locator);
if(element.isVisible()){
flag=true;
}
} catch (error) {}
sleep(0.01);
iter++;
if(iter>waitTime){
break;
}
}
return element;
}
Running this script either in my local or docker containers with more than 35 virtual users caused the exception as below
This issue with xk6-browser is definitely stopping to test performance for any application fully . This should be prioritized or please let me know if anything I can do to fix it.
Below are my system configuration
Hi @elimatripathy, thanks for the detailed information. We'll consider this in our next planning.
This has happened again while running tests locally (v1.5.1):
--- FAIL: TestConnectionClosureAbnormal (0.00s)
--- FAIL: TestConnectionClosureAbnormal/closure_abnormal (0.01s)
connection_test.go:55:
Error Trace: /Users/inanc/grafana/k6browser/main/common/connection_test.go:55
Error: Error "read tcp 127.0.0.1:53920->127.0.0.1:53910: read: connection reset by peer" does not contain "websocket: close 1006 (abnormal closure): unexpected EOF"
Test: TestConnectionClosureAbnormal/closure_abnormal
We've seen intermittent failures of
TestConnectionClosureAbnormal/closure_abnormal
mostly in CI and ocasionally in local runs as well.The current test expects the error code
1006 (abnormal closure)
, but sometimes--typically under high load or in VMs running on overprovisioned hardware, as is usually the case for GH Actions--it will fail with code1001 (going away)
:We tried simply handling both cases in #33, but they're separate scenarios and the test should only fail with
1006
as that's what the/closure-abnormal
endpoint should cause.