eclipse / paho.golang

Go libraries
Other
327 stars 92 forks source link

An autopaho test TestQueuedMessage may be not passed. #242

Open tomatod opened 6 months ago

tomatod commented 6 months ago

Describe the bug I think this isn't a bug, but it could be reproduced in partical environments especially with low machine resources. So, as a attention, I reported this.

autopaho's main loop of connection manager is terminated in one second at TestQueuedMessage. The part of implemention is the following.

func TestQueuedMessages(t *testing.T) {
...
        // [tomatod's comment] longerDelay is one second
    ctx, cancel := context.WithTimeout(context.Background(), longerDelay)
    defer cancel()
    cm, err := NewConnection(ctx, config)
...
}

So, if your machine couldn't finish the test in one secode, the connection manager would be terminated and test would be failed in result.

To reproduce

(I often use this low spec machine outside...)

Debug output Hire is an example result of the above test command.

...
    test.go:53: 2024-02-17T16:54:18.467392298+09:00 test: innerCtx Done
    test.go:53: 2024-02-17T16:54:18.467846545+09:00 test: disconnecting &{<nil> 0}
    test.go:53: 2024-02-17T16:54:18.468537774+09:00 test: client stopping, incoming stopping
    test.go:53: 2024-02-17T16:54:18.468715317+09:00 test: returning from incoming worker
    test.go:53: 2024-02-17T16:54:18.46899835+09:00 test: returning from publish packets loop worker
    test.go:53: 2024-02-17T16:54:18.469265094+09:00 test: error called: EOF
    test.go:53: 2024-02-17T16:54:18.469104492+09:00 testServer: handleIncoming closed with error handleIncoming:ReadPacket: read tcp 127.0.0.1:46611->127.0.0.1:37874: use of closed network connection Remaining data: []
    test.go:53: 2024-02-17T16:54:18.469471831+09:00 test: returning from ping handler worker
    test.go:63: 2024-02-17T16:54:18.469598315+09:00 testServer:error closing ourConn: close tcp 127.0.0.1:46611->127.0.0.1:37874: use of closed network connection
    test.go:53: 2024-02-17T16:54:18.470503258+09:00 testServer: disconnected
    test.go:53: 2024-02-17T16:54:18.469983564+09:00 test: client stop requested
    test.go:53: 2024-02-17T16:54:18.471257796+09:00 test: conn closed
    test.go:53: 2024-02-17T16:54:18.471494756+09:00 test: acks tracker reset
    test.go:53: 2024-02-17T16:54:18.472294606+09:00 test: session updated, waiting on workers
    test.go:53: 2024-02-17T16:54:18.473805828+09:00 test: workers done
    test.go:63: 2024-02-17T16:54:18.474075876+09:00 test:handleError received extra error: EOF
    test.go:63: 2024-02-17T16:54:18.474838395+09:00 test:mainLoop: disconnect returned error: write tcp 127.0.0.1:37874->127.0.0.1:46611: write: broken pipe
    test.go:63: 2024-02-17T16:54:18.475043068+09:00 test:mainLoop: server connection handler exiting due to context: context deadline exceeded
    test.go:53: 2024-02-17T16:54:18.475369928+09:00 test: mainLoop: connection manager has terminated
    queue_test.go:208: timeout awaiting messages
--- FAIL: TestQueuedMessages (5.55s)

Expected behaviour Test is desired to pass without depending on environments.

MattBrittan commented 6 months ago

Thanks for raising this,

I'd appreciate it if someone else could run their eyes over these tests; they may be a bit fragile (have seen a very occasional failure) but were the best I could come up with at the time. I wanted to really exercise autopaho well as errors are most likely to surface when it's under stress (and the tests did reveal a couple of issues; since fixed).

I think the actual issue here is the context ctx, cancel := context.WithTimeout(context.Background(), longerDelay); this probably needs to be longer than a second on low powered kit. Would appreciate any thoughts re a better way of handling this (there will always be a potential for false alerts with time based tests like this when resources are very low; currently my only real thought is to increase the timeouts).

M.