TheThingsNetwork / lorawan-stack

The Things Stack, an Open Source LoRaWAN Network Server
https://www.thethingsindustries.com/stack/
Apache License 2.0
957 stars 302 forks source link

Flaky Network Server tests #3653

Closed neoaggelos closed 3 years ago

neoaggelos commented 3 years ago

Summary

Running the Network Server unit tests fails.

Steps to Reproduce

  1. git checkout v3.11
  2. docker run --rm -it -p 6379:6379 redis:5
  3. go test ./pkg/networkserver -failfast

What do you see now?

$ go test ./pkg/networkserver/redis/ -failfast
--- FAIL: TestDownlinkTaskQueue (0.01s)
    --- FAIL: TestDownlinkTaskQueue/1st_run (0.00s)
        downlink_task_queue.go:95: Pop returned on empty schedule with error: 
            -> error:pkg/redis:store (store error) (attributes: map[])
            --> ERR Error running script (call to f_32bd83583b1367d88c7534ddb33d6622a3c58cc7): @user_script:8: @user_script: 8: This Redis command is not allowed from scripts  (attributes: map[])
    downlink_task_queue_test.go:32: Skipping 2nd run
--- FAIL: TestApplicationUplinkQueue (0.01s)
    --- FAIL: TestApplicationUplinkQueue/1st_run (0.00s)
        application_uplink_queue.go:198: Received unexpected Pop error: error:pkg/redis:store (store error)
        doc.go:66: 
            /home/ubuntu/lorawan-stack/pkg/networkserver/internal/test/shared/application_uplink_queue.go:198
            Expected: true
            Actual:   false
    application_uplink_queue_test.go:32: Skipping 2nd run
redis: 2021/01/11 23:39:16 commands.go:36: specified duration is 64ms, but minimal supported value is 1s - truncating to 1s
redis: 2021/01/11 23:39:16 commands.go:36: specified duration is 64ms, but minimal supported value is 1s - truncating to 1s
redis: 2021/01/11 23:39:16 commands.go:36: specified duration is 64ms, but minimal supported value is 1s - truncating to 1s
redis: 2021/01/11 23:39:16 commands.go:36: specified duration is 995ms, but minimal supported value is 1s - truncating to 1s
redis: 2021/01/11 23:39:16 commands.go:36: specified duration is 64ms, but minimal supported value is 1s - truncating to 1s
redis: 2021/01/11 23:39:16 commands.go:36: specified duration is 64ms, but minimal supported value is 1s - truncating to 1s
redis: 2021/01/11 23:39:16 commands.go:36: specified duration is 64ms, but minimal supported value is 1s - truncating to 1s
redis: 2021/01/11 23:39:16 commands.go:36: specified duration is 64ms, but minimal supported value is 1s - truncating to 1s
FAIL
FAIL    go.thethings.network/lorawan-stack/v3/pkg/networkserver/redis   0.270s
FAIL

CI also seems to fail randomly on a different test (see failure on https://github.com/TheThingsNetwork/lorawan-stack/runs/1684533672?check_suite_focus=true, but https://github.com/TheThingsNetwork/lorawan-stack/actions/runs/478857664 passes)

What do you want to see instead?

Tests should pass

Environment

Can you do this yourself and submit a Pull Request?

No

neoaggelos commented 3 years ago

Another CI failure here: https://github.com/TheThingsNetwork/lorawan-stack/pull/3586/checks?check_run_id=1684911932#step:14:143

``` ERROR Failed to generate class A downlink, skip class A downlink slot band_id=EU_863_870 dev_addr=42FFFFFF device_class=CLASS_A device_uid=process-downlink-test-app-id.process-downlink-test-dev-id downlink_type=data error=error:pkg/networkserver:no_downlink (no downlink to send) frequency_plan_id=EU_863_870 namespace=networkserver started_at=2021-01-12 00:18:00.3000015 +0000 UTC test_name=TestProcessDownlinkTask/Class_A/windows_open/1.0.3/RX1,RX2_available/no_MAC/generic_application_downlink/FCnt_too_low ERROR Failed to generate class A downlink, skip class A downlink slot band_id=EU_863_870 dev_addr=42FFFFFF device_class=CLASS_A device_uid=process-downlink-test-app-id.process-downlink-test-dev-id downlink_type=data error=error:pkg/networkserver:no_downlink (no downlink to send) frequency_plan_id=EU_863_870 namespace=networkserver started_at=2021-01-12 00:18:00.3000015 +0000 UTC test_name=TestProcessDownlinkTask/Class_A/windows_open/1.0.3/RX1,RX2_available/no_MAC/generic_application_downlink/application_downlink_exceeds_length_limit --- FAIL: TestProcessDownlinkTask (0.49s) --- FAIL: TestProcessDownlinkTask/join-accept/windows_open/RX1,RX2_available/active_session/EU868 (0.02s) networkserver_util_internal_test.go:1978: events.Publish call missed: &{ctx:0xc0009a5590 innerEvent:{Name:ns.up.join.accept.forward Time:{wall:692578894 ext:63746007475 loc:} Identifiers:[0xc000499980] Data: CorrelationIDs:[] Origin:fv-az184-39 Context:map[] Visibility:0xc0004998c0 Authentication: RemoteIP: UserAgent: UniqueID:01EVSZ9KFCGJEHMPXWJXBYM4TP XXX_NoUnkeyedLiteral:{} XXX_sizecache:0} data:0xc001289280 caller:} FAIL coverage: 37.9% of statements FAIL go.thethings.network/lorawan-stack/v3/pkg/networkserver 0.587s ```
htdvisser commented 3 years ago

This is a v3.11 release-blocking bug, as it breaks compatibility with the minimum version of Redis that we require (Redis 5.x).

From @adriansmares on Slack:

Redis 5.0 xreadgroup is marked as no-script https://github.com/redis/redis/blob/5.0/src/server.c#L315 "ws" are the flags, and s: command not allowed in scripts.

rvolosatovs commented 3 years ago

This is a v3.11 release-blocking bug, as it breaks compatibility with the minimum version of Redis that we require (Redis 5.x).

From @adriansmares on Slack:

Redis 5.0 xreadgroup is marked as no-script https://github.com/redis/redis/blob/5.0/src/server.c#L315 "ws" are the flags, and s: command not allowed in scripts.

We require Redis 6 from 3.11, so it's not a bug and this issue deals with tests anyway, not Redis versions

johanstokking commented 3 years ago

More:

 ERROR Failed to generate class A downlink, skip class A downlink slot band_id=EU_863_870 dev_addr=42FFFFFF device_class=CLASS_A device_uid=process-downlink-test-app-id.process-downlink-test-dev-id@foo-tenant downlink_type=data error=error:pkg/networkserver:no_downlink (no downlink to send) frequency_plan_id=EU_863_870 namespace=networkserver started_at=2021-01-25 09:13:01.6600015 +0000 UTC tenant_id=foo-tenant test_name=TestProcessDownlinkTask/Class_A/windows_open/1.0.3/RX1,RX2_available/no_MAC/generic_application_downlink/FCnt_too_low
 ERROR Failed to generate class A downlink, skip class A downlink slot band_id=EU_863_870 dev_addr=42FFFFFF device_class=CLASS_A device_uid=process-downlink-test-app-id.process-downlink-test-dev-id@foo-tenant downlink_type=data error=error:pkg/networkserver:no_downlink (no downlink to send) frequency_plan_id=EU_863_870 namespace=networkserver started_at=2021-01-25 09:13:01.6600015 +0000 UTC tenant_id=foo-tenant test_name=TestProcessDownlinkTask/Class_A/windows_open/1.0.3/RX1,RX2_available/no_MAC/generic_application_downlink/application_downlink_exceeds_length_limit
--- FAIL: TestProcessDownlinkTask (0.36s)
    --- FAIL: TestProcessDownlinkTask/join-accept/windows_open/RX1,RX2_available/active_session/EU868 (0.02s)
        networkserver_util_internal_test.go:1984: events.Publish call missed: &{ctx:0xc000a62e40 innerEvent:{Name:ns.up.join.accept.forward Time:{wall:139388693 ext:63747162782 loc:<nil>} Identifiers:[0xc000bdbfa0] Data:<nil> CorrelationIDs:[] Origin:fv-az76-674 Context:map[] Visibility:0xc000bdbf80 Authentication:<nil> RemoteIP: UserAgent: UniqueID:01EWWD2RDVMJ7G7CKP901QE9G5 XXX_NoUnkeyedLiteral:{} XXX_sizecache:0} data:0xc000338f00 caller:}
FAIL
coverage: 37.9% of statements
FAIL    go.thethings.network/lorawan-stack/v3/pkg/networkserver 0.455s
mattrude commented 3 years ago

I am receiving this error still after upgrading to version 3.11.0-rc2

This is running as a native application, it is not running within docker.

redis: 2021/02/09 21:53:11 commands.go:36: specified duration is 766ms, but minimal supported value is 1s - truncating to 1s
redis: 2021/02/09 21:55:43 commands.go:36: specified duration is 732ms, but minimal supported value is 1s - truncating to 1s
redis: 2021/02/09 22:01:52 commands.go:36: specified duration is 762ms, but minimal supported value is 1s - truncating to 1s
redis: 2021/02/09 22:07:19 commands.go:36: specified duration is 769ms, but minimal supported value is 1s - truncating to 1s
redis: 2021/02/09 22:10:20 commands.go:36: specified duration is 771ms, but minimal supported value is 1s - truncating to 1s
redis: 2021/02/09 22:10:44 commands.go:36: specified duration is 796ms, but minimal supported value is 1s - truncating to 1s
redis: 2021/02/09 22:16:05 commands.go:36: specified duration is 752ms, but minimal supported value is 1s - truncating to 1s
redis: 2021/02/09 22:19:42 commands.go:36: specified duration is 730ms, but minimal supported value is 1s - truncating to 1s
redis: 2021/02/09 22:19:50 commands.go:36: specified duration is 709ms, but minimal supported value is 1s - truncating to 1s
redis: 2021/02/09 22:23:11 commands.go:36: specified duration is 765ms, but minimal supported value is 1s - truncating to 1s
redis: 2021/02/09 22:25:43 commands.go:36: specified duration is 744ms, but minimal supported value is 1s - truncating to 1s
redis: 2021/02/09 22:31:53 commands.go:36: specified duration is 766ms, but minimal supported value is 1s - truncating to 1s
redis: 2021/02/09 22:37:19 commands.go:36: specified duration is 795ms, but minimal supported value is 1s - truncating to 1s

TTN Stack Version:

# ttn-lw-stack version
The Things Stack for LoRaWAN: ttn-lw-stack
Version:             3.11.0-rc2
Build date:          2021-02-09T09:54:26Z
Git commit:          65d285482
Go version:          go1.15.7
OS/Arch:             linux/arm

Redis Version:

# redis-server -v
Redis server v=6.0.6 sha=00000000:0 malloc=libc bits=32 build=ca474e00afe358bb

Ubuntu Version:

t# lsb_release -a
No LSB modules are available.
Distributor ID: Ubuntu
Description:    Ubuntu 20.10
Release:        20.10
Codename:       groovy
mattrude commented 3 years ago

Shoot, I thought #3653 was in the 3.11.0-rc2 release, but it looks like it was not submitted at the time of the release, so this may be a duplicate error report.

rvolosatovs commented 3 years ago

I am receiving this error still after upgrading to version 3.11.0-rc2

This is running as a native application, it is not running within docker.

redis: 2021/02/09 21:53:11 commands.go:36: specified duration is 766ms, but minimal supported value is 1s - truncating to 1s
redis: 2021/02/09 21:55:43 commands.go:36: specified duration is 732ms, but minimal supported value is 1s - truncating to 1s
redis: 2021/02/09 22:01:52 commands.go:36: specified duration is 762ms, but minimal supported value is 1s - truncating to 1s
redis: 2021/02/09 22:07:19 commands.go:36: specified duration is 769ms, but minimal supported value is 1s - truncating to 1s
redis: 2021/02/09 22:10:20 commands.go:36: specified duration is 771ms, but minimal supported value is 1s - truncating to 1s
redis: 2021/02/09 22:10:44 commands.go:36: specified duration is 796ms, but minimal supported value is 1s - truncating to 1s
redis: 2021/02/09 22:16:05 commands.go:36: specified duration is 752ms, but minimal supported value is 1s - truncating to 1s
redis: 2021/02/09 22:19:42 commands.go:36: specified duration is 730ms, but minimal supported value is 1s - truncating to 1s
redis: 2021/02/09 22:19:50 commands.go:36: specified duration is 709ms, but minimal supported value is 1s - truncating to 1s
redis: 2021/02/09 22:23:11 commands.go:36: specified duration is 765ms, but minimal supported value is 1s - truncating to 1s
redis: 2021/02/09 22:25:43 commands.go:36: specified duration is 744ms, but minimal supported value is 1s - truncating to 1s
redis: 2021/02/09 22:31:53 commands.go:36: specified duration is 766ms, but minimal supported value is 1s - truncating to 1s
redis: 2021/02/09 22:37:19 commands.go:36: specified duration is 795ms, but minimal supported value is 1s - truncating to 1s

TTN Stack Version:

# ttn-lw-stack version
The Things Stack for LoRaWAN: ttn-lw-stack
Version:             3.11.0-rc2
Build date:          2021-02-09T09:54:26Z
Git commit:          65d285482
Go version:          go1.15.7
OS/Arch:             linux/arm

Redis Version:

# redis-server -v
Redis server v=6.0.6 sha=00000000:0 malloc=libc bits=32 build=ca474e00afe358bb

Ubuntu Version:

t# lsb_release -a
No LSB modules are available.
Distributor ID: Ubuntu
Description:    Ubuntu 20.10
Release:        20.10
Codename:       groovy

This is coming from https://github.com/go-redis/redis/pull/1365 and is expected - it does not represent anything broken. We should probably find a way to hide this, but it does not seem these logging statements can be disabled in the library

johanstokking commented 3 years ago

New one:

--- FAIL: TestApplicationUplinkQueue (0.01s)
    --- FAIL: TestApplicationUplinkQueue/1st_run (0.00s)
        doc.go:66: 
            /home/runner/work/lorawan-stack/lorawan-stack/pkg/networkserver/internal/test/shared/application_uplink_queue.go:103
            Expected: true
            Actual:   false
        doc.go:66: 
            /home/runner/work/lorawan-stack/lorawan-stack/pkg/networkserver/internal/test/shared/application_uplink_queue.go:104
            Expected: true
            Actual:   false
        doc.go:66: 
            /home/runner/work/lorawan-stack/lorawan-stack/pkg/networkserver/internal/test/shared/application_uplink_queue.go:105
            Expected: nil
            Actual:   'error:pkg/networkserver/redis:limit_too_low (limit too low)'
        doc.go:66: 
            /home/runner/work/lorawan-stack/lorawan-stack/pkg/networkserver/internal/test/shared/application_uplink_queue.go:198
            Expected: true
            Actual:   false
    application_uplink_queue_test.go:32: Skipping 2nd run
redis: 2021/02/24 09:59:35 commands.go:36: specified duration is 512ms, but minimal supported value is 1s - truncating to 1s
redis: 2021/02/24 09:59:35 commands.go:36: specified duration is 512ms, but minimal supported value is 1s - truncating to 1s
redis: 2021/02/24 09:59:35 commands.go:36: specified duration is 512ms, but minimal supported value is 1s - truncating to 1s
redis: 2021/02/24 09:59:35 commands.go:36: specified duration is 512ms, but minimal supported value is 1s - truncating to 1s
redis: 2021/02/24 09:59:35 commands.go:36: specified duration is 512ms, but minimal supported value is 1s - truncating to 1s
redis: 2021/02/24 09:59:35 commands.go:36: specified duration is 512ms, but minimal supported value is 1s - truncating to 1s
redis: 2021/02/24 09:59:35 commands.go:36: specified duration is 512ms, but minimal supported value is 1s - truncating to 1s
FAIL
coverage: 59.7% of statements
FAIL    go.thethings.network/lorawan-stack/v3/pkg/networkserver/redis   1.153s
rvolosatovs commented 3 years ago

New one:

--- FAIL: TestApplicationUplinkQueue (0.01s)
    --- FAIL: TestApplicationUplinkQueue/1st_run (0.00s)
        doc.go:66: 
            /home/runner/work/lorawan-stack/lorawan-stack/pkg/networkserver/internal/test/shared/application_uplink_queue.go:103
            Expected: true
            Actual:   false
        doc.go:66: 
            /home/runner/work/lorawan-stack/lorawan-stack/pkg/networkserver/internal/test/shared/application_uplink_queue.go:104
            Expected: true
            Actual:   false
        doc.go:66: 
            /home/runner/work/lorawan-stack/lorawan-stack/pkg/networkserver/internal/test/shared/application_uplink_queue.go:105
            Expected: nil
            Actual:   'error:pkg/networkserver/redis:limit_too_low (limit too low)'
        doc.go:66: 
            /home/runner/work/lorawan-stack/lorawan-stack/pkg/networkserver/internal/test/shared/application_uplink_queue.go:198
            Expected: true
            Actual:   false
    application_uplink_queue_test.go:32: Skipping 2nd run
redis: 2021/02/24 09:59:35 commands.go:36: specified duration is 512ms, but minimal supported value is 1s - truncating to 1s
redis: 2021/02/24 09:59:35 commands.go:36: specified duration is 512ms, but minimal supported value is 1s - truncating to 1s
redis: 2021/02/24 09:59:35 commands.go:36: specified duration is 512ms, but minimal supported value is 1s - truncating to 1s
redis: 2021/02/24 09:59:35 commands.go:36: specified duration is 512ms, but minimal supported value is 1s - truncating to 1s
redis: 2021/02/24 09:59:35 commands.go:36: specified duration is 512ms, but minimal supported value is 1s - truncating to 1s
redis: 2021/02/24 09:59:35 commands.go:36: specified duration is 512ms, but minimal supported value is 1s - truncating to 1s
redis: 2021/02/24 09:59:35 commands.go:36: specified duration is 512ms, but minimal supported value is 1s - truncating to 1s
FAIL
coverage: 59.7% of statements
FAIL  go.thethings.network/lorawan-stack/v3/pkg/networkserver/redis   1.153s

yes, TTES is broken, but not TTS. This is going to be closed by https://github.com/TheThingsNetwork/lorawan-stack/pull/3852