nats-io / nats-server

High-Performance server for NATS.io, the cloud and edge native messaging system.
https://nats.io
Apache License 2.0
15.56k stars 1.39k forks source link

Slow message expiration on startup with lots of messages and active filtered consumer with wildcard #2380

Closed matthiashanel closed 6 months ago

matthiashanel commented 3 years ago

consumer in the stream that is recovered contains a filter_subject for a wildcard subjects ingress.raw.*

{
    "Created": "2021-07-22T19:38:12.377148796Z",
    "Name": "",
    "ack_policy": "all",
    "ack_wait": 30000000000,
    "deliver_policy": "all",
    "deliver_subject": "hh.ingress.raw",
    "durable_name": "ingresshh",
    "filter_subject": "ingress.raw.*",
    "max_ack_pending": 65536,
    "max_deliver": -1,
    "replay_policy": "instant"
}

The loop to expire messages in filestore.go expireMsgs() seems to take very long! In part this has to do with updating of consumer meta data, that is made even slower by the wildcard filter_subject

Side note: the signal handler for ctrl-c seems to be set up after executing this code. Maybe set up before, so this can be stopped without killing.

This screenshot shows the loop with tracing modifications, so the output below makes sense Screen Shot 2021-07-27 at 10 21 09 PM

This output is showing roughly 20K messages expired every two minutes. This is two traces on line 2196 of 10K each (line 2195) between unrelated traces that are 2 minutes apart. For context this is seqno ~3M with a last seqno of ~9M.

nats-server -c nats-config.yml -DV
[20596] 2021/07/27 22:13:02.337582 [INF] Starting nats-server
[20596] 2021/07/27 22:13:02.337710 [INF]   Version:  2.3.3-beta
[20596] 2021/07/27 22:13:02.337713 [INF]   Git:      [not set]
[20596] 2021/07/27 22:13:02.337715 [DBG]   Go build: go1.16
[20596] 2021/07/27 22:13:02.337717 [INF]   Name:     NAVWEONBDSA7EZNDL2RPTK5HFN3X77OUSL6CXUR5NHALM5F2KXSJHBWY
[20596] 2021/07/27 22:13:02.337723 [INF]   Node:     oLPJ9fMb
[20596] 2021/07/27 22:13:02.337726 [INF]   ID:       NAVWEONBDSA7EZNDL2RPTK5HFN3X77OUSL6CXUR5NHALM5F2KXSJHBWY
[20596] 2021/07/27 22:13:02.337730 [INF] Using configuration file: nats-config.yml
[20596] 2021/07/27 22:13:02.337732 [INF] Trusted Operators
[20596] 2021/07/27 22:13:02.337736 [INF]   System  : ""
[20596] 2021/07/27 22:13:02.337739 [INF]   Operator: "op1"
[20596] 2021/07/27 22:13:02.337762 [INF]   Issued  : 2021-07-20 06:49:20 -0400 EDT
[20596] 2021/07/27 22:13:02.337767 [INF]   Expires : 1969-12-31 19:00:00 -0500 EST
[20596] 2021/07/27 22:13:02.338545 [INF] Managing all jwt in exclusive directory /Users/matthiashanel/test/beenera-nats-jetstream/nats-account-jwt
[20596] 2021/07/27 22:13:02.338592 [INF] Starting JetStream
[20596] 2021/07/27 22:13:02.338876 [INF]     _ ___ _____ ___ _____ ___ ___   _   __  __
[20596] 2021/07/27 22:13:02.338884 [INF]  _ | | __|_   _/ __|_   _| _ \ __| /_\ |  \/  |
[20596] 2021/07/27 22:13:02.338887 [INF] | || | _|  | | \__ \ | | |   / _| / _ \| |\/| |
[20596] 2021/07/27 22:13:02.338889 [INF]  \__/|___| |_| |___/ |_| |_|_\___/_/ \_\_|  |_|
[20596] 2021/07/27 22:13:02.338891 [INF]
[20596] 2021/07/27 22:13:02.338893 [INF]          https://docs.nats.io/jetstream
[20596] 2021/07/27 22:13:02.338895 [INF]
[20596] 2021/07/27 22:13:02.338897 [INF] ---------------- JETSTREAM ----------------
[20596] 2021/07/27 22:13:02.338912 [INF]   Max Memory:      953.67 MB
[20596] 2021/07/27 22:13:02.338915 [INF]   Max Storage:     465.66 GB
[20596] 2021/07/27 22:13:02.338917 [INF]   Store Directory: "nats-jetstream-data/jetstream"
[20596] 2021/07/27 22:13:02.338920 [INF] -------------------------------------------
[20596] 2021/07/27 22:13:02.338987 [DBG]   Exports:
[20596] 2021/07/27 22:13:02.338991 [DBG]      $JS.API.>
[20596] 2021/07/27 22:13:02.339121 [DBG] Account [ABPSESBV2IITESFINX4SHX7NONYQSPNE7W5GGEZDJQ2NSZEVWZUK5VIM] fetch took 45.525µs
[20596] 2021/07/27 22:13:02.339384 [DBG] Updating account claims: ABPSESBV2IITESFINX4SHX7NONYQSPNE7W5GGEZDJQ2NSZEVWZUK5VIM/dummy
[20596] 2021/07/27 22:13:02.339415 [DBG] Enabled JetStream for account "ABPSESBV2IITESFINX4SHX7NONYQSPNE7W5GGEZDJQ2NSZEVWZUK5VIM"
[20596] 2021/07/27 22:13:02.339421 [DBG]   Max Memory:      -1 B
[20596] 2021/07/27 22:13:02.339424 [DBG]   Max Storage:     -1 B
[20596] 2021/07/27 22:13:02.339456 [DBG] Recovering JetStream state for account "ABPSESBV2IITESFINX4SHX7NONYQSPNE7W5GGEZDJQ2NSZEVWZUK5VIM"
[20596] 2021/07/27 22:13:02.359400 [INF]   Restored 1,122,182 messages for stream "from-unitizer"
[20596] 2021/07/27 22:13:02.359603 [INF]   Recovering 1 consumers for stream - "from-unitizer"
[20596] 2021/07/27 22:13:02.359943 [TRC] JETSTREAM - <<- [SUB $JSC.CI.ABPSESBV2IITESFINX4SHX7NONYQSPNE7W5GGEZDJQ2NSZEVWZUK5VIM.from-unitizer.sensor_t0047_DEMO_27497681_1DMO9900000001  46]
> seq 3840000 last seq 9941339, tsns 1626942716439691577 minage 1627092789045672000
> seq 3850000 last seq 9941339, tsns 1626942718549461572 minage 1627092789045672000
[20596] 2021/07/27 22:15:02.340274 [DBG] Checking store state: f8134942c29425f416277be57cf75ff1ec931beebe5141356aba39b448197456
> seq 3860000 last seq 9941339, tsns 1626942719776395012 minage 1627092789045672000
> seq 3870000 last seq 9941339, tsns 1626942722014510473 minage 1627092789045672000
[20596] 2021/07/27 22:17:02.341761 [DBG] Checking store state: f8134942c29425f416277be57cf75ff1ec931beebe5141356aba39b448197456
> seq 3880000 last seq 9941339, tsns 1626942724446950520 minage 1627092789045672000
> seq 3890000 last seq 9941339, tsns 1626942726660540439 minage 1627092789045672000
[20596] 2021/07/27 22:19:02.343006 [DBG] Checking store state: f8134942c29425f416277be57cf75ff1ec931beebe5141356aba39b448197456
> seq 3900000 last seq 9941339, tsns 1626942728481871192 minage 1627092789045672000
bruth commented 6 months ago

Without a test, hard to discern if this is fixed as of 2.10, but given how many changes and optimizations have occurred, closing for now.

derekcollison commented 6 months ago

Yes I optimized msg expiration on startup a long time ago.