nats-io / nats-server

High-Performance server for NATS.io, the cloud and edge native messaging system.
https://nats.io
Apache License 2.0
15.5k stars 1.38k forks source link

Recovering consumer takes too long [v2.10.7 - v2.10.18] #5808

Open Ivlyth opened 3 weeks ago

Ivlyth commented 3 weeks ago

Observed behavior

[14452] 2024/08/20 14:42:14.995238 [INF] Starting nats-server
[14452] 2024/08/20 14:42:15.008306 [INF]   Version:  2.10.7
[14452] 2024/08/20 14:42:15.008318 [INF]   Git:      [fa8464d]
[14452] 2024/08/20 14:42:15.008326 [INF]   Name:     nats-01
[14452] 2024/08/20 14:42:15.008334 [INF]   Node:     pZeTdFV7
[14452] 2024/08/20 14:42:15.008338 [INF]   ID:       NBE3TNB4N3NCUFQHLSIRWOTN6I2I7INYJWVXQWMJ34QVT25S3UPCYYJG
[14452] 2024/08/20 14:42:15.008357 [WRN] Plaintext passwords detected, use nkeys or bcrypt
[14452] 2024/08/20 14:42:15.008392 [INF] Using configuration file: /opt/nats/conf/nats-server.conf
[14452] 2024/08/20 14:42:15.009861 [INF] Starting http monitor on 0.0.0.0:8222
[14452] 2024/08/20 14:42:15.009926 [INF] Starting JetStream
[14452] 2024/08/20 14:42:15.018372 [INF]     _ ___ _____ ___ _____ ___ ___   _   __  __
[14452] 2024/08/20 14:42:15.018391 [INF]  _ | | __|_   _/ __|_   _| _ \ __| /_\ |  \/  |
[14452] 2024/08/20 14:42:15.018397 [INF] | || | _|  | | \__ \ | | |   / _| / _ \| |\/| |
[14452] 2024/08/20 14:42:15.018401 [INF]  \__/|___| |_| |___/ |_| |_|_\___/_/ \_\_|  |_|
[14452] 2024/08/20 14:42:15.018406 [INF]
[14452] 2024/08/20 14:42:15.018411 [INF]          https://docs.nats.io/jetstream
[14452] 2024/08/20 14:42:15.018415 [INF]
[14452] 2024/08/20 14:42:15.018421 [INF] ---------------- JETSTREAM ----------------
[14452] 2024/08/20 14:42:15.018431 [INF]   Max Memory:      141.48 GB
[14452] 2024/08/20 14:42:15.018439 [INF]   Max Storage:     50.00 TB
[14452] 2024/08/20 14:42:15.018444 [INF]   Store Directory: "/data/nats-storage/jetstream"
[14452] 2024/08/20 14:42:15.018449 [INF] -------------------------------------------
[14452] 2024/08/20 14:42:15.057389 [INF]   Starting restore for stream '$G > event-action'
[14452] 2024/08/20 14:42:15.082226 [INF]   Restored 6 messages for stream '$G > event-action' in 25ms
[14452] 2024/08/20 14:42:15.090720 [INF]   Starting restore for stream '$G > event-base'
[14452] 2024/08/20 14:42:15.349544 [INF]   Restored 80,060,781 messages for stream '$G > event-base' in 259ms
[14452] 2024/08/20 14:42:15.350631 [INF]   Starting restore for stream '$G > event-transformed'
[14452] 2024/08/20 14:42:15.618664 [INF]   Restored 27,711,311 messages for stream '$G > event-transformed' in 268ms
[14452] 2024/08/20 14:42:15.618804 [INF]   Recovering 1 consumers for stream - '$G > event-action'
[14452] 2024/08/20 14:42:15.655177 [INF]   Recovering 1 consumers for stream - '$G > event-base'
[14452] 2024/08/20 14:48:06.249467 [INF]   Recovering 2 consumers for stream - '$G > event-transformed'
[14452] 2024/08/20 14:48:06.587266 [INF] Listening for client connections on 0.0.0.0:4222
[14452] 2024/08/20 14:48:06.587731 [INF] Server is ready
[14994] 2024/08/20 14:48:40.895462 [INF] Starting nats-server
[14994] 2024/08/20 14:48:40.895615 [INF]   Version:  2.10.18
[14994] 2024/08/20 14:48:40.895620 [INF]   Git:      [57d23ac]
[14994] 2024/08/20 14:48:40.895624 [INF]   Name:     nats-01
[14994] 2024/08/20 14:48:40.895628 [INF]   Node:     pZeTdFV7
[14994] 2024/08/20 14:48:40.895631 [INF]   ID:       NBCC3KDHZPOC6ZP4HMRFUF6BJ6C7RW735OFQOPMTBBT6JMI4ZFERV5LI
[14994] 2024/08/20 14:48:40.895641 [WRN] Plaintext passwords detected, use nkeys or bcrypt
[14994] 2024/08/20 14:48:40.895656 [INF] Using configuration file: /opt/nats/conf/nats-server.conf
[14994] 2024/08/20 14:48:40.896231 [INF] Starting http monitor on 0.0.0.0:8222
[14994] 2024/08/20 14:48:40.896280 [INF] Starting JetStream
[14994] 2024/08/20 14:48:40.896467 [INF]     _ ___ _____ ___ _____ ___ ___   _   __  __
[14994] 2024/08/20 14:48:40.896475 [INF]  _ | | __|_   _/ __|_   _| _ \ __| /_\ |  \/  |
[14994] 2024/08/20 14:48:40.896478 [INF] | || | _|  | | \__ \ | | |   / _| / _ \| |\/| |
[14994] 2024/08/20 14:48:40.896482 [INF]  \__/|___| |_| |___/ |_| |_|_\___/_/ \_\_|  |_|
[14994] 2024/08/20 14:48:40.896486 [INF]
[14994] 2024/08/20 14:48:40.896489 [INF]          https://docs.nats.io/jetstream
[14994] 2024/08/20 14:48:40.896506 [INF]
[14994] 2024/08/20 14:48:40.896509 [INF] ---------------- JETSTREAM ----------------
[14994] 2024/08/20 14:48:40.896514 [INF]   Max Memory:      141.48 GB
[14994] 2024/08/20 14:48:40.896517 [INF]   Max Storage:     50.00 TB
[14994] 2024/08/20 14:48:40.896519 [INF]   Store Directory: "/data/nats-storage/jetstream"
[14994] 2024/08/20 14:48:40.896522 [INF] -------------------------------------------
[14994] 2024/08/20 14:48:40.911069 [INF]   Starting restore for stream '$G > event-action'
[14994] 2024/08/20 14:48:40.947712 [INF]   Restored 6 messages for stream '$G > event-action' in 37ms
[14994] 2024/08/20 14:48:40.952908 [INF]   Starting restore for stream '$G > event-base'
[14994] 2024/08/20 14:48:41.179296 [INF]   Restored 80,060,781 messages for stream '$G > event-base' in 226ms
[14994] 2024/08/20 14:48:41.180456 [INF]   Starting restore for stream '$G > event-transformed'
[14994] 2024/08/20 14:48:41.457482 [INF]   Restored 27,711,311 messages for stream '$G > event-transformed' in 277ms
[14994] 2024/08/20 14:48:41.457581 [INF]   Recovering 1 consumers for stream - '$G > event-action'
[14994] 2024/08/20 14:48:41.490819 [INF]   Recovering 1 consumers for stream - '$G > event-base'
[14994] 2024/08/20 14:55:38.762911 [INF]   Recovering 2 consumers for stream - '$G > event-transformed'
[14994] 2024/08/20 14:55:39.022916 [INF] Listening for client connections on 0.0.0.0:4222
[14994] 2024/08/20 14:55:39.024199 [INF] Server is ready

in the log for version 2.10.7, we can see recovering consumer for stream event-transformed takes too long, about 5m30s and in the log for version 2.10.18, we can see the time is even long, it's about 7min

Expected behavior

when server restart, recovering the consumer not take so long

Server and client version

server: 2.10.7 and 2.10.18

Host environment

No response

Steps to reproduce

No response

Ivlyth commented 3 weeks ago
[root@xsiem bin]# nats consumer info
? Select a Stream event-base
? Select a Consumer actuator-transform
Information for Consumer event-base > actuator-transform created 2024-07-17T18:51:22+08:00

Configuration:

                    Name: actuator-transform
               Pull Mode: true
          Filter Subject: event-base.*
          Deliver Policy: New
              Ack Policy: Explicit
                Ack Wait: 30.00s
           Replay Policy: Instant
         Max Ack Pending: 20,000
       Max Waiting Pulls: 512

State:

  Last Delivered Message: Consumer sequence: 1,890,922,977 Stream sequence: 2,358,770,804
    Acknowledgment Floor: Consumer sequence: 1,890,907,977 Stream sequence: 2,358,755,804
        Outstanding Acks: 15,000 out of maximum 20,000
    Redelivered Messages: 0
    Unprocessed Messages: 42,350,781
           Waiting Pulls: 0 of maximum 512
Ivlyth commented 3 weeks ago
[root@xsiem bin]# nats consumer report
? Select a Stream event-base
╭────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────╮
│                                       Consumer report for event-base with 1 consumers                                      │
├────────────────────┬──────┬────────────┬──────────┬─────────────┬─────────────┬──────────────────┬───────────────┬─────────┤
│ Consumer           │ Mode │ Ack Policy │ Ack Wait │ Ack Pending │ Redelivered │ Unprocessed      │ Ack Floor     │ Cluster │
├────────────────────┼──────┼────────────┼──────────┼─────────────┼─────────────┼──────────────────┼───────────────┼─────────┤
│ actuator-transform │ Pull │ Explicit   │ 30.00s   │ 15,000      │ 0           │ 42,350,781 / 52% │ 2,358,755,804 │         │
╰────────────────────┴──────┴────────────┴──────────┴─────────────┴─────────────┴──────────────────┴───────────────┴─────────╯
wallyqs commented 3 weeks ago

how about the nats stream info from that consumer?

Ivlyth commented 3 weeks ago

I delete the data directory after i post this issue.

below is the current stream info

[root@xsiem ~]# nats stream info
? Select a Stream event-base
Information for Stream event-base created 2024-08-20 15:13:36

           Description: stream for event base (input)
              Subjects: event-base.>
              Replicas: 1
               Storage: File

Options:

             Retention: Limits
       Acknowledgments: true
        Discard Policy: Old
      Duplicate Window: 2m0s
     Allows Msg Delete: true
          Allows Purge: true
        Allows Rollups: false

Limits:

      Maximum Messages: unlimited
   Maximum Per Subject: unlimited
         Maximum Bytes: 500 GiB
           Maximum Age: unlimited
  Maximum Message Size: unlimited
     Maximum Consumers: unlimited

State:

              Messages: 65,145,993
                 Bytes: 370 GiB
        First Sequence: 1 @ 2024-08-20 15:14:03 UTC
         Last Sequence: 65,145,993 @ 2024-08-20 17:02:38 UTC
      Active Consumers: 1
    Number of Subjects: 7