nats-io / nats-server

High-Performance server for NATS.io, the cloud and edge native messaging system.
https://nats.io
Apache License 2.0

KeyValue: Can't GET or PUT from leaf node, however WATCH works fine. #4819

Open darkwatchuk opened 7 months ago

darkwatchuk commented 7 months ago

Observed behavior

It could be that I'm missing something, but I'm getting errors on the GET and PUT commands when they are executed from a leaf node.

nats --context uat kv get kv2 a

    nats: error: context deadline exceeded

nats --context uat kv put kv2 a 1

    nats: error: nats: no response from stream

The watch command works fine.

nats --context uat kv watch kv2

[2023-11-27 08:30:32] PUT kv2 > c: 3
[2023-11-27 08:30:39] PUT kv2 > b: 2
[2023-11-27 08:30:56] PUT kv2 > a: 1

nats --context uat stream info KV_kv2

Information for Stream KV_kv2 created 2023-11-27 08:29:27

             Subjects: $KV.kv2.>
             Replicas: 3
              Storage: File

Options:

            Retention: Limits
      Acknowledgments: true
       Discard Policy: New
     Duplicate Window: 2m0s
           Direct Get: true
    Allows Msg Delete: false
         Allows Purge: true
       Allows Rollups: true

Limits:

     Maximum Messages: unlimited
  Maximum Per Subject: 1
        Maximum Bytes: unlimited
          Maximum Age: unlimited
 Maximum Message Size: unlimited
    Maximum Consumers: unlimited

Cluster Information:

              Name: sco-nats
            Leader: natsC
           Replica: natsA, current, seen 493ms ago
           Replica: natsB, current, seen 493ms ago

State:

          Messages: 3
             Bytes: 120 B
    First Sequence: 1 @ 2023-11-27 08:30:32 UTC
     Last Sequence: 3 @ 2023-11-27 08:30:56 UTC
  Active Consumers: 0
Number of Subjects: 3

Expected behavior

No errors?

Server and client version

Server Version: 2.10.5
Client Version: 0.1.1

Host environment

Hub Cluster: 3 Servers


leafnodes {
    listen: 0.0.0.0:7422
    no_advertise: true
}

jetstream {
  store_dir: /var/nats
  domain: hub
  max_memory_store: 64GB
  max_file_store: 300GB
}

system_account: SYS
accounts: {
    SYS: {
        users: [{user: sys, password: pass}]
    },
    UAT: {
        jetstream: enabled
        users: [
            {user: uat, password: uat}
        ]
    }
}

Leaf Node: 1 Server


leafnodes {
    no_advertise: true,
    remotes : [
    {
      account: "SYS",
      urls: [ "nats-leaf://sys:pass@xxx.xxx.xxx.xxx:7114", "nats-leaf://sys:pass@xxx.xxx.xxx.xxx:7115", "nats-leaf://sys:pass@xxx.xxx.xxx.xxx:7116" ]
    },
    {
      account: "UAT",
      urls: [ "nats-leaf://uat:uat@xxx.xxx.xxx.xxx:7114", "nats-leaf://uat:uat@xxx.xxx.xxx.xxx:7115", "nats-leaf://uat:uat@xxx.xxx.xxx.xxx:7116" ]
    },  
  ]  

}

jetstream {
  store_dir: /data
  domain: hub
  max_memory_store: 64GB
  max_file_store: 300GB
  extension_hint : will_extend
}
system_account: SYS
accounts: {
    SYS: {
        users: [{user: sys, password: pass}]
    },
    UAT: {
        jetstream: enabled
        users: [
            {user: leaf-uat, password: leaf-uat}
        ]
    }
}

Leaf node start-up

[1] 2023/11/27 08:45:30.327512 [INF] Starting JetStream cluster
[1] 2023/11/27 08:45:30.327521 [INF] Creating JetStream metadata controller
[1] 2023/11/27 08:45:30.327684 [INF] JetStream cluster recovering state
[1] 2023/11/27 08:45:30.327691 [INF] Turning JetStream metadata controller Observer Mode on - no previous contact
[1] 2023/11/27 08:45:30.327693 [INF] In cases where JetStream will not be extended
[1] 2023/11/27 08:45:30.327695 [INF] and waiting for leader election until first contact is not acceptable,
[1] 2023/11/27 08:45:30.327697 [INF] manually disable Observer Mode by setting the JetStream Option "extension_hint: no_extend"
[1] 2023/11/27 08:45:30.328020 [INF] Listening for client connections on 0.0.0.0:4222
[1] 2023/11/27 08:45:30.328161 [INF] Server is ready
[1] 2023/11/27 08:45:30.328660 [WRN] JetStream has not established contact with a meta leader
[1] 2023/11/27 08:45:30.342793 [INF] xxx.xxx.xxx.xxx:7116 - lid:10 - Leafnode connection created for account: UAT 
[1] 2023/11/27 08:45:30.344041 [INF] xxx.xxx.xxx.xxx:7116 - lid:13 - Leafnode connection created for account: SYS 
[1] 2023/11/27 08:45:30.391963 [INF] SYSTEM - Extending JetStream domain "hub" as System Account connected from server natsC/sco-nats
[1] 2023/11/27 08:45:30.429278 [WRN] Waiting for routing to be established...
[1] 2023/11/27 08:45:31.290820 [INF] JetStream cluster new metadata leader: natsC/sco-nats

Steps to reproduce

Create a KV store on the hub cluster with 3 replicas.

Issue get and put commands from the leaf node (a minimal command sketch follows).
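A minimal reproduction sketch, assuming a context named hub that points at the hub cluster (the hub context name is a placeholder; uat is the leaf context shown further down):

# on the hub cluster
nats --context hub kv add kv2 --replicas 3
nats --context hub kv put kv2 a 1

# from the leaf node
nats --context uat kv get kv2 a    # fails: context deadline exceeded
nats --context uat kv put kv2 a 1  # fails: no response from stream
nats --context uat kv watch kv2    # works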

ripienaar commented 7 months ago

Please show the nats output using --trace and details of the contexts that you use.

darkwatchuk commented 7 months ago

Trace run on the leaf node, connecting to the leaf node:

nats --trace --context uat kv get kv2 a

10:01:42 >>> $JS.API.STREAM.INFO.KV_kv2

10:01:42 <<< $JS.API.STREAM.INFO.KV_kv2: {"type":"io.nats.jetstream.api.v1.stream_info_response","total":0,"offset":0,"limit":0,"config":{"name":"KV_kv2","subjects":["$KV.kv2.\u003e"],"retention":"limits","max_consumers":-1,"max_msgs":-1,"max_bytes":-1,"max_age":0,"max_msgs_per_subject":1,"max_msg_size":-1,"discard":"new","storage":"file","num_replicas":3,"duplicate_window":120000000000,"placement":{},"compression":"none","allow_direct":true,"mirror_direct":false,"sealed":false,"deny_delete":true,"deny_purge":false,"allow_rollup_hdrs":true,"consumer_limits":{}},"created":"2023-11-27T08:29:27.773696407Z","state":{"messages":3,"bytes":120,"first_seq":1,"first_ts":"2023-11-27T08:30:32.134360947Z","last_seq":3,"last_ts":"2023-11-27T08:30:56.686393073Z","num_subjects":3,"consumer_count":0},"domain":"hub","cluster":{"name":"sco-nats","raft_group":"S-R3F-k15oUyt9","leader":"natsC","replicas":[{"name":"natsA","current":true,"active":122812461,"peer":"JU87ZsSL"},{"name":"natsB","current":true,"active":122750514,"peer":"wURfHZ9N"}]},"ts":"2023-11-27T10:01:42.618965666Z"}
10:01:42 >>> $JS.API.DIRECT.GET.KV_kv2.$KV.kv2.a

nats: error: context deadline exceeded

nats trace to the cluster (working):


D:\natsclient>nats --trace  --context uat kv get kv2 a
10:06:20 >>> $JS.API.STREAM.INFO.KV_kv2

10:06:20 <<< $JS.API.STREAM.INFO.KV_kv2: {"type":"io.nats.jetstream.api.v1.stream_info_response","total":0,"offset":0,"limit":0,"config":{"name":"KV_kv2","subjects":["$KV.kv2.\u003e"],"retention":"limits","max_consumers":-1,"max_msgs":-1,"max_bytes":-1,"max_age":0,"max_msgs_per_subject":1,"max_msg_size":-1,"discard":"new","storage":"file","num_replicas":3,"duplicate_window":120000000000,"placement":{},"compression":"none","allow_direct":true,"mirror_direct":false,"sealed":false,"deny_delete":true,"deny_purge":false,"allow_rollup_hdrs":true,"consumer_limits":{}},"created":"2023-11-27T08:29:27.773696407Z","state":{"messages":3,"bytes":120,"first_seq":1,"first_ts":"2023-11-27T08:30:32.134360947Z","last_seq":3,"last_ts":"2023-11-27T08:30:56.686393073Z","num_subjects":3,"consumer_count":0},"domain":"hub","cluster":{"name":"sco-nats","raft_group":"S-R3F-k15oUyt9","leader":"natsC","replicas":[{"name":"natsA","current":true,"active":785980351,"peer":"JU87ZsSL"},{"name":"natsB","current":true,"active":785928914,"peer":"wURfHZ9N"}]},"ts":"2023-11-27T10:06:21.282597453Z"}
10:06:20 >>> $JS.API.DIRECT.GET.KV_kv2.$KV.kv2.a

10:06:20 <<< $JS.API.DIRECT.GET.KV_kv2.$KV.kv2.a: 1
kv2 > a created @ 27 Nov 23 08:30 UTC

1

Context at the leaf:


{
  "description": "",
  "url": "nats://127.0.0.1:4223",
  "socks_proxy": "",
  "token": "",
  "user": "leaf-uat",
  "password": "leaf-uat",
  "creds": "",
  "nkey": "",
  "cert": "",
  "key": "",
  "ca": "",
  "nsc": "",
  "jetstream_domain": "hub",
  "jetstream_api_prefix": "",
  "jetstream_event_prefix": "",
  "inbox_prefix": "",
  "user_jwt": "",
  "color_scheme": ""
}

Context used on the cluster:


{
    "description": "",
    "url": "nats://10.0.17.114:4222,nats://10.0.17.115:4222,nats://10.0.17.116:4222",
    "token": "",
    "user": "uat",
    "password": "uat",
    "creds": "",
    "nkey": "",
    "cert": "",
    "key": "",
    "ca": "",
    "nsc": "",
    "jetstream_domain": "hub",
    "jetstream_api_prefix": "",
    "jetstream_event_prefix": "",
    "inbox_prefix": "",
    "user_jwt": ""
}

The cluster is behind a firewall (the IPs above are local), but the ports are open and mapped and appear to work fine. The leaf node config uses the public IP addresses and connects OK. Pub/sub works in both directions: from the cluster to the leaf node and from the leaf node to the cluster.

darkwatchuk commented 7 months ago

It seems that something is off somewhere; maybe it's just my understanding?

So the configs listed above use the same domain and allow me to watch for changes in values; I just can't read or write values directly.

However...

If I change the leaf server to have the domain 'leaf' and a no_extend extension_hint, and change the leaf client's context domain to 'leaf', I can then create a "mirrored KV" as follows:

nats --context uat-leaf kv add kvm2 --mirror=kv-r3 --mirror-domain=hub

This effectively creates a mirrored KV store ("kvm2") on the leaf node, mirroring the central KV store ("kv-r3") on the hub (a sketch of the resulting leaf setup follows below). I can then issue gets and puts against the leaf's mirror, and watch it too. Put updates are now bi-directional: I can put on either the hub or the leaf and the data moves both ways, which is what I expected. If I take the leaf mirror down, change the hub KV, and restart the mirror, all updates made while the leaf was offline come through as expected.
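For reference, a sketch of the leaf-side changes described above. The values are taken from the earlier leaf config; only the domain, the extension hint, and the context's jetstream_domain change:

jetstream {
  store_dir: /data
  domain: leaf
  max_memory_store: 64GB
  max_file_store: 300GB
  extension_hint: no_extend
}

and in the uat-leaf context:

  "jetstream_domain": "leaf"

With the mirror in place, reads, writes, and watches all go against the leaf bucket directly:

nats --context uat-leaf kv put kvm2 a 1
nats --context uat-leaf kv get kvm2 a
nats --context uat-leaf kv watch kvm2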

Is this how it is all supposed to work? Is there anything around this that I might have missed?

Maybe further (simpler?!) documentation around this area would be worth writing, as this has taken quite a while to figure out. That said, the power and flexibility this gives people is fantastic.