nats-io / nats-server

High-Performance server for NATS.io, the cloud and edge native messaging system.
https://nats.io
Apache License 2.0
15.7k stars 1.4k forks source link

JetStream Cluster using embedded NATS server #4794

Open adoublef opened 10 months ago

adoublef commented 10 months ago

Observed behavior

I am failing to get a JetStream Cluster running with an embedded nats-server. The error I get when creating a KV Store is context deadline exceeded thus I am unsure what is the exact cause without going into the nats-server source code.

According to the offical docs here it should be possible to run NATS inside a Go binary.

Here it mentions how to setup a JetStream cluster, requiring at least 3 nodes to be connected due to the use of the RAFT algorithm.

I haven't seen any indication that this is not possible nor have I seen anything in the Slack channels or on issues of similar experiments.

Expected behavior

To run the same as the client/server version. I am using identical configurations thus I am unsure if there is anything extra I would need to look out for.

Ideally I would want to use the server.Option struct directly, but understanding how the config file maps to the struct proved to be a bit more challenging so for now will be happy to get the config file working.

Server and client version

Embedded into a Go binary (Go v1.21.3)

Host environment

WSL2 Debian 12 (via Windows 11 Home) Docker Engine: 24.0.6

Steps to reproduce

I have attached a link to the issue.

To reproduce, will just need to run the steps listed in the README.md

neilalexander commented 10 months ago

Certainly a NATS Server node should work the same whether it is standalone or embedded, although configuring via nats.Options is certainly more difficult.

This snippet works to start a cluster node that joins an existing cluster from the provided configuration file:

package main

import (
    "fmt"
    "time"

    natsserver "github.com/nats-io/nats-server/v2/server"
    natsclient "github.com/nats-io/nats.go"
)

func main() {
    var opts natsserver.Options
    if err := opts.ProcessConfigFile("/Users/neil/Workspace/nats-cluster/n1-c1.conf"); err != nil {
        panic(err)
    }

    server, err := natsserver.NewServer(&opts)
    if err != nil {
        panic(err)
    }

    server.ConfigureLogger()
    server.Start()
    if !server.ReadyForConnections(time.Second * 10) {
        panic("NATS server didn't start")
    }

    client, err := natsclient.Connect("", natsclient.InProcessServer(server))
    if err != nil {
        panic(err)
    }

    js, err := client.JetStream()
    if err != nil {
        panic(err)
    }

    kv, err := js.CreateKeyValue(&natsclient.KeyValueConfig{
        Bucket: "test",
    })
    if err != nil {
        panic(err)
    }

    for k, v := range map[string]string{
        "foo": "bar",
        "baz": "qux",
    } {
        if _, err := kv.PutString(k, v); err != nil {
            panic(err)
        }
    }

    keys, err := kv.Keys()
    if err != nil {
        panic(err)
    }

    fmt.Println("Found keys:", keys)

    server.WaitForShutdown()
}

Config file:

server_name=n1-c1
listen=4221
prof_port=6221
monitor_port=7221

jetstream {
   store_dir="."
}

accounts: {
  $SYS: {
    users: [{user: sys, password: sys}]
  }
}

cluster {
  name: C1
  listen: 127.0.0.1:5221
  routes: [
    nats-route://127.0.0.1:5222
    nats-route://127.0.0.1:5223
    nats-route://127.0.0.1:5224
    nats-route://127.0.0.1:5225
    nats-route://127.0.0.1:5226
    nats-route://127.0.0.1:5227
    nats-route://127.0.0.1:5228
    nats-route://127.0.0.1:5229
  ]
}
adoublef commented 10 months ago

Interesting, I will adapt to your main file and try run this inside a container. I imagine the issue with my KV helper is that I was doing the following:

func upsertKv(jsc nats.JetStreamContext, c *nats.KeyValueConfig) (nats.KeyValue, error) {
    if c == nil || c.Bucket == "" {
        return nil, errors.New("invalid config")
    }
    kv, err := jsc.KeyValue(c.Bucket)
    switch {
    case errors.Is(err, nats.ErrBucketNotFound):
        kv, err = jsc.CreateKeyValue(c)
        if err != nil {
            return nil, fmt.Errorf("create key value: %w", err)
        }
    case err != nil:
        return nil, fmt.Errorf("key value: %w", err)
    }
    return kv, nil
}

and this would resulting in the context deadline error. with the CreateKeyValue function I presume it will just return a kv if it already exists?

adoublef commented 10 months ago

Small bump, could the error potentially me my hardware? I tried using your example 1:1 but seems I still hit a context deadline exceed error.

An example log:

adoublef ➜ ~/personal/nats $ go run ./cmd/a
[38087] [INF] Starting nats-server
[38087] [INF]   Version:  2.10.6
[38087] [INF]   Git:      [not set]
[38087] [INF]   Cluster:  C1
[38087] [INF]   Name:     n1-c1
[38087] [INF]   Node:     oP5LzZ64
[38087] [INF]   ID:       NDD752ORWFSGA7VNB4HF4MQSQAO4Y2W7NYPDBBMUOWQ3SRVLR7G72QGO
[38087] [WRN] Plaintext passwords detected, use nkeys or bcrypt
[38087] [INF] profiling port: 6221
[38087] [INF] Using configuration file: ./etc/n1-c1.conf
[38087] [INF] Starting http monitor on 0.0.0.0:7221
[38087] [INF] Starting JetStream
[38087] [INF]     _ ___ _____ ___ _____ ___ ___   _   __  __
[38087] [INF]  _ | | __|_   _/ __|_   _| _ \ __| /_\ |  \/  |
[38087] [INF] | || | _|  | | \__ \ | | |   / _| / _ \| |\/| |
[38087] [INF]  \__/|___| |_| |___/ |_| |_|_\___/_/ \_\_|  |_|
[38087] [INF] 
[38087] [INF]          https://docs.nats.io/jetstream
[38087] [INF] 
[38087] [INF] ---------------- JETSTREAM ----------------
[38087] [INF]   Max Memory:      2.62 GB
[38087] [INF]   Max Storage:     702.99 GB
[38087] [INF]   Store Directory: "jetstream"
[38087] [INF] -------------------------------------------
[38087] [INF] Starting JetStream cluster
[38087] [INF] Creating JetStream metadata controller
[38087] [INF] JetStream cluster recovering state
[38087] [INF] Listening for client connections on 0.0.0.0:4221
[38087] [INF] Server is ready
[38087] [INF] Cluster name is C1
[38087] [INF] Listening for route connections on 0.0.0.0:5221
[38087] [WRN] JetStream has not established contact with a meta leader
[38087] [INF] 127.0.0.1:5222 - rid:7 - Route connection created
[38087] [INF] 127.0.0.1:5222 - rid:9 - Route connection created
[38087] [INF] 127.0.0.1:5223 - rid:10 - Route connection created
[38087] [INF] 127.0.0.1:5223 - rid:8 - Route connection created
[38087] [INF] 127.0.0.1:5223 - rid:11 - Route connection created
[38087] [INF] 127.0.0.1:5222 - rid:13 - Route connection created
[38087] [INF] 127.0.0.1:5223 - rid:14 - Route connection created
[38087] [WRN] Waiting for routing to be established...
[38087] [INF] 127.0.0.1:5222 - rid:15 - Route connection created
[38087] [INF] 127.0.0.1:39646 - rid:16 - Route connection created
[38087] [INF] 127.0.0.1:39638 - rid:17 - Route connection created
[38087] [INF] 127.0.0.1:39638 - rid:17 - Router connection closed: Duplicate Route
[38087] [INF] 127.0.0.1:39646 - rid:16 - Router connection closed: Client Closed
[38087] [INF] 127.0.0.1:39658 - rid:18 - Route connection created
[38087] [INF] 127.0.0.1:39648 - rid:19 - Route connection created
[38087] [INF] 127.0.0.1:39648 - rid:19 - Router connection closed: Client Closed
[38087] [INF] 127.0.0.1:39658 - rid:18 - Router connection closed: Duplicate Route
[38087] [INF] 127.0.0.1:5222 - rid:7 - Router connection closed: Client Closed
[38087] [INF] 127.0.0.1:5222 - rid:9 - Router connection closed: Client Closed
[38087] [INF] 127.0.0.1:5222 - rid:13 - Router connection closed: Client Closed
[38087] [INF] 127.0.0.1:5222 - rid:15 - Router connection closed: Client Closed
[38087] [ERR] Error trying to connect to route (attempt 1): dial tcp 127.0.0.1:5222: connect: connection refused
[38087] [ERR] Error trying to connect to route (attempt 1): dial tcp 127.0.0.1:5222: connect: connection refused
[38087] [ERR] Error trying to connect to route (attempt 1): dial tcp 127.0.0.1:5222: connect: connection refused
[38087] [ERR] Error trying to connect to route (attempt 1): dial tcp 127.0.0.1:5222: connect: connection refused
[38087] [INF] 127.0.0.1:5223 - rid:8 - Router connection closed: Client Closed
[38087] [INF] 127.0.0.1:5223 - rid:10 - Router connection closed: Client Closed
[38087] [INF] 127.0.0.1:5223 - rid:11 - Router connection closed: Client Closed
[38087] [INF] 127.0.0.1:5223 - rid:14 - Router connection closed: Client Closed
[38087] [ERR] Error trying to connect to route (attempt 2): dial tcp 127.0.0.1:5222: connect: connection refused
[38087] [ERR] Error trying to connect to route (attempt 2): dial tcp 127.0.0.1:5222: connect: connection refused
[38087] [ERR] Error trying to connect to route (attempt 2): dial tcp 127.0.0.1:5222: connect: connection refused
[38087] [ERR] Error trying to connect to route (attempt 2): dial tcp 127.0.0.1:5222: connect: connection refused
[38087] [ERR] Error trying to connect to route (attempt 1): dial tcp 127.0.0.1:5223: connect: connection refused
[38087] [ERR] Error trying to connect to route (attempt 1): dial tcp 127.0.0.1:5223: connect: connection refused
[38087] [ERR] Error trying to connect to route (attempt 1): dial tcp 127.0.0.1:5223: connect: connection refused
[38087] [ERR] Error trying to connect to route (attempt 1): dial tcp 127.0.0.1:5223: connect: connection refused
[38087] [ERR] Error trying to connect to route (attempt 3): dial tcp 127.0.0.1:5222: connect: connection refused
[38087] [ERR] Error trying to connect to route (attempt 3): dial tcp 127.0.0.1:5222: connect: connection refused
[38087] [ERR] Error trying to connect to route (attempt 3): dial tcp 127.0.0.1:5222: connect: connection refused
[38087] [ERR] Error trying to connect to route (attempt 3): dial tcp 127.0.0.1:5222: connect: connection refused
[38087] [ERR] Error trying to connect to route (attempt 2): dial tcp 127.0.0.1:5223: connect: connection refused
[38087] [ERR] Error trying to connect to route (attempt 2): dial tcp 127.0.0.1:5223: connect: connection refused
[38087] [ERR] Error trying to connect to route (attempt 2): dial tcp 127.0.0.1:5223: connect: connection refused
[38087] [ERR] Error trying to connect to route (attempt 2): dial tcp 127.0.0.1:5223: connect: connection refused
panic: context deadline exceeded

goroutine 1 [running]:
main.main()
        /home/adoublef/personal/nats/cmd/a/main.go:42 +0x3d4
exit status 2

When you say joins an existing cluster would this cluster be created with stand-alone nats-servers or all with embedded ones? As one of the logs mention being JetStream has not established a meta leader

adoublef commented 10 months ago

I looked closer at the repos main.go and seen that there was a call to a function called server.Run which wraps the server.Start function if Windows is detected, allows it to add a hook. would this by chance help?

adoublef commented 8 months ago

I just came across this project which seems to be using an embedded nats server for data replication. They are also setting a cluster, so will learn if there is anything i can gain from their codebase as what I wanted to achieve seems doable(?)

ripienaar commented 8 months ago

There are many examples on our tests for the server.

adoublef commented 8 months ago

There are many examples on our tests for the server.

I will have to look through those, as from my own example I was having issues. It does seem that this will only work with a config file currently.