nats-io / nats-streaming-server

NATS Streaming System Server
https://nats.io
Apache License 2.0

Very low performance in FILE store #968

Open powerman opened 4 years ago

powerman commented 4 years ago

Compared to https://github.com/nats-io/nats-streaming-server/issues/899 I see much worse FILE store performance, and I wonder: is this normal, or is there some regression?

Here is my docker-compose.yml:

version: "3.7"
services:
    nats:
        image: "nats:2.1.0"
        restart: always
        ports:
            - "14222:4222"

    stan:
        image: "nats-streaming:0.16.2"
        restart: always
        volumes:
            - "./stan:/data"
        command:
            - "--cluster_id=devenv"
            - "--nats_server=nats://nats:4222"
            - "--store=FILE"
            - "--dir=/data"

Results (the 2nd run is more representative because the subscriber isn't blocked by the publisher this time, so we see the real publishing rate, which is less than 100 msgs/sec):

$ stan-bench -s nats://127.0.0.1:14222 -c devenv -id bench -n 1000 -ms 1024 -np 1 -ns 1 bench                                                                        
Starting benchmark [msgs=1000, msgsize=1024, pubs=1, subs=1]
NATS Streaming Pub/Sub stats: 106 msgs/sec ~ 106.32 KB/sec
 Pub stats: 53 msgs/sec ~ 53.16 KB/sec
 Sub stats: 53 msgs/sec ~ 53.21 KB/sec
$ stan-bench -s nats://127.0.0.1:14222 -c devenv -id bench -n 1000 -ms 1024 -np 1 -ns 1 bench
Starting benchmark [msgs=1000, msgsize=1024, pubs=1, subs=1]
NATS Streaming Pub/Sub stats: 193 msgs/sec ~ 193.65 KB/sec
 Pub stats: 96 msgs/sec ~ 96.78 KB/sec
 Sub stats: 62,972 msgs/sec ~ 61.50 MB/sec

If we add the extra options --file_sync=false --file_auto_sync=1ms, publishing speed increases a lot (up to ~5,000 msgs/sec; see the second run), but it's still much lower than in the issue referenced above (65,580 msgs/sec):

$ stan-bench -s nats://127.0.0.1:14222 -c devenv -id bench -n 10000 -ms 1024 -np 1 -ns 1 bench2                                                                      
Starting benchmark [msgs=10000, msgsize=1024, pubs=1, subs=1]
NATS Streaming Pub/Sub stats: 4,487 msgs/sec ~ 4.38 MB/sec
 Pub stats: 2,243 msgs/sec ~ 2.19 MB/sec
 Sub stats: 2,244 msgs/sec ~ 2.19 MB/sec
$ stan-bench -s nats://127.0.0.1:14222 -c devenv -id bench -n 10000 -ms 1024 -np 1 -ns 1 bench2
Starting benchmark [msgs=10000, msgsize=1024, pubs=1, subs=1]
NATS Streaming Pub/Sub stats: 10,191 msgs/sec ~ 9.95 MB/sec
 Pub stats: 5,095 msgs/sec ~ 4.98 MB/sec
 Sub stats: 27,659 msgs/sec ~ 27.01 MB/sec

I believe my hardware is no worse than the Mac used in the referenced tests (actually, I suppose it's better): I have an i7-2600K overclocked to 4.5 GHz, 24 GB RAM, and an SSD.

powerman commented 4 years ago

The SQL store (Postgres) benchmarks show 1.5-4x higher numbers than the FILE store (but still much lower than in the referenced issue).

Changes in docker-compose.yml:

    stan-postgres:
        image: "postgres:11.5"
        restart: always
        volumes:
            - {type: "tmpfs", target: "/dev/shm", tmpfs: {size: 268435456}}
            - "stan-postgres:/var/lib/postgresql/data"
        environment:
            POSTGRES_USER:                      user
            POSTGRES_PASSWORD:                  pass
            POSTGRES_DB:                        stan

    stan:
        image: "nats-streaming:0.16.2"
        restart: always
        command:
            - "--cluster_id=devenv"
            - "--nats_server=nats://nats:4222"
            - "--store=SQL"
            - "--sql_driver=postgres"
            - "--sql_source=postgres://user:pass@stan-postgres/stan?sslmode=disable"
            - "--sql_no_caching=true"
            - "--sql_max_open_conns=95" # of 100 (postgres default AFAIR)

Results:

$ stan-bench -s nats://127.0.0.1:14222 -c devenv -id bench -n 1000 -ms 1024 -np 1 -ns 1 bench                                                                         
Starting benchmark [msgs=1000, msgsize=1024, pubs=1, subs=1]
NATS Streaming Pub/Sub stats: 293 msgs/sec ~ 293.24 KB/sec
 Pub stats: 146 msgs/sec ~ 146.62 KB/sec
 Sub stats: 146 msgs/sec ~ 146.69 KB/sec
$ stan-bench -s nats://127.0.0.1:14222 -c devenv -id bench -n 1000 -ms 1024 -np 1 -ns 1 bench
Starting benchmark [msgs=1000, msgsize=1024, pubs=1, subs=1]
NATS Streaming Pub/Sub stats: 329 msgs/sec ~ 329.32 KB/sec
 Pub stats: 164 msgs/sec ~ 164.77 KB/sec
 Sub stats: 556 msgs/sec ~ 556.82 KB/sec

and now with --sql_no_caching=false:

$ stan-bench -s nats://127.0.0.1:14222 -c devenv -id bench -n 1000 -ms 1024 -np 1 -ns 1 bench2
Starting benchmark [msgs=1000, msgsize=1024, pubs=1, subs=1]
NATS Streaming Pub/Sub stats: 378 msgs/sec ~ 378.72 KB/sec
 Pub stats: 189 msgs/sec ~ 189.36 KB/sec
 Sub stats: 189 msgs/sec ~ 189.52 KB/sec
$ stan-bench -s nats://127.0.0.1:14222 -c devenv -id bench -n 1000 -ms 1024 -np 1 -ns 1 bench2
Starting benchmark [msgs=1000, msgsize=1024, pubs=1, subs=1]
NATS Streaming Pub/Sub stats: 814 msgs/sec ~ 814.26 KB/sec
 Pub stats: 407 msgs/sec ~ 407.46 KB/sec
 Sub stats: 9,977 msgs/sec ~ 9.74 MB/sec
kozlovic commented 4 years ago

Something is wrong here. I get upwards of 50,000 msgs/sec pub/sub using a mounted volume and a separate NATS server, similar to your setup.

Note that when you restart the bench test, the subscriber by default requests all available messages, which means the server is delivering results from the previous run, which should actually make it slower. There is an -io flag to ignore old messages and basically start with newly published messages.

Can you try with the memory store (do not provide -store/-dir)? Also, you can try running locally to bypass Docker and see if you still get poor results (with the same bench command, I am getting 189,000 msgs/sec with the file store on my iMac).
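
For reference, the -io flag maps roughly to the subscription's start position in stan.go; here is a minimal sketch (cluster/client IDs and subject mirror your bench setup):

package main

import (
	"log"

	stan "github.com/nats-io/stan.go"
)

func main() {
	sc, err := stan.Connect("devenv", "example-sub", stan.NatsURL("nats://127.0.0.1:14222"))
	if err != nil {
		log.Fatal(err)
	}
	defer sc.Close()

	handler := func(m *stan.Msg) { log.Printf("received seq=%d", m.Sequence) }

	// Without -io: replay every message already stored in the channel, so a
	// second bench run also re-delivers the previous run's messages.
	if _, err := sc.Subscribe("bench", handler, stan.DeliverAllAvailable()); err != nil {
		log.Fatal(err)
	}

	// With -io: only messages published after the subscription is created are
	// delivered (new-only is the stan.go default start position).
	if _, err := sc.Subscribe("bench", handler); err != nil {
		log.Fatal(err)
	}

	select {} // keep the process alive to receive messages
}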

powerman commented 4 years ago

MEMORY:

$ stan-bench -s nats://127.0.0.1:14222 -c devenv -id bench -n 10000 -ms 1024 -np 1 -ns 1 -io bench
Starting benchmark [msgs=10000, msgsize=1024, pubs=1, subs=1]
NATS Streaming Pub/Sub stats: 12,566 msgs/sec ~ 12.27 MB/sec
 Pub stats: 6,283 msgs/sec ~ 6.14 MB/sec
 Sub stats: 6,284 msgs/sec ~ 6.14 MB/sec
$ stan-bench -s nats://127.0.0.1:14222 -c devenv -id bench -n 10000 -ms 1024 -np 1 -ns 1 bench    
Starting benchmark [msgs=10000, msgsize=1024, pubs=1, subs=1]
NATS Streaming Pub/Sub stats: 13,737 msgs/sec ~ 13.42 MB/sec
 Pub stats: 6,887 msgs/sec ~ 6.73 MB/sec
 Sub stats: 407,749 msgs/sec ~ 398.19 MB/sec

At the same time, the core NATS benchmark (using the same NATS server from docker-compose) shows millions of msgs/sec, so it's not an issue with Docker or the hardware:

$ nats-bench -s nats://127.0.0.1:14222 -n 10000000 -np 1 -ns 1 bench                                                                               
Starting benchmark [msgs=10000000, msgsize=128, pubs=1, subs=1]
NATS Pub/Sub stats: 4,709,489 msgs/sec ~ 574.89 MB/sec
 Pub stats: 2,355,060 msgs/sec ~ 287.48 MB/sec
 Sub stats: 2,354,839 msgs/sec ~ 287.46 MB/sec

Running it locally to bypass Docker doesn't sound like a good idea to me, because in production we use Docker Swarm and are going to run NATS and NATS Streaming inside Docker anyway.

kozlovic commented 4 years ago

At the same time, the core NATS benchmark (using the same NATS server from docker-compose) shows millions of msgs/sec, so it's not an issue with Docker or the hardware:

You used the default size, which is 128. Use 1024; it seems to make a big difference.

Running it locally to bypass Docker doesn't sound like a good idea to me, because in production we use Docker Swarm and are going to run NATS and NATS Streaming inside Docker anyway.

I understand, but if local is good, then what can I do?

kozlovic commented 4 years ago

Also, -io should be used always, or at least on the second run. Again, I don't understand how you get the numbers you get for the sub on the second run. That being said, the number of messages is so low that the results may be meaningless; still, the second run should require the server to read from disk, while in the first run the data is likely in the file store's memory cache (which lasts for 1 sec).

powerman commented 4 years ago

Not such a big difference:

$ nats-bench -s nats://127.0.0.1:14222 -n 10000000 -ms 1024 -np 1 -ns 1 bench                                                                                   
Starting benchmark [msgs=10000000, msgsize=1024, pubs=1, subs=1]
NATS Pub/Sub stats: 1,460,716 msgs/sec ~ 1.39 GB/sec
 Pub stats: 730,360 msgs/sec ~ 713.24 MB/sec
 Sub stats: 730,368 msgs/sec ~ 713.25 MB/sec
$ nats-bench -s nats://127.0.0.1:14222 -n 10000000 -ms 1024 -np 1 -ns 0 bench
Starting benchmark [msgs=10000000, msgsize=1024, pubs=1, subs=0]
Pub stats: 2,042,747 msgs/sec ~ 1.95 GB/sec

Okay, I'll try to run local.

kozlovic commented 4 years ago

By the way, for stan-bench, are you running from the master branch? It used to be that the default for sending messages was synchronous, and you had to specify -a to make it asynchronous. On master, the default is now async, and you have to specify -sync to make the publish synchronous. Please check by running with -a to see if it complains or not (if it does, it means you are running a version that accepts -sync instead of -a).

kozlovic commented 4 years ago

For info, on my iMac with Docker, for the NATS bench with a message size of 1K, I get only 88,000 msgs/sec while you are getting 730,000!

iMacSynadia:nats.go ivan$ go run examples/nats-bench/main.go -np 1 -ns 1 -ms 1024 -n 100000 bench
Starting benchmark [msgs=100000, msgsize=1024, pubs=1, subs=1]
NATS Pub/Sub stats: 176,127 msgs/sec ~ 172.00 MB/sec
 Pub stats: 88,386 msgs/sec ~ 86.32 MB/sec
 Sub stats: 88,131 msgs/sec ~ 86.07 MB/sec

And with 128, perf increases, but as you can see it is still less than yours at -ms 1024:

iMacSynadia:nats.go ivan$ go run examples/nats-bench/main.go -np 1 -ns 1 -ms 128 -n 100000 bench
Starting benchmark [msgs=100000, msgsize=128, pubs=1, subs=1]
NATS Pub/Sub stats: 1,117,574 msgs/sec ~ 136.42 MB/sec
 Pub stats: 569,279 msgs/sec ~ 69.49 MB/sec
 Sub stats: 560,872 msgs/sec ~ 68.47 MB/sec

Yet, my stan-bench run is much better.

powerman commented 4 years ago

I'm using the latest tagged release (I did go get github.com/nats-io/stan.go/examples/stan-bench a few hours ago).

With -a the numbers are much higher. MEMORY:

$ stan-bench -s nats://127.0.0.1:14222 -c devenv -id bench -n 1000000 -ms 1024 -np 1 -ns 1 -io -a bench
Starting benchmark [msgs=1000000, msgsize=1024, pubs=1, subs=1]
NATS Streaming Pub/Sub stats: 422,123 msgs/sec ~ 412.23 MB/sec
 Pub stats: 211,061 msgs/sec ~ 206.12 MB/sec
 Sub stats: 211,094 msgs/sec ~ 206.15 MB/sec

SQL with --sql_no_caching=false:

$ stan-bench -s nats://127.0.0.1:14222 -c devenv -id bench -n 100000 -ms 1024 -np 1 -ns 1 -io -a bench                                                               
Starting benchmark [msgs=100000, msgsize=1024, pubs=1, subs=1]
NATS Streaming Pub/Sub stats: 7,565 msgs/sec ~ 7.39 MB/sec
 Pub stats: 3,782 msgs/sec ~ 3.69 MB/sec
 Sub stats: 3,783 msgs/sec ~ 3.69 MB/sec

SQL with --sql_no_caching=true:

$ stan-bench -s nats://127.0.0.1:14222 -c devenv -id bench -n 2000 -ms 1024 -np 1 -ns 1 -io -a bench                                                               
Starting benchmark [msgs=2000, msgsize=1024, pubs=1, subs=1]
NATS Streaming Pub/Sub stats: 420 msgs/sec ~ 420.96 KB/sec
 Pub stats: 210 msgs/sec ~ 210.48 KB/sec
 Sub stats: 210 msgs/sec ~ 210.62 KB/sec

FILE with --file_sync=false:

$ stan-bench -s nats://127.0.0.1:14222 -c devenv -id bench -n 100000 -ms 1024 -np 1 -ns 1 -io -a bench
Starting benchmark [msgs=100000, msgsize=1024, pubs=1, subs=1]
NATS Streaming Pub/Sub stats: 63,656 msgs/sec ~ 62.16 MB/sec
 Pub stats: 31,828 msgs/sec ~ 31.08 MB/sec
 Sub stats: 31,835 msgs/sec ~ 31.09 MB/sec

FILE with --file_sync=true:

$ stan-bench -s nats://127.0.0.1:14222 -c devenv -id bench -n 100000 -ms 1024 -np 1 -ns 1 -io -a bench                                                               
Starting benchmark [msgs=100000, msgsize=1024, pubs=1, subs=1]
NATS Streaming Pub/Sub stats: 27,616 msgs/sec ~ 26.97 MB/sec
 Pub stats: 13,808 msgs/sec ~ 13.48 MB/sec
 Sub stats: 13,849 msgs/sec ~ 13.53 MB/sec

But… if I understand correctly what -a does (it doesn't wait for the Ack after publishing before sending the next message), then it's not the way we're planning to use it, because we want to ensure message order (at least for some channels/subjects) and thus have to wait for the Ack.

powerman commented 4 years ago

Running locally (I've started only NATS Streaming locally; it's configured to use NATS in Docker) shows similar numbers. FILE with --file_sync=true:

~/download/nats-streaming-server-v0.16.2-linux-amd64 $ ./nats-streaming-server --cluster_id=devenv --nats_server=nats://localhost:14222 --store=FILE --dir=./stan --file_auto_sync=1ms --file_sync=true
[24842] 2019/11/14 01:48:42.568430 [INF] STREAM: Starting nats-streaming-server[devenv] version 0.16.2
[24842] 2019/11/14 01:48:42.568458 [INF] STREAM: ServerID: 3mGZFbeVOORvi9xmB9QmwV
[24842] 2019/11/14 01:48:42.568464 [INF] STREAM: Go version: go1.11.13
[24842] 2019/11/14 01:48:42.568468 [INF] STREAM: Git commit: [910d6e1]
[24842] 2019/11/14 01:48:42.570974 [INF] STREAM: Recovering the state...
[24842] 2019/11/14 01:48:42.617891 [INF] STREAM: Recovered 1 channel(s)
[24842] 2019/11/14 01:48:42.868358 [INF] STREAM: Message store is FILE
[24842] 2019/11/14 01:48:42.868370 [INF] STREAM: Store location: ./stan
[24842] 2019/11/14 01:48:42.868390 [INF] STREAM: ---------- Store Limits ----------
[24842] 2019/11/14 01:48:42.868393 [INF] STREAM: Channels:                  100 *
[24842] 2019/11/14 01:48:42.868395 [INF] STREAM: --------- Channels Limits --------
[24842] 2019/11/14 01:48:42.868396 [INF] STREAM:   Subscriptions:          1000 *
[24842] 2019/11/14 01:48:42.868398 [INF] STREAM:   Messages     :       1000000 *
[24842] 2019/11/14 01:48:42.868400 [INF] STREAM:   Bytes        :     976.56 MB *
[24842] 2019/11/14 01:48:42.868401 [INF] STREAM:   Age          :     unlimited *
[24842] 2019/11/14 01:48:42.868403 [INF] STREAM:   Inactivity   :     unlimited *
[24842] 2019/11/14 01:48:42.868405 [INF] STREAM: ----------------------------------
[24842] 2019/11/14 01:48:42.868406 [INF] STREAM: Streaming Server is ready

Results (with and without -a):

$ stan-bench -s nats://127.0.0.1:14222 -c devenv -id bench -n 100000 -ms 1024 -np 1 -ns 1 -io -a bench                                                                
Starting benchmark [msgs=100000, msgsize=1024, pubs=1, subs=1]
NATS Streaming Pub/Sub stats: 47,296 msgs/sec ~ 46.19 MB/sec
 Pub stats: 23,648 msgs/sec ~ 23.09 MB/sec
 Sub stats: 23,684 msgs/sec ~ 23.13 MB/sec
$ stan-bench -s nats://127.0.0.1:14222 -c devenv -id bench -n 1000 -ms 1024 -np 1 -ns 1 -io bench   
Starting benchmark [msgs=1000, msgsize=1024, pubs=1, subs=1]
NATS Streaming Pub/Sub stats: 194 msgs/sec ~ 194.19 KB/sec
 Pub stats: 97 msgs/sec ~ 97.10 KB/sec
 Sub stats: 97 msgs/sec ~ 97.17 KB/sec

FILE with --file_sync=false:

~/download/nats-streaming-server-v0.16.2-linux-amd64 $ ./nats-streaming-server --cluster_id=devenv --nats_server=nats://localhost:14222 --store=FILE --dir=./stan --file_auto_sync=1ms --file_sync=false
[26848] 2019/11/14 01:53:46.533392 [INF] STREAM: Starting nats-streaming-server[devenv] version 0.16.2
[26848] 2019/11/14 01:53:46.533418 [INF] STREAM: ServerID: 3p7BkrGNIUDtia9IJ3n3tF
[26848] 2019/11/14 01:53:46.533421 [INF] STREAM: Go version: go1.11.13
[26848] 2019/11/14 01:53:46.533424 [INF] STREAM: Git commit: [910d6e1]
[26848] 2019/11/14 01:53:46.535438 [INF] STREAM: Recovering the state...
[26848] 2019/11/14 01:53:46.645970 [INF] STREAM: Recovered 1 channel(s)
[26848] 2019/11/14 01:53:46.896387 [INF] STREAM: Message store is FILE
[26848] 2019/11/14 01:53:46.896402 [INF] STREAM: Store location: ./stan
[26848] 2019/11/14 01:53:46.896429 [INF] STREAM: ---------- Store Limits ----------
[26848] 2019/11/14 01:53:46.896434 [INF] STREAM: Channels:                  100 *
[26848] 2019/11/14 01:53:46.896438 [INF] STREAM: --------- Channels Limits --------
[26848] 2019/11/14 01:53:46.896443 [INF] STREAM:   Subscriptions:          1000 *
[26848] 2019/11/14 01:53:46.896449 [INF] STREAM:   Messages     :       1000000 *
[26848] 2019/11/14 01:53:46.896453 [INF] STREAM:   Bytes        :     976.56 MB *
[26848] 2019/11/14 01:53:46.896458 [INF] STREAM:   Age          :     unlimited *
[26848] 2019/11/14 01:53:46.896461 [INF] STREAM:   Inactivity   :     unlimited *
[26848] 2019/11/14 01:53:46.896465 [INF] STREAM: ----------------------------------
[26848] 2019/11/14 01:53:46.896469 [INF] STREAM: Streaming Server is ready

Results:

$ stan-bench -s nats://127.0.0.1:14222 -c devenv -id bench -n 100000 -ms 1024 -np 1 -ns 1 -io -a bench
Starting benchmark [msgs=100000, msgsize=1024, pubs=1, subs=1]
NATS Streaming Pub/Sub stats: 105,691 msgs/sec ~ 103.21 MB/sec
 Pub stats: 104,331 msgs/sec ~ 101.89 MB/sec
 Sub stats: 52,865 msgs/sec ~ 51.63 MB/sec
$ stan-bench -s nats://127.0.0.1:14222 -c devenv -id bench -n 10000 -ms 1024 -np 1 -ns 1 -io bench                                                                  
Starting benchmark [msgs=10000, msgsize=1024, pubs=1, subs=1]
NATS Streaming Pub/Sub stats: 4,672 msgs/sec ~ 4.56 MB/sec
 Pub stats: 2,336 msgs/sec ~ 2.28 MB/sec
 Sub stats: 2,336 msgs/sec ~ 2.28 MB/sec
kozlovic commented 4 years ago

I'm using the latest tagged release (I did go get github.com/nats-io/stan.go/examples/stan-bench a few hours ago).

OK, something is wrong here again. If you really build from master, -a should be rejected:

stan-bench -s nats://127.0.0.1:14222 -c devenv -id bench -n 1000000 -ms 1024 -np 1 -ns 1 -io -a bench
flag provided but not defined: -a
Usage: stan-bench [-s server (nats://127.0.0.1:4222)] [-c CLUSTER_ID] [-id CLIENT_ID] [-qgroup QUEUE_GROUP_NAME] [-np NUM_PUBLISHERS] [-ns NUM_SUBSCRIBERS] [-n NUM_MSGS] [-ms MESSAGE_SIZE] [-csv csvfile] [-mpa MAX_NUMBER_OF_PUBLISHED_ACKS_INFLIGHT] [-io] [-sync] [--creds credentials_file] <subject>

if I understand correctly what -a does (it doesn't wait for the Ack after publishing before sending the next message), then it's not the way we're planning to use it, because we want to ensure message order (at least for some channels/subjects) and thus have to wait for the Ack.

A given connection's published messages will be ordered regardless of the sync/async mode. The difference is that in sync mode a publish is a request/reply, so indeed the next message is not sent before the pub ack is received. With async, the library may send messages 1, 2, 3, etc., and the server will process/store them and return acks 1, 2, 3, etc. But yes, if for some reason some messages are rejected, you would risk having a "gap" in your message flow: 1, 2, 3 accepted, 4 lost or rejected, 5, 6 accepted.

Sync publish is expected to be very slow (req/reply). You can get better results by having multiple connections doing the publishing, but from a single publisher you will certainly be limited. It would be the same with core NATS: try to do request/reply from a single producer and you'll likely be limited to ~10,000 msgs/sec or so.
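
To make the two modes concrete, here is a minimal stan.go sketch (IDs and subject are placeholders matching the bench setup; error handling trimmed to the essentials):

package main

import (
	"log"

	stan "github.com/nats-io/stan.go"
)

func main() {
	sc, err := stan.Connect("devenv", "example-pub", stan.NatsURL("nats://127.0.0.1:14222"))
	if err != nil {
		log.Fatal(err)
	}
	defer sc.Close()

	msg := []byte("payload")

	// Sync publish: one request/reply per message; the call blocks until the
	// server has stored the message and returned the pub ack.
	if err := sc.Publish("bench", msg); err != nil {
		log.Fatal(err)
	}

	// Async publish: the message is sent without waiting; the ack handler is
	// invoked later, as acks (or errors) come back in order.
	acked := make(chan struct{})
	if _, err := sc.PublishAsync("bench", msg, func(guid string, err error) {
		if err != nil {
			log.Printf("publish %s failed: %v", guid, err)
		}
		close(acked)
	}); err != nil {
		log.Fatal(err)
	}
	<-acked
}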

powerman commented 4 years ago

Maybe I'm missing something, but to me it looks like if I need reliability and message order, then I should look at the results without -a and with --file_sync=true, which gives a publishing rate of a bit less than 100 msgs/sec!

I'm using the latest tagged release (I did go get github.com/nats-io/stan.go/examples/stan-bench a few hours ago).

OK, something is wrong here again. If you really build from master, -a should be rejected:

I suppose "latest tagged" still has -a and it's older than current master. I didn't tried latest master yet… should I?

But yes, if for some reason some messages are rejected, you would risk having a "gap" in your message flow: 1, 2, 3 accepted, 4 lost or rejected, 5, 6 accepted.

Can you give an example of how/why the 4th message may be "lost or rejected"? I suppose it may be "lost" if NATS loses it and doesn't deliver it to NATS Streaming, but how can it be rejected?

In any case, as I've already mentioned, "4 lost" isn't acceptable for most of our channels/subjects; we need some reliability here… but 100/160 msgs/sec (FILE / SQL with --sql_no_caching=true) sounds too low anyway. :disappointed: Probably the best option is to use SQL with --sql_no_caching=false and get 400 msgs/sec (if I understand correctly, this way some subscriber Acks on durable subscriptions may be lost on a server crash, but published and Ack'ed messages must not be lost in this case).
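
For context, here is roughly how we plan to consume messages (a sketch with placeholder names, using manual acks on a durable subscription); the subscriber Acks I mean are the ones sent back from such a subscription:

package main

import (
	"log"

	stan "github.com/nats-io/stan.go"
)

func main() {
	sc, err := stan.Connect("devenv", "example-durable", stan.NatsURL("nats://127.0.0.1:14222"))
	if err != nil {
		log.Fatal(err)
	}
	defer sc.Close()

	_, err = sc.Subscribe("bench", func(m *stan.Msg) {
		// Process the message, then acknowledge it; the server persists
		// which sequences this durable subscription has acknowledged.
		if err := m.Ack(); err != nil {
			log.Printf("ack failed: %v", err)
		}
	}, stan.DurableName("bench-durable"), stan.SetManualAckMode())
	if err != nil {
		log.Fatal(err)
	}

	select {} // keep the process alive to receive messages
}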

So, what's the takeaway? Do I understand correctly that these 100/160/400 pub msgs/sec are the best I can get from a reliable way of using NATS Streaming?

kozlovic commented 4 years ago

I suppose "latest tagged" still has -a and it's older than current master. I didn't tried latest master yet… should I?

You are right, the change is in master and has not been released yet, so that may explain it. You don't have to try from master; I was just puzzled since, on master, -a is no longer accepted and async is the default.

Can you give an example of how/why the 4th message may be "lost or rejected"? I suppose it may be "lost" if NATS loses it and doesn't deliver it to NATS Streaming, but how can it be rejected?

As you understand, the 4th message could be lost (disconnection, a NATS route in between going down, etc.), in which case the streaming server would not even receive it but would receive message 5 (it's unlikely this would happen for a single message, but you get the idea). "Rejected" may have been a bad term; I am not sure this could really happen. I am thinking of clustering mode, where if not enough followers ack, the leader would return an error; but in that case it is likely that the leader lost leadership, and I don't think the following messages would even be processed.

In any case, as I've already mentioned, "4 lost" isn't acceptable for most of our channels/subjects;

Understood.

So, what's the takeaway? Do I understand correctly that these 100/160/400 pub msgs/sec are the best I can get from a reliable way of using NATS Streaming?

I am at a loss. I know that on my iMac I get obviously much better results. With Docker, msgs/sec unfortunately falls, but nothing close to the poor numbers you get. I have run tests on a Linux VM (VirtualBox), and even there I get ~100,000 msgs/sec async and ~2,333 msgs/sec sync. You could try running the local version without the link to the NATS server Docker image (don't provide --nats_server; by default the streaming server will embed a NATS server and run on port 4222, and you can specify which port to use with -p). This would show whether you get better results. But I understand that the goal is to run on Docker.

That being said: if you need sync publishing (1 message at a time) and a file sync per message, this is going to be slow regardless of what tech you use. As mentioned before, you can get better overall performance if you have multiple connections publishing, since the server may then be able to "pack" multiple records into one file write and do a single file sync for multiple messages.
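
A rough sketch of that multi-connection approach (connection count and message counts are purely illustrative):

package main

import (
	"fmt"
	"log"
	"sync"

	stan "github.com/nats-io/stan.go"
)

func main() {
	const nConns = 4 // number of publishing connections; tune to taste
	var wg sync.WaitGroup
	for i := 0; i < nConns; i++ {
		wg.Add(1)
		go func(id int) {
			defer wg.Done()
			sc, err := stan.Connect("devenv", fmt.Sprintf("pub-%d", id),
				stan.NatsURL("nats://127.0.0.1:14222"))
			if err != nil {
				log.Fatal(err)
			}
			defer sc.Close()
			for n := 0; n < 1000; n++ {
				// Each Publish still waits for its own ack, but writes from
				// concurrent publishers can share one file write/fsync.
				if err := sc.Publish("bench", []byte("payload")); err != nil {
					log.Fatal(err)
				}
			}
		}(i)
	}
	wg.Wait()
}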

powerman commented 4 years ago

I've tried our most-likely-real-world configuration (1 sync publisher, 5 subscribers) running locally with embedded NATS:

~/download/nats-streaming-server-v0.16.2-linux-amd64 $ ./nats-streaming-server --cluster_id=devenv --store=FILE --dir=./stan --file_auto_sync=1ms --file_sync=true

$ stan-bench -s nats://127.0.0.1:4222 -c devenv -id bench -n 1000 -ms 128 -np 1 -ns 5 -io bench                                                                      
Starting benchmark [msgs=1000, msgsize=128, pubs=1, subs=5]
NATS Streaming Pub/Sub stats: 574 msgs/sec ~ 71.83 KB/sec
 Pub stats: 95 msgs/sec ~ 11.97 KB/sec
 Sub stats: 479 msgs/sec ~ 59.91 KB/sec
  [1] 95 msgs/sec ~ 11.98 KB/sec (1000 msgs)
  [2] 95 msgs/sec ~ 11.98 KB/sec (1000 msgs)
  [3] 95 msgs/sec ~ 11.98 KB/sec (1000 msgs)
  [4] 95 msgs/sec ~ 11.98 KB/sec (1000 msgs)
  [5] 95 msgs/sec ~ 11.98 KB/sec (1000 msgs)
  min 95 | avg 95 | max 95 | stddev 0 msgs

This is 2 times better than the same configuration (I mean FILE, not embedded NATS) in Docker (Pub stats: 53 msgs/sec).

I've also tried local NATS Streaming with embedded NATS using SQL (with Postgres still running in Docker): Pub stats are 80 msgs/sec, both this way and with NATS Streaming (without embedded NATS) in Docker.

All these numbers are very disappointing. I'm going to try clustering mode in Swarm in the next few days; maybe the cluster will show better performance, but I'm not really optimistic about it. :cry:

Can you please show me the exact commands you used in VirtualBox? I'll try it that way, also in VirtualBox, just in case. Maybe something is wrong with my OS (Gentoo Linux).

kozlovic commented 4 years ago

From my iMac, I run the "Docker Quickstart Terminal", which opens a window similar to this:

                        ##         .
                  ## ## ##        ==
               ## ## ## ## ##    ===
           /"""""""""""""""""\___/ ===
      ~~~ {~~ ~~~~ ~~~ ~~~~ ~~~ ~ /  ===- ~~~
           \______ o           __/
             \    \         __/
              \____\_______/

docker is configured to use the default machine with IP 192.168.99.100
For help getting started, check out the docs at https://docs.docker.com

Then I run a Docker image of Ubuntu that I use to test/develop the NATS C client that I maintain. I have also used it to test the NATS Streaming/core servers locally (we also run on Linux for CI integration (Travis CI)):

iMacSynadia:ubuntu_dev ivan$ docker run -ti -v ~/dev/:/home/ivan nats_ubuntu
root@9752a1669316:/home/ivan# source test/ubuntu_dev/setupgo.sh 12
go version go1.12.12 linux/amd64

The above just sets up my Go environment. Then I run the server. Note that the datastore will actually be stored on the host (the iMac).

root@9752a1669316:/home/ivan/go/src/github.com/nats-io/nats-streaming-server# /home/ivan/test/ubuntu_dev/nats-streaming-server -store file -dir datastore &
[1] 23
root@9752a1669316:/home/ivan/go/src/github.com/nats-io/nats-streaming-server# [23] 2019/11/14 00:57:26.413314 [INF] STREAM: Starting nats-streaming-server[test-cluster] version 0.16.2
[23] 2019/11/14 00:57:26.413404 [INF] STREAM: ServerID: sCmXJrjQAkxCb67md23ap6
[23] 2019/11/14 00:57:26.413421 [INF] STREAM: Go version: go1.11.13
[23] 2019/11/14 00:57:26.413423 [INF] STREAM: Git commit: [910d6e1]
[23] 2019/11/14 00:57:26.422409 [INF] Starting nats-server version 2.0.4
[23] 2019/11/14 00:57:26.422425 [INF] Git commit [c8ca58e]
[23] 2019/11/14 00:57:26.423976 [INF] Listening for client connections on 0.0.0.0:4222
[23] 2019/11/14 00:57:26.423989 [INF] Server id is NAMCWP2M3SGVZADWZWU3SSF2SA4XHAV7H3KLEGRADYT6ZENI2ERK4Y62
[23] 2019/11/14 00:57:26.423992 [INF] Server is ready
[23] 2019/11/14 00:57:26.453946 [INF] STREAM: Recovering the state...
[23] 2019/11/14 00:57:26.455300 [INF] STREAM: No recovered state
[23] 2019/11/14 00:57:26.716714 [INF] STREAM: Message store is FILE
[23] 2019/11/14 00:57:26.716760 [INF] STREAM: Store location: datastore
[23] 2019/11/14 00:57:26.717694 [INF] STREAM: ---------- Store Limits ----------
[23] 2019/11/14 00:57:26.717708 [INF] STREAM: Channels:                  100 *
[23] 2019/11/14 00:57:26.717710 [INF] STREAM: --------- Channels Limits --------
[23] 2019/11/14 00:57:26.717711 [INF] STREAM:   Subscriptions:          1000 *
[23] 2019/11/14 00:57:26.717712 [INF] STREAM:   Messages     :       1000000 *
[23] 2019/11/14 00:57:26.717714 [INF] STREAM:   Bytes        :     976.56 MB *
[23] 2019/11/14 00:57:26.717715 [INF] STREAM:   Age          :     unlimited *
[23] 2019/11/14 00:57:26.717716 [INF] STREAM:   Inactivity   :     unlimited *
[23] 2019/11/14 00:57:26.717717 [INF] STREAM: ----------------------------------
[23] 2019/11/14 00:57:26.717719 [INF] STREAM: Streaming Server is ready

Note that I don't specify the options --file_auto_sync=1ms --file_sync=true, because by default the server syncs on each write. The auto sync is more for when file_sync=false: it will do a background sync at the given interval. So by default we sync each write, but the file store also uses buffering by default, so if the server is processing multiple connections, as I explained before, it may "pack" multiple records for a given channel into a buffer that is flushed and fsync'ed before returning OK to the producer(s).
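
In other words, the two modes look like this (flags as already used in this thread; the 1ms interval is just the value from your earlier runs):

# default: buffered writes, fsync on every flush
./nats-streaming-server -store file -dir datastore
# no per-write sync; background sync at the given interval instead
./nats-streaming-server -store file -dir datastore -file_sync=false -file_auto_sync=1ms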

To run the stan-bench:

root@9752a1669316:/home/ivan/go/src/github.com/nats-io/nats-streaming-server# cd ../stan.go/
root@9752a1669316:/home/ivan/go/src/github.com/nats-io/stan.go# go run examples/stan-bench/main.go -s nats://127.0.0.1:4222 -id bench -n 1000 -ms 128 -np 1 -ns 5 -io bench
[23] 2019/11/14 00:58:16.658318 [INF] STREAM: Channel "bench" has been created
Starting benchmark [msgs=1000, msgsize=128, pubs=1, subs=5]
NATS Streaming Pub/Sub stats: 307,200 msgs/sec ~ 37.50 MB/sec
 Pub stats: 51,200 msgs/sec ~ 6.25 MB/sec
 Sub stats: 348,181 msgs/sec ~ 42.50 MB/sec
  [1] 73,669 msgs/sec ~ 8.99 MB/sec (1000 msgs)
  [2] 70,409 msgs/sec ~ 8.59 MB/sec (1000 msgs)
  [3] 70,246 msgs/sec ~ 8.58 MB/sec (1000 msgs)
  [4] 70,311 msgs/sec ~ 8.58 MB/sec (1000 msgs)
  [5] 69,915 msgs/sec ~ 8.53 MB/sec (1000 msgs)
  min 69,915 | avg 70,910 | max 73,669 | stddev 1,389 msgs

Since the publish count was too low, I increased it to get a better reading of the perf:

root@9752a1669316:/home/ivan/go/src/github.com/nats-io/stan.go# 
root@9752a1669316:/home/ivan/go/src/github.com/nats-io/stan.go# go run examples/stan-bench/main.go -s nats://127.0.0.1:4222 -id bench -n 10000 -ms 128 -np 1 -ns 5 -io bench
Starting benchmark [msgs=10000, msgsize=128, pubs=1, subs=5]
NATS Streaming Pub/Sub stats: 529,759 msgs/sec ~ 64.67 MB/sec
 Pub stats: 88,293 msgs/sec ~ 10.78 MB/sec
 Sub stats: 464,723 msgs/sec ~ 56.73 MB/sec
  [1] 93,092 msgs/sec ~ 11.36 MB/sec (10000 msgs)
  [2] 93,276 msgs/sec ~ 11.39 MB/sec (10000 msgs)
  [3] 94,372 msgs/sec ~ 11.52 MB/sec (10000 msgs)
  [4] 93,193 msgs/sec ~ 11.38 MB/sec (10000 msgs)
  [5] 94,078 msgs/sec ~ 11.48 MB/sec (10000 msgs)
  min 93,092 | avg 93,602 | max 94,372 | stddev 520 msgs

I hope this helps. As for trying clustering, I highly doubt that performance would be better; it is known to lower performance compared to standalone mode. I am very sorry that performance is so low and, again, I can't understand why it is that low. We have never had reports of such numbers, even with Docker (maybe some people do have such low numbers, but then perf is clearly not a concern for them).

powerman commented 4 years ago

Can you please show stats with -sync?

I have 3 servers with Ubuntu 18.04 that are supposed to have the same hardware (in reality they don't, but they're very similar) and the same configuration (HDD RAID0 with bcache on SSD). The problem is, their results differ by an order of magnitude: 2,800 pub msgs/sec, 300 pub msgs/sec, and 27 pub msgs/sec (with -sync and FILE without extra options). This time I'm using the latest master for both nats-streaming-server and stan-bench (but this doesn't make any noticeable difference except for -a and -sync). Without -sync, the results on these 3 servers are (pub msgs/sec): 114,210, 12,028, 9,672.

powerman commented 4 years ago

I've also tried all 3 servers mentioned above with -dir on tmpfs, with -sync; the average pub msgs/sec values are 8,800, 8,700, and 8,400. Finally, some reasonable numbers!
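
For reference, this is roughly how I ran the tmpfs test (the mount point is just an example):

# mount a tmpfs and point the store at it
mount -t tmpfs -o size=2G tmpfs /mnt/stan-tmpfs
./nats-streaming-server --cluster_id=devenv --store=FILE --dir=/mnt/stan-tmpfs/stan &
./stan-bench -s nats://127.0.0.1:4222 -c devenv -id bench -n 1000 -ms 128 -np 1 -ns 5 -io -sync bench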

I've compared sysctl -a on these servers, but there are no important differences. They have the same amount of RAM (32 GB), the same CPU, and the same OS/kernel. The main difference I've noticed so far is HDD performance, but it's not large enough to explain the "2800/300/27" difference: hdparm -t /dev/bcache0 shows 346.34 MB/sec, 198.41 MB/sec, and 251.61 MB/sec.

So it looks like the main difference is in the HDD/SSD models, which are very different:

powerman commented 4 years ago

Some more interesting stats. I've compared two servers at DigitalOcean. Both use the same OS (Ubuntu 18.04.3), both have the latest OS updates installed, both were rebooted before the test (so same kernel, etc.), and neither runs anything else (well, actually srv1 is running an openvpn service). Their locations differ a bit (AMS2 vs AMS3), and so does the hardware (1 CPU / 1 GB RAM with a 25 GB SSD vs. 2 CPUs / 8 GB RAM with 2 x 25 GB SSD).

On the second server I see just one /dev/vda, so I suppose some sort of RAID is in place. This is also confirmed by the I/O tests:

root@srv1:~# hdparm -t /dev/vda

/dev/vda:
 Timing buffered disk reads: 2852 MB in  3.00 seconds = 950.36 MB/sec

root@srv2:~# hdparm -t /dev/vda

/dev/vda:
 Timing buffered disk reads: 5622 MB in  3.00 seconds = 1873.93 MB/sec

root@srv1:~# ./iops -n 1 /dev/vda
/dev/vda,  26.84 G, sectorsize=512B, #threads=1, pattern=random:
 512  B blocks: 6140.4 IO/s,   3.1 MB/s ( 25.2 Mbit/s)
   1 kB blocks: 5547.9 IO/s,   5.7 MB/s ( 45.4 Mbit/s)
   2 kB blocks: 4437.3 IO/s,   9.1 MB/s ( 72.7 Mbit/s)
   4 kB blocks: 3338.7 IO/s,  13.7 MB/s (109.4 Mbit/s)
   8 kB blocks: 3347.2 IO/s,  27.4 MB/s (219.4 Mbit/s)

root@srv2:~# ./iops -n 1 /dev/vda
/dev/vda,  53.69 G, sectorsize=512B, #threads=1, pattern=random:
 512  B blocks: 26357.0 IO/s,  13.5 MB/s (108.0 Mbit/s)
   1 kB blocks: 18066.3 IO/s,  18.5 MB/s (148.0 Mbit/s)
   2 kB blocks: 17878.0 IO/s,  36.6 MB/s (292.9 Mbit/s)
   4 kB blocks: 10039.0 IO/s,  41.1 MB/s (329.0 Mbit/s)
   8 kB blocks: 12847.2 IO/s, 105.2 MB/s (842.0 Mbit/s)

The problem is that the stan-bench results don't match the hardware: the less powerful first server does ~400 msgs/sec, while the much more powerful second server does only ~150 msgs/sec.

user@srv1:~$ ./nats-streaming-server --cluster_id=devenv --store=FILE --dir=./stan &
user@srv1:~$ ./stan-bench -s nats://127.0.0.1:4222 -c devenv -id bench -n 1000 -ms 128 -np 1 -ns 5 -io -sync bench
Starting benchmark [msgs=1000, msgsize=128, pubs=1, subs=5]
NATS Streaming Pub/Sub stats: 2,465 msgs/sec ~ 308.15 KB/sec
 Pub stats: 410 msgs/sec ~ 51.37 KB/sec
 Sub stats: 2,056 msgs/sec ~ 257.04 KB/sec
  [1] 411 msgs/sec ~ 51.45 KB/sec (1000 msgs)
  [2] 411 msgs/sec ~ 51.42 KB/sec (1000 msgs)
  [3] 412 msgs/sec ~ 51.52 KB/sec (1000 msgs)
  [4] 411 msgs/sec ~ 51.46 KB/sec (1000 msgs)
  [5] 411 msgs/sec ~ 51.45 KB/sec (1000 msgs)
  min 411 | avg 411 | max 412 | stddev 0 msgs

user@srv2:~$ ./nats-streaming-server --cluster_id=devenv --store=FILE --dir=./stan &
user@srv2:~$ ./stan-bench -s nats://127.0.0.1:4222 -c devenv -id bench -n 1000 -ms 128 -np 1 -ns 5 -io -sync bench
Starting benchmark [msgs=1000, msgsize=128, pubs=1, subs=5]
NATS Streaming Pub/Sub stats: 848 msgs/sec ~ 106.06 KB/sec
 Pub stats: 141 msgs/sec ~ 17.68 KB/sec
 Sub stats: 708 msgs/sec ~ 88.55 KB/sec
  [1] 141 msgs/sec ~ 17.71 KB/sec (1000 msgs)
  [2] 141 msgs/sec ~ 17.71 KB/sec (1000 msgs)
  [3] 141 msgs/sec ~ 17.71 KB/sec (1000 msgs)
  [4] 141 msgs/sec ~ 17.71 KB/sec (1000 msgs)
  [5] 141 msgs/sec ~ 17.71 KB/sec (1000 msgs)
  min 141 | avg 141 | max 141 | stddev 0 msgs

They're running the same OS, have the same (default) OS setup (no tuning/sysctl/etc.), and run the same NATS binaries (latest master compiled with go1.13.4)… do you have any idea why this happens? Any chance it's a bug in NATS Streaming or the stan-bench tool?

kozlovic commented 4 years ago

I am not sure that comparing iops and stan makes sense. In stan you are doing request/reply, a single message at a time, where each write is followed by an fsync. The iops tool may be writing blocks as fast as it can (I'm not sure whether it does an fsync or not).

You can run the standalone NATS Streaming server and provide -profile 6060, which enables Go profiling. Once it has started, run the bench test, make sure that it runs long enough (increase the message count if needed), and while the test is running, do:

go tool pprof --svg "http://localhost:6060/debug/pprof/profile?seconds=5"

This will produce an .svg file with a graph of the CPU profile. It also generates a .gz file that can then be inspected interactively. If you could run this on 2 different systems where you see the huge difference, that may help. You can provide at least the .svg files, or the .gz files too if you want.
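
Putting it together, the full sequence would look something like this (the message count is illustrative, just to keep the server busy during the 5-second capture):

./nats-streaming-server --cluster_id=devenv --store=FILE --dir=./stan -profile 6060 &
./stan-bench -s nats://127.0.0.1:4222 -c devenv -id bench -n 1000000 -ms 1024 -np 1 -ns 1 -io -sync bench &
go tool pprof --svg "http://localhost:6060/debug/pprof/profile?seconds=5" > profile.svg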

They're running the same OS, have the same (default) OS setup (no tuning/sysctl/etc.), and run the same NATS binaries (latest master compiled with go1.13.4)… do you have any idea why this happens? Any chance it's a bug in NATS Streaming or the stan-bench tool?

As you said, it is all the same, including the NATS binaries, so I'm not sure how this could be a bug, but maybe there are inefficiencies somewhere for some hardware/OS/disk/etc. combinations. But again, understand that we (NATS) are not doing anything special in terms of OS/hardware. Go certainly does have platform-specific code/optimizations. Maybe the CPU profile will point to something?

powerman commented 4 years ago

As for trying clustering, I highly doubt that performance would be better; it is known to lower performance compared to standalone mode.

Well, it turns out this isn't the case. Maybe that's because standalone mode obviously doesn't work as fast as it should, but anyway, here are the stats.

I have another 3 servers with the same hardware and the same performance (140 msgs/sec). I've just started a cluster there and get 500 msgs/sec on the first node (probably the leader) and 390 msgs/sec on the other two nodes! I used these commands to start the cluster (the benchmarking command is still the same):

srv1# ./nats-streaming-server --cluster_id=devenv --store=FILE --dir=./stan --cluster nats://10.1.1.1:6222 -routes nats://10.1.1.1:6222,nats://10.2.2.2:6222,nats://10.3.3.3:6222 -clustered -cluster_node_id prod01 -cluster_peers prod01,prod02,prod03
srv2# ./nats-streaming-server --cluster_id=devenv --store=FILE --dir=./stan --cluster nats://10.2.2.2:6222 -routes nats://10.1.1.1:6222,nats://10.2.2.2:6222,nats://10.3.3.3:6222 -clustered -cluster_node_id prod02 -cluster_peers prod01,prod02,prod03
srv3# ./nats-streaming-server --cluster_id=devenv --store=FILE --dir=./stan --cluster nats://10.3.3.3:6222 -routes nats://10.1.1.1:6222,nats://10.2.2.2:6222,nats://10.3.3.3:6222 -clustered -cluster_node_id prod03 -cluster_peers prod01,prod02,prod03
kozlovic commented 4 years ago

Because, by default, in clustering the file sync is disabled, since we rely on the fact that the data is replicated on several nodes. You can override this with --cluster_sync. That being said, I would not set that, but would instead use file_auto_sync to get periodic syncs while still not doing a sync for each write.
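
For example, adapting your srv1 command (the interval value is just an example):

srv1# ./nats-streaming-server --cluster_id=devenv --store=FILE --dir=./stan --cluster nats://10.1.1.1:6222 -routes nats://10.1.1.1:6222,nats://10.2.2.2:6222,nats://10.3.3.3:6222 -clustered -cluster_node_id prod01 -cluster_peers prod01,prod02,prod03 --file_auto_sync=1s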

powerman commented 4 years ago

Actually, I hope clustering mode offers enough guarantees to not require file sync anymore. All I need is to make sure that if the publisher gets an Ack and then one of the NATS Streaming servers crashes, the Ack'ed message won't be lost by NATS Streaming. Does this guarantee apply without file sync?

kozlovic commented 4 years ago

The publisher gets the ack only once the leader has confirmed that the message was replicated through RAFT in the cluster, which means that a quorum of servers has committed the message to their log. Of course, if the storage on several nodes is damaged, then you would still lose data.

powerman commented 4 years ago

profile.nats-streaming-server.001.svg.gz profile.nats-streaming-server.002.svg.gz profile.nats-streaming-server.003.svg.gz pprof.nats-streaming-server.samples.cpu.001.pb.gz pprof.nats-streaming-server.samples.cpu.002.pb.gz pprof.nats-streaming-server.samples.cpu.003.pb.gz

kozlovic commented 4 years ago

I cannot make sense of the third capture; it does not look like it caught the server doing much. For the first 2, you can clearly see that most of the time is spent in Syscall, which is split between Writev (to client sockets) and fd.Fsync, fd.Write, and fd.Read.

Again, I am sorry but I don't understand why this is so slow on these machines.

lonerwolf commented 4 years ago

Same for me: using the file store is too slow.

flyerge commented 4 years ago

Hi, I did some research, checking nats-streaming-server on GCP K8s across different machine types. Mostly it is not a problem with nats-streaming-server; the problem is probably with stan-bench: it depends on how many CPU cores the node running the stan-bench client has. Here are screenshots with the test results. All VMs had 8Gi RAM.

Machine type for server: n1-standard-8. Machine type for client: n1-standard-4 / n1-standard-8 (screenshots: 4cpu, 8cpu).

kozlovic commented 4 years ago

@flyerge Thanks, and that is a good point. The server will only go as fast as the client can publish! I would also try to keep the number of goroutines low compared to the number of CPUs; for instance, with your setup there are 10 goroutines doing the publishing. Granted, your total number of messages is low (10,000), which means only 1,000 per publisher. But also keep in mind that the faster you make the producers, the more the subscriptions may fall behind if there are many subs per message. When subscriptions fall behind, new messages are appended to the log, but the older messages need to be read back from disk to be sent to the slow subscriptions.
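
For example, something along these lines gives each publisher more to do while keeping the goroutine count below the client's CPU count (all numbers are illustrative; <server> and <cluster_id> are placeholders):

stan-bench -s nats://<server>:4222 -c <cluster_id> -id bench -n 1000000 -ms 1024 -np 2 -ns 5 -io bench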