Theldus / wsServer

wsServer - a tiny WebSocket server library written in C
https://theldus.github.io/wsServer
GNU General Public License v3.0
422 stars 80 forks source link

Stress testing send_receive.c fails #40

Closed pthreat closed 2 years ago

pthreat commented 2 years ago

I stress tested the send_receive example with artillery.io

--------------------------------------
Metrics for period to: 10:55:20(-0300) (width: 9.004s)
--------------------------------------

vusers.created_by_name.0: ................................... 80
vusers.created.total: ....................................... 80
vusers.failed: .............................................. 24
vusers.completed: ........................................... 56
vusers.session_length:
  min: ...................................................... 1.2
  max: ...................................................... 12.4
  median: ................................................... 5.9
  p95: ...................................................... 10.3
  p99: ...................................................... 10.9
websocket.send_rate: ........................................ 29/sec
websocket.messages_sent: .................................... 224
errors.Parse Error: Expected HTTP/: ......................... 24

I have tried to increase the amount of clients and ports and recompiling the example, however it still fails.

Do I need to augment file descriptors on my system in order to allow many clients sending and receiving at the same time?

PS: I like the interface of this library it's very intuitive, great work!

pthreat commented 2 years ago

Will enable debugging today to see if I can get more information about what's causing the errors

pthreat commented 2 years ago

Got nothing useful out of enabling debug, I still see the output addr being set to null for some reason, I have increased clients to 200000 and also ports, no idea what's going on

Theldus commented 2 years ago

Hi @pthreat, This is curious, could you tell me how you performed these tests? I'm not familiar with artillery.io.

AFAIR, wsServer manages to pass in Autobahn's massconnect mode (for something like 1k clients), so I can't say exactly what's going on.

Perhaps the "vusers.failed" occurs due to some kind of timeout, because wsServer takes a while to respond or something like that (since wsServer is very simple in connection management, etc).

Got nothing useful out of enabling debug, I still see the output addr being set to null for some reason, I have increased clients to 200000 and also ports, no idea what's going on

200k clients is indeed a massive amount of clients, and I would be very surprised if wsServer could manage this, since wsServer is not async and uses one thread per client.

If you need a server for such a large number of clients, maybe wsServer is not the right server for you =/.

pthreat commented 2 years ago

@Theldus Hey man, glad to see you around, I'll paste the test script:

Put this in stress.yml

config:
  target: "ws://127.0.0.1:8080"
  phases:
    - duration: 100
      arrivalRate: 8
  ws:
    # Ignore SSL certificate errors
    # - useful in *development* with self-signed certs
    rejectUnauthorized: false
scenarios:
  - engine: "ws"
    flow:
      - send: '{"event":"hello", "token":"xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx", "application":"frontend"}'
      - send: '{"event":"hello", "token":"xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx", "application":"frontend"}'
      - send: '{"event":"hello", "token":"xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx", "application":"frontend"}'
      - send: '{"event":"hello", "token":"xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx", "application":"frontend"}'

Install artillery (through node's npm)

npm -g install artillery

Then:

artillery run stress.yml
pthreat commented 2 years ago

FWIW I have tried node https://github.com/websockets/ws and tests showed up no problems.

Theldus commented 2 years ago

Hi @pthreat, Thanks for the detailed instructions. I was able to configure Artillery and see results similar to yours. I didn't get such a high error rate but I constantly get "errors.Parse Error: Expected HTTP/:" in the output.

I ran artillery against Autobahn's echoserver and also against websocat, and both without any errors, which really implies something is happening with wsServer.

Running artillery in debug mode (DEBUG=ws* artillery run stress.yml), sometimes I get the following output:

2022-01-07T02:14:51.636Z ws Error: Parse Error: Expected HTTP/
    at Socket.socketOnData (_http_client.js:515:22)
    at Socket.emit (events.js:376:20)
    at addChunk (internal/streams/readable.js:309:12)
    at readableAddChunk (internal/streams/readable.js:284:9)
    at Socket.Readable.push (internal/streams/readable.js:223:10)
    at TCP.onStreamRead (internal/stream_base_commons.js:188:23)

I would like to take a closer look at this stacktrace, but my knowledge of Node is pretty minimal.

Reading here, it seems to me to be a problem that occurs in the HTTP parsing, probably in the handshake response, since this exact message can occur in other contexts, outside of Artillery.

I have the feeling that if I analyze via Wireshark I could find out what the 'bad response' of wsServer is, the problem is locating, as it is quite random and occurs in small quantity.

pthreat commented 2 years ago

@Theldus Sorry, got lost for a few days, would be awesome if a high load would be manageable, I mean if node can manage it so can wsServer ;) . I do know artillery plays dirty doesn't even sends pong or whatever, it plays dirty and that's why I think it's a good stress testing tool. Hope you can crush the issue :)

Theldus commented 2 years ago

@pthreat No worries =), Yes, I care a lot about wsServer being 'correct' (which is why there are automated fuzzying tests with Autobahn and AFL ;-)), so another tool like Artillery looks pretty interesting to me.

I do know artillery plays dirty doesn't even sends pong or whatever [...]

PING/PONG frames shouldn't exactly be a problem, wsServer works fine without them. What wsServer expects to correctly receive is a valid handshake, as well as valid message frames (TXT, BIN, CONT...). Abnormal closure (no close handshake) also works.

What intrigues me the most is that the error is random: sometimes it happens, sometimes it doesn't, which makes debugging very difficult.

But I will continue investigating here and I will keep you informed about any progress ;-).

Theldus commented 2 years ago

Hi @pthreat, Commit d66b613 fixes this issue.

There were two things happening: a) There was a race condition during the disconnect process, where the 'client_socks' structure was not fully protected with locks, and wsServer tried to send messages to already disconnected clients.

b) During the message broadcast (send_receive.c example does), wsServer did not check if the connected clients were already in the 'OPEN' state, that is, if they had already performed the handshake, which made the server send messages prematurely to the clients (hence the Parse Error: Expected HTTP/ error).

Thanks for noticing this, race conditions are generally very difficult to spot and fix, without Artillery this error would remain unnoticed for a long time.

Now I get the following output with Artillery:

--------------------------------
Summary report @ 04:22:23(-0300)
--------------------------------

vusers.created_by_name.0: ................................... 800
vusers.created.total: ....................................... 800
vusers.completed: ........................................... 800
vusers.session_length:
  min: ...................................................... 0.9
  max: ...................................................... 11.1
  median: ................................................... 1.3
  p95: ...................................................... 1.9
  p99: ...................................................... 3.4
websocket.send_rate: ........................................ 32/sec
websocket.messages_sent: .................................... 3200

Please let me know if everything is ok so I can close the issue, and again, thank you very much for spotting this =).

pthreat commented 2 years ago

Hey Theldus sorry I didn't reply earlier. Will check it out when I have sometime trying to stress it! Results look promising. Thanks for fixing this <3