zigzap / zap

blazingly fast backends in zig
MIT License

TLS + Multiple threads will crash the process randomly #107

Open richard-powers opened 6 months ago

richard-powers commented 6 months ago

(Originally discussed in https://github.com/zigzap/zap/issues/64)

I've reproduced this on our live VPS (Ubuntu 22.04.4 LTS, OpenSSL 3.0.10) and on my local machine (Arch Linux, OpenSSL 3.3.0). Both machines are x86_64; the Zig version is 0.12.0.

Some errors seen:

Segmentation fault at address 0x30
???:?:?: 0x740f2c26ed64 in ??? (libcrypto.so.3)
Unwind information for `libcrypto.so.3:0x740f2c26ed64` was not available, trace may be incomplete

malloc(): unsorted double linked list corrupted
UUID error: 0x810 (0)
free(): corrupted unsorted chunks

The crash seems to occur only when the server runs with multiple threads (with a single worker). We normally run 200 threads with 1 worker, and with that configuration I was able to crash the server after just under 500 queries.

After limiting the server to 1 thread and 1 worker, I could no longer cause a crash. (It also seems that fewer threads make a crash less likely, but this is difficult to verify.) I would normally suspect a problem somewhere in my own code, but the server does not crash when TLS is disabled.


I was able to reproduce it with the https example simply by changing the zap.start call to this:

zap.start(.{
    .threads = 200,
    .workers = 1,
});

I then ran a bit of bash to flood the endpoint with curl:

i=1
while true; do
  curl -k -X 'GET' 'https://0.0.0.0:4443' || break
  echo "$i"
  i=$((i+1))
  sleep 0.05;
done

On my first attempt, the counter i had reached 1858 when the server crashed. The crash isn't consistent, so you may need to let the loop run for a while, but it will happen eventually.
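
The loop above sends one request at a time. Since the crash appears tied to multiple threads, running several clients concurrently may trigger it sooner; that is an untested assumption, not something confirmed in this thread. A minimal sketch follows. The function names flood and flood_parallel are hypothetical, and the defaults mirror the URL and delay used above:

```shell
# Hypothetical helper: send requests to a URL until one fails.
# Arguments: URL, delay between requests in seconds, maximum request count (0 = no limit).
flood() {
  local url="${1:-https://0.0.0.0:4443}"
  local delay="${2:-0.05}"
  local max="${3:-0}"
  local i=0
  while true; do
    # -k accepts the self-signed certificate, -s and -o discard the output,
    # --max-time makes a hung server count as a failure.
    if ! curl -k -s -o /dev/null --max-time 5 "$url"; then
      echo "request $((i+1)) failed after $i successful requests"
      return 1
    fi
    i=$((i+1))
    if [ "$max" -gt 0 ] && [ "$i" -ge "$max" ]; then
      echo "completed $i requests without failure"
      return 0
    fi
    sleep "$delay"
  done
}

# Hypothetical helper: run several flood loops at once.
# Arguments: number of clients, then the same arguments as flood.
flood_parallel() {
  local clients="$1"
  shift
  local pids=""
  local n=0
  while [ "$n" -lt "$clients" ]; do
    flood "$@" &
    pids="$pids $!"
    n=$((n+1))
  done
  local status=0
  local pid
  for pid in $pids; do
    # A client that saw a failed request makes the whole run report failure.
    wait "$pid" || status=1
  done
  return "$status"
}
```

For example, `flood_parallel 8 https://0.0.0.0:4443 0.05` would run eight clients against the https example until each one sees a failed request.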

dwolrdcojp commented 6 months ago

We're going to attempt to reproduce this https example in facil.io as well, to check whether the problem is specific to Zig / Zap.

dwolrdcojp commented 5 months ago

We've decided to put Nginx in front of our Zap server as a reverse proxy and let Nginx handle HTTPS with certbot. With SSL / TLS disabled in Zap, we are able to run more than one thread without segfaulting. This is our workaround for now. If you'd like us to do further testing or investigation, let us know, though with Nginx running it's not at the top of our priority list at the moment.