sourcegraph / sourcegraph-public-snapshot

Code AI platform with Code Search & Cody
https://sourcegraph.com
Other
10.1k stars 1.28k forks source link

syntax-highlighter panics when receiving ctrl-c #60407

Open keegancsmith opened 8 months ago

keegancsmith commented 8 months ago

I was running the single server docker image. I hit ctrl-c to shut it down. Everything but the syntax highligher shut down gracefully. In the case of that it seemed to just hang. This is almost certainly an issue with our customers that use this image as well. I did not test if it behaves poorly in the normal container we deploy on k8s.

14:31:24        syntax_highlighter | thread 'rocket-worker-thread' panicked at library/std/src/io/stdio.rs:1019:9:
14:31:24        syntax_highlighter | failed printing to stdout: Broken pipe (os error 32)
14:31:24        syntax_highlighter | note: run with `RUST_BACKTRACE=1` environment variable to display a backtrace

ps output in container after hitting ctrl-c. notice everything else except the supervisor (/server) has quit

# ps aux
PID   USER     TIME  COMMAND
    1 root      0:00 /sbin/tini -- /server
    7 root      0:00 /server
  164 root      0:00 sh -c env QUIET=true ROCKET_ENV=production ROCKET_PORT=9238 ROCKET_LIMITS='{json=10485760}' ROCKET
  172 root      0:05 syntax_highlighter
 4084 root      0:00 /bin/sh -il
 4090 root      0:00 ps aux

image

Also notice that syntax highlighter (as configured for this container) is super spammy on its log output.

FYI I ran the container like this:

docker run --publish 7080:7080 --rm --volume sourcegraph-config:/etc/sourcegraph --volume sourcegraph-data:/var/opt/sourcegraph sourcegraph/server:5.3.0-rc.1
keegancsmith commented 8 months ago

I just realised the full envs we set for syntax-highlighter may be useful. Here it is

d8b3568bec6d:/# ps -f | grep syntax
  164 root      0:00 sh -c env QUIET=true ROCKET_ENV=production ROCKET_PORT=9238 ROCKET_LIMITS='{json=10485760}' ROCKET_SECRET_KEY='SeerutKeyIsI7releuantAndknvsuZPluaseIgnorYA=' ROCKET_KEEP_ALIVE=0 ROCKET_ADDRESS='"127.0.0.1"' syntax_highlighter | grep -v "Rocket has launched" | grep -v "Warning: environment is"

I also just double check the image with our last release (5.2.7) and the same issue occurs.

keegancsmith commented 8 months ago

cc @jhchabran any chance you have noticed slow shutdowns when running single server docker images in CI for tests? From my local testing everything shutsdown instantly, but this part hangs. Might be a win on CI time unless you aggressively kill the container that is.

varungandhi-src commented 8 months ago

@kritzcreek was not able to reproduce this with just the highlighter binary alone. This might potentially be due to some interaction with http-server-stabilizer which sits in the middle.

keegancsmith commented 8 months ago

@varungandhi-src http-server-stabilizer isn't running inside of the single server docker image. We run the syntect_server binary directly (note in the above the binary is renamed to syntax_highlighter, I am avoiding this bit of confusion here https://github.com/sourcegraph/sourcegraph/pull/60557)

I was just reading some of our process manager code in sourcegraph server. On the first ctrl-c it recieves it actually sends a SIGINT. Another thing to take into account when reproducing is that syntect_highlighter is run in that wrapped shell (see ps output above). Thinking about it the wrapped shell may be the issue. Maybe something to do with grep reading its stdout? Or some weird behaviour around passing on signals. The only reason we have the wrapped shell is because of the annoying logspam in syntect_highlighter. We can remove the wrapped shell if it stopped log spammin gthose messages we grep out.

keegancsmith commented 8 months ago

yup I am fairly certain it is to do with the syntect_highlighter binary behaving badly if writing to stdout is failing. See the original report, in particular this log line

14:31:24        syntax_highlighter | failed printing to stdout: Broken pipe (os error 32)

And when we kill we are killing the grep that syntax_highlighter is writing into for stdout... leading to a panic in syntax_highlighter and it giving up on living but not deciding to die.

varungandhi-src commented 8 months ago

I don't understand the hang, I get a clean panic locally, if I kill the process the output is being piped to. (I ran pkill tee in a separate pane)

❯ RUST_BACKTRACE=1 cargo run --bin syntect_server | tee
    Finished dev [unoptimized + debuginfo] target(s) in 0.13s
     Running `target/debug/syntect_server`
thread 'main' panicked at library/std/src/io/stdio.rs:1019:9:
failed printing to stdout: Broken pipe (os error 32)
stack backtrace:
   0: rust_begin_unwind
             at /rustc/cc66ad468955717ab92600c770da8c1601a4ff33/library/std/src/panicking.rs:595:5
   1: core::panicking::panic_fmt
             at /rustc/cc66ad468955717ab92600c770da8c1601a4ff33/library/core/src/pa

The only reason we have the wrapped shell is because of the annoying logspam in syntect_highlighter. We can remove the wrapped shell if it stopped log spammin gthose messages we grep out.

Can you point me to the code for this? I can reduce the logging in the highlighter and we can remove this grep layer.

keegancsmith commented 8 months ago

Edit: I said the below, but I am currently rethinking it. The problem is we see the panic in the log output. So something funky is actually happening in syntect_server. Likely our process manager isn't sending SIGINT to the bash process, but rather is sending it to the process group. Try doing something like sh -c 'syntect_server | grep -v foo' and then doing something like kill -INT -$PID_OF_SH. Notice the - in front of PID_OF_SH. That tells kill to kill the process group of the PID, not just the process.

See below, but to summarize if you can quickly reduce the logspam of syntect_server that would be greatly appreciated. If not, I will look into more robust filtering of logoutput in server. It shouldn't be difficult, but it only helps server instead of all our environments.

I don't understand the hang, I get a clean panic locally, if I kill the process the output is being piped to. (I ran pkill tee in a separate pane)

Right so I debugged this further. It has everything to do with wrapping this in a shell script. When bash (or whatever shell is in our images) receives a SIGINT, it doesn't forward it. Instead, it "stops any for loops". So TIL that is why I have to spam ctrl-c when killing bash scripts sometimes.

IE this is not an issue in syntect_server at all.

The only reason we have the wrapped shell is because of the annoying logspam in syntect_highlighter. We can remove the wrapped shell if it stopped log spammin gthose messages we grep out.

Can you point me to the code for this? I can reduce the logging in the highlighter and we can remove this grep layer.

There should be no log output at all with the environment variables we set (see issue description). If you can make that happen, then we can remove the bash layer.