cybozu-go / goma

An extensible monitoring agent in Go.
MIT License

timeout does not work if the probe is a shell script #8

Open arosh opened 6 years ago

arosh commented 6 years ago

When the following shell script is used as a probe and example.com does not respond for a long time, the probe keeps running past the specified timeout.

#!/bin/sh
printf "GET / HTTP/1.0\r\n\r\n" | nc example.com 80

An example goma configuration file is as follows.

[[monitor]]
name = "example"
interval = 1
timeout = 1
  [monitor.probe]
  type = "exec"
  command = "/path/to/probe"
  [[monitor.actions]]
  type = "mail"
  from = "no-reply@example.org"
  fail_to = ["admin@example.org"]

There are two reasons for this.

  1. exec.CommandContext with context.WithTimeout kills the probe process once the specified time has elapsed. However, the nc process (a child of the probe process) is not killed, because goma does not kill the whole process group.
  2. As long as nc keeps its stdout open, cmd.Output() keeps waiting to copy nc's stdout into its buffer, so it does not return until nc exits.

See: https://github.com/golang/go/issues/18874#issuecomment-276551378