githubnemo / CompileDaemon

Very simple compile daemon for Go
BSD 2-Clause "Simplified" License

Cannot seem to kill the process properly #65

Closed by Rambatino 3 years ago

Rambatino commented 3 years ago

I'm obviously missing something, but I can't seem to catch the SIGTERM signal and gracefully terminate the running server.

package main

import (
    "context"
    "fmt"
    "log"
    "net/http"
    "os"
    "os/signal"
    "sync"
    "syscall"

    "github.com/gorilla/mux"
)

func main() {
    log.Println("Starting HTTP server on port", os.Getenv("PORT"))

    httpServerExitDone := &sync.WaitGroup{}

    httpServerExitDone.Add(1)
    srv := startHTTPServer(httpServerExitDone)

    c := make(chan os.Signal, 1)
    signal.Notify(c,
        syscall.SIGHUP,
        syscall.SIGINT,
        syscall.SIGTERM,
        syscall.SIGQUIT)

    go func() {
        <-c
        fmt.Println("\r- SIG sent to terminate")
        if err := srv.Shutdown(context.TODO()); err != nil {
            panic(err)
        }
    }()

    httpServerExitDone.Wait()
    log.Printf("main: done. exiting")
}

func startHTTPServer(wg *sync.WaitGroup) *http.Server {
    r := mux.NewRouter()
    srv := &http.Server{Addr: ":" + os.Getenv("PORT"), Handler: r}

    r.Handle("/", http.FileServer(http.Dir("./public")))

    go func() {
        defer wg.Done()

        if err := srv.ListenAndServe(); err != http.ErrServerClosed {
            log.Fatalf("ListenAndServe(): %v", err)
        }
    }()

    return srv
}

Results in:

2021/06/30 23:04:08 Gracefully stopping the current process..
2021/06/30 23:04:08 Restarting the given command.
2021/06/30 23:04:08 stderr: 2021/06/30 23:04:08 Starting HTTP server on port 3000
2021/06/30 23:04:08 stderr: 2021/06/30 23:04:08 ListenAndServe(): listen tcp :3000: bind: address already in use
2021/06/30 23:04:08 stderr: exit status 1

Rambatino commented 3 years ago

Command: PORT=3000 CompileDaemon -command="go run main.go"

Graceful and non-graceful kill both produce the same output.

githubnemo commented 3 years ago

If I had to venture a guess, it is probably because the socket is not being closed fast enough (still waiting for an ACK from some connection, for example) and is still in the CLOSE_WAIT state. You can verify this by running netstat -ton | grep CLOSE_WAIT in parallel.

There is no real fix for this other than terminating the connections beforehand (or not keeping them open in the first place), or waiting some time before running the command again (in a pinch you could create a shell script that runs sleep 1 && go run main.go, for example).

Rambatino commented 3 years ago

So I've found the reason: it's not actually killing the child processes.

Even if I do the second thing in a bash script, it doesn't kill the server's pid; it kills the parent process's pid. See: https://stackoverflow.com/questions/24982845/process-kill-on-child-processes

When I run the compiled binary directly instead, it works: the process is killed and restarted properly (although the code obviously doesn't change).

I would argue that this library should kill the child processes too.

Rambatino commented 3 years ago

Ahh, my bad: there's a build command, so that does work! Thanks!