markbates / refresh

MIT License
191 stars 30 forks source link

Getting signal: killed, then it stops rebuilding #6

Closed nullbio closed 7 years ago

nullbio commented 7 years ago

I'm using refresh as a library, and after I change a file sometimes it works and rebuilds fine, but then after a couple of goes (usually its on the first attempt though) I get this in the console, and then it stops rebuilding the files on change:

refresh: 2017/03/27 22:44:38 === signal: killed

Any idea why this might be happening, or how I can diagnose the problem?

Full output:

refresh: 2017/03/27 22:47:16 === Rebuild on: :start: ===
refresh: 2017/03/27 22:47:16 === Running: go build -v -i -o /tmp/x-build (PID: 30957) ===
github.com/x/x
refresh: 2017/03/27 22:47:16 === Building Completed (PID: 30957) (Time: 677.734852ms) ===
refresh: 2017/03/27 22:47:16 === Running: /tmp/x-build (PID: 30998) ===
2017-03-27T22:47:16.835+1000    INFO    /x/app/server.go:61 starting http listener  {"bind": ":4000"}
refresh: 2017/03/27 22:47:21 === Rebuild on: main.go ===
refresh: 2017/03/27 22:47:21 === Running: go build -v -i -o /tmp/x-build (PID: 31051) ===
github.com/x/x
refresh: 2017/03/27 22:47:22 === Building Completed (PID: 31051) (Time: 694.52791ms) ===
refresh: 2017/03/27 22:47:22 === Stopping: PID 30998 ===
refresh: 2017/03/27 22:47:22 === Running: /tmp/x-build (PID: 31094) ===

Using go 1.8.

markbates commented 7 years ago

I haven't seen that before. Is something panicking perhaps? Is your app not responding? It's hard to tell from the output where isn't dying. Is something killing the file watcher?

nullbio commented 7 years ago

App works fine when run by itself. I tried replicating the build process that refresh does (i.e. go build -v -i -o /tmp/file) and I wasn't getting the error. I don't believe anything is panicking. Could it be related to using go 1.8 graceful shutdown for ListenAndServe at all? The file watcher keeps running (refresh.Start() doesn't return), it's like it just stops receiving file change events or something.

nullbio commented 7 years ago

I don't know why, but on average it seems to happen around the 4th file save, but it's not consistent, sometimes it's sooner or later. I'm also waiting a bit before I save the files too, makes no difference.

nullbio commented 7 years ago

Further update, I forked the code to see if I could figure out what's happening, I put prints where the event notifs come through, on the first file save I got:

"main.go": CHMOD
"main.go": REMOVE
"main.go": RENAME
"main.go": CHMOD

Second file save got:

"main.go": REMOVE
"main.go": RENAME
"main.go": CHMOD

And then after that, nothing. Doesn't seem to be picking up any further saves or notifications from there.

Does this indicate a bug in the fsnotify library?

markbates commented 7 years ago

It could be the graceful shutdown stuff. Could you try removing that and see if everything works as expected?

My theory is the old binary isn't shutting down quickly, which means that when the new binary goes to start, it can't.

Just a theory.

nullbio commented 7 years ago

I've narrowed it down to something blocking in here:

case event := <-w.Events:
                if event.Op != fsnotify.Chmod {
                    go r.build(event)
                }
                w.Remove(event.Name)
                w.Add(event.Name)
                fmt.Println("end of event")

When it breaks, it never hits the "end of event" println. Will keep digging. Thanks.

nullbio commented 7 years ago

Okay, it appears the problem is that it's blocking in here:

// wait until ignoreLinux() deleting maps
    exists := true
    for exists {
        w.cv.Wait()
        _, exists = w.watches[name]
    }

This is in the Remove func (manager.go). The w.cv.Wait() call is permanently blocking.

markbates commented 7 years ago

Is that in fsnotify? I don't see that in refresh.

nullbio commented 7 years ago

Ah you're right sorry, vim had manager.go highlighted but it's actually inotify.go:

https://github.com/fsnotify/fsnotify/blob/master/inotify.go#L160

nullbio commented 7 years ago

I think it might be this: https://github.com/fsnotify/fsnotify/issues/195 -- Seems like an fsnotify bug after all.

markbates commented 7 years ago

Wow. Good work tracking that down! I'm going to close this issue for now, since it appears to be an fsnotify issue.

nullbio commented 7 years ago

This should be affecting most (if not all) Linux users I would imagine. What platform do you use? There is a workaround that can be implemented to fix this on the refresh library side of things, and given that this fsnotify issue can be dated back at least as far as Jan 2016 I don't think the developers have any intention of fixing it.

markbates commented 7 years ago

I'm on Mac. If there's a work around, I'm happy to accept a PR for it. I'll reopen this issue.

nullbio commented 7 years ago

@markbates We've submitted a PR to fsnotify to fix the bug, hopefully they can merge it in short order. I'll close this issue in the mean time. Thanks.

markbates commented 7 years ago

It just got merged. Thanks for finding this.

nullbio commented 7 years ago

Awesome. Thanks for making refresh! I've got another (small) issue as well, but I'll open a separate issue for that. Thanks.