Open xxxserxxx opened 1 week ago
The work-around is to delete /tmp/cy-NNNN
, which appears to be the default socket.
I haven't dug into the code deeply; it does look like the code is, in some cases, trying to remove stale sockets. In this case, that branch is either not being hit, or it's failing.
A good solution would be, in the case of this error being hit, to test whether cy is actually running and if not, clean up and retry. Failing that, printing a message suggesting removing the socket if the user is sure cy isn't running would be helpful.
Thank you for submitting such a detailed report! I'm sorry this is giving you trouble; I'll look into it as soon as I'm able to. Personally I've experienced this myself, but only inconsistently, so I've been waiting to see (a) whether other users run into it and (b) if there's a reliable way to produce it. At the very least the error message could be improved.
NP. I think an easy way to reproduce it is to start cy, open a new terminal, and killall
the cy process. It'll leave behind a stale socket and prevent cy from being restarted. I imagine that in the case of a panic anywhere else in the program -- as hard to reproduce as that may be, or as unlikely -- it'd leave cy in the same state.
I spent a couple of hours today playing around with this and it's a bit more subtle than it seems. I have not been able to reproduce it reliably on macOS; using killall
to kill cy (even with SIGKILL
) still does not hinder restarting the server. It's possible this issue is more salient on Linux, so I'll have to give it a try there when I get a chance. The lock files that cause this to happen are unrelated to the cy server itself, they're just used by the daemonization library cy depends on. Theoretically a crash while these lock files are being created could cause them to not release properly, but even by inducing panics in the relevant branches I couldn't get that to happen.
To sum it up, it's going to take some more experimentation to figure out the circumstances in which this can happen (assuming my first attempt on Linux isn't fruitful.) As a result I'm not going to block the 1.2.0 release on this.
I'd be interested in hearing more about your initial report, though. You mentioned that cy hung when you tried to quit the server, is that correct?
The socket cy is creating on Linux is just a fifo pipe; if the process is killed, there's no magic that cleans it up, and killing (SIGKILL) the process will always leave them behind. Especially since cy sets up no syscall trap (which, to do, would have to be done per-OS). Now, I only dug lightly into the code, but iff a lock socket is left behind, then the cy server process won't start, and neither will cy. IMHO the issue isn't how often cy will encounter a stale socket, but how it reacts when it does: right now, it panics.
I don't think the solution is to bend over backwards trying to make sure cy exits cleanly - that would be nice, but there's always the possibility of a crash at some point. What I believe is needed is code to either fail nicely when the user tries to start cy and the socket is stale -- either by try/catch, or some other more clever mechanism. try/catch would be easiest.
To reproduce this, can you manually create a socket file on OSX when cy is not running and then try to start cy? You should see the panic. I can do this on linux by calling mkdir /tmp/cy-$(id -u) ; mkfifo /tmp/cy-$(id -u)/default
. Although, of course now that I'm trying to reproduce it this, I can't :-/
Took a closer look at this tonight and I found a solution. It turned out that this is unrelated to the cy server's socket.
The procedure cy uses to start up a server looks like this:
flock
) on an adjacent lock file used to ensure that multiple cy processes do not attempt to start a server with the same identity (e.g. default
) simultaneously.The error you provided (Resource temporarily unavailable
) comes from the daemonization library (go-daemon) used in step 2, specifically this line. This block creates a flock
ed .pid
file adjacent to the cy socket containing the process ID of the server. When flock
would block, this causes the Resource temporarily unavailable
error in the parent process, ie the terminal where you originally ran cy
.
When I first used go-daemon
I unthinkingly enabled this PID file feature thinking that it was a nice-to-have but unlikely to cause issues. In practice it relies on the server process exiting cleanly (and thus running the defer
that cleans up the lock) which is obviously a problem, since the lock on the PID file never gets freed.
So the commit I linked just disables this "feature". Perhaps someday it will be useful to restore a (non-locked) .pid
file adjacent to the cy
socket, but nothing in cy
reads from that file, it was just there for user convenience.
In addition, that Panic()
really shouldn't have been there; I'm of the mind that panics are mostly a code smell, it was just an oversight that I hadn't corrected after cleaning up cy
's subcommands a while back.
This change will be released in cy
version 1.3.0. I'd appreciate it if you tried it out early to ensure it's fixed! You can do so by checking out the release/v1.3.0
branch and running go install ./cmd/cy/...
.
I must have been looking at the wrong line, or maybe I was looking at the wrong code version -- but I'm glad you found it, and thank you so much for diligently looking for the issue. These hard-to-replicate sorts of things are a PITA to track down.
I agree on panic
s, obvs; however, I use them all the time, especially when first hacking out a structure and figuring out how I want everything to fit together. And sometimes one gets missed in the clean-up. But what else are you going to do? You can easily get lost in the weeds and miss the big picture trying to write everything perfectly from the start, and spend many hours writing code you're going to later throw away in a refactor. Anyway, commendable job tracking this down, and thanks again!
How to replicate:
killall -9 cy
)On my system, at least, cy will not start again, and no information about what needs to be cleaned to allow it to work.
cy could crash, or be killed; the session could crash; any number of things could happen that prevent cy from shutting down cleanly. I think what's needed is, rather than just dumping a panic, cy might provide some information to the user about what they need to do, or might need to do, to resolve this.
Environment:
OS: Linux
Terminal: wezterm
cy: installed with
go install github.com/cfoust/cy/cmd/cy@latest
Edit: in cleaning the copy/paste to remove shell decorations (PROMPT_RIGHT), I initially deleted the
y
incy
and "fixed" it with anf
. This has been corrected.