Open apeyser opened 3 days ago
That error message isn't necessarily a sign of something going wrong, since the shpool attach process probes the control socket to see if a daemon is listening, in order to decide whether it needs to autodaemonize. The probe hangs up immediately while the daemon tries to initiate the handshake, which causes this error to appear in the daemon logs, but it doesn't actually indicate a problem.
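To illustrate the kind of probe described above, here is a minimal sketch, not shpool's actual code. The helper name `daemon_is_listening` is hypothetical, as is the assumption that the probe is a plain Unix-socket connect that gets dropped right away. It shows why the listening side can see a connection that closes before any handshake completes.

```rust
use std::io;
use std::os::unix::net::UnixStream;
use std::path::Path;

/// Hypothetical helper (an illustration, not taken from shpool):
/// returns Ok(true) if something accepts connections on the Unix
/// socket at `path`, Ok(false) if nothing is listening there.
fn daemon_is_listening(path: &Path) -> io::Result<bool> {
    match UnixStream::connect(path) {
        // The connect succeeded, so a listener exists. The stream is
        // dropped immediately, so the peer sees the connection close
        // before it can complete any handshake.
        Ok(_stream) => Ok(true),
        // Either there is no socket file, or a stale file is left over
        // with no process listening on it.
        Err(e)
            if matches!(
                e.kind(),
                io::ErrorKind::NotFound | io::ErrorKind::ConnectionRefused
            ) =>
        {
            Ok(false)
        }
        // Anything else (for example a permissions error) is reported
        // to the caller rather than guessed at.
        Err(e) => Err(e),
    }
}
```

A caller would start the daemon only when this returns `Ok(false)`. On the daemon side, each successful probe looks like a client that connected and hung up, which would match a harmless handshake error in the logs.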
Can you post some step-by-step instructions for how to reproduce the issue? I've had ssh credential timeouts without seeing issues with shpool, so I'm not quite sure how to try to reproduce this.
pkill ssh-agent is enough to trigger it for me (once) -- but not necessary, since the usual condition doesn't involve restarting the ssh-agent, but simply allowing the creds to go stale and/or (unclear) letting the ssh control master time out. But it seems to be extremely flaky -- reproducing the failure is hard.
shpool doesn't really know anything about ssh, so the problem probably isn't directly related to ssh. There is probably some way to reproduce the issue purely with shpool commands, though it might be hard to find.
Both with mosh and with ssh + shpool, I've been finding the shpool process hanging. If the shpool process is directly killed, then detach + list work fine, but if I try an shpool list with the process live, the daemon hangs and a systemctl --user restart shpool is needed -- aka, everything getting killed.
It appears connected with ssh credentials timing out; refreshing the credentials doesn't seem to solve the problem.
What I find in the logs is:
shpool version 0.7.0