Closed athiakos closed 6 years ago
Hi @alelos
Sorry for the delay; I've been busy at work. I'm looking into this now.
Hi,
No problem, let me know if you need help debugging
The bug is still there. I am running into it regularily now. I will add some logging to see when this happens exactly. I suspect an EAGAIN error in the socket operations.
Hey @ckeen
One possible source of the issue might be that running a new sam instance takes over the socket for any running instance, so:
I've added a lock file concept to the master branch that avoids that sequence of events (samB won't grab the socket), so that might help.
Now, sam
and B
are equivalent if another sam
owns the lock. To get around this, calling sam
with -S
will create a new instance of sam
without any external command channel.
Please try the newest code in master and let me know if it helps the issue.
Hey @ckeen
One possible source of the issue might be that running a new sam instance takes over the socket for any running instance, so:
- samA starts, grabs socket
- samB starts, grabs socket,
- samB exits, closes socket
- samA has a stale file handle and there's socket litter in the filesystem
I've added a lock file concept to the master branch that avoids that sequence of events (samB won't grab the socket), so that might help.
Now,
sam
andB
are equivalent if anothersam
owns the lock. To get around this, callingsam
with-S
will create a new instance ofsam
without any external command channel.
What I also noticed is that sam ignores sigpipe unconditionally while B gets killed silently with sigpipe under this condition. So if there is a stale handle, B ./foo.txt will just return w/o any message.
-- May you be peaceful, may you live in safety, may you be free from suffering, and may you live with ease.
Hi @ckeen
Just to confirm, you're seeing the SIGPIPE condition under the new code, or you saw it in the past?
Thanks.
Hi @ckeen
Just to confirm, you're seeing the SIGPIPE condition under the new code, or you saw it in the past?
In the past. I will watch the new master code closely.
-- May you be peaceful, may you live in safety, may you be free from suffering, and may you live with ease.
Awesome, thanks.
Ok, so on linux debian stable I need to define -D_DEFAULT_SOURCE or _XOPEN_SOURCE for the flock definition to be available.
Also I think the commit 1fb46ee - Treat sam invocations like B if another sam holds the lock breaks my setup with several samterms open as it does not respect the -r flag. It sets the appropriate lockfile but does not start a separate samterm.
Ok, so on linux debian stable I need to define -D_DEFAULT_SOURCE or _XOPEN_SOURCE for the flock definition to be available.
Also I think the commit 1fb46ee - Treat sam invocations like B if another sam holds the lock breaks my setup with several samterms open as it does not respect the -r flag. It sets the appropriate lockfile but does not start a separate samterm.
Also: Starting a new samterm instance with calling B the first time, does not work:
$ B sam/sam.c could not determine controlling socket name
-- May you be peaceful, may you live in safety, may you be free from suffering, and may you live with ease.
Just to confirm that the new master requires to define XSI extensions, aka X_OPEN_SOURCE version. I thought that the POSIX_C_SOURCE would cover it, but apparently newer versions of glibc exclude XSI by default, see: http://man7.org/linux/man-pages/man7/feature_test_macros.7.html
· (Since glibc 2.10) The value 200809L or greater addition‐
ally exposes definitions corresponding to the POSIX.1-2008
base specification (excluding the XSI extension).
To enable it either define them in sam/io.c like:
diff --git a/sam/io.c b/sam/io.c
index 944fa56..1991477 100644
--- a/sam/io.c
+++ b/sam/io.c
@@ -1,4 +1,10 @@
/* Copyright (c) 1998 Lucent Technologies - All rights reserved. */
+#if __STDC_VERSION__ >= 199901L
+#define _XOPEN_SOURCE 600
+#else
+#define _XOPEN_SOURCE 500
+#endif /* __STDC_VERSION__ */
+
Or alternative add it to the Makefile as -D_XOPEN_SOURCE=600 I will test the new functionality as well and report if the bug persists.
I personally would prefer the Makefile, there are already defines for POSIX features in there.
I updated config.mk.def to include _XOPEN_SOURCE=500
.
@ckeen Could you double-check that you copied the new sam
binary to B
and try again? It's working correctly for me.
I updated config.mk.def to include
_XOPEN_SOURCE=500
.
@ckeen Could you double-check that you copied the new
sam
binary toB
and try again? It's working correctly for me.
Ok, what I did was:
build sam, make install, manually removing and recreating B. Then I ran sam foo:
$ sam foo could not lock socket $
After removing the ~/.sam.*.lock files I straced:
open("/home/ckeen/.sam.localhost.lock", O_RDWR|O_CREAT, 054425) = 3 fcntl(3, F_SETLK, {l_type=F_WRLCK, l_whence=SEEK_CUR, l_start=0, l_len=0}) = 0 open("/home/ckeen/.sam.localhost.lock", O_RDWR|O_CREAT, 054125) = -1 EACCES (Permission denied) write(2, "could not lock socket\n", 22could not lock socket ) = 22
Also opening a file with B alone results in
$ B todo.org could not determine controlling socket name $
So something is still amiss.
$ ls -l .sam.localhost.lock -r-S--Sr-x 1 ckeen ckeen 0 Mar 22 16:21 .sam.localhost.lock $
Is that the same for you?
-- May you be peaceful, may you live in safety, may you be free from suffering, and may you live with ease.
I have a feeling the problem you're seeing might be related to the just-pushed fix (incorrect/missing argument to open). Sorry to be a pain, but would you mind trying once more with the latest master?
The B-without-sam-running problem is present for me though. Looking into that now.
I have a feeling the problem you're seeing might be related to the just-pushed fix (incorrect/missing argument to open). Sorry to be a pain, but would you mind trying once more with the latest master?
Yep, works for me(tm)
Thanks!
-- May you be peaceful, may you live in safety, may you be free from suffering, and may you live with ease.
@ckeen I'm going to go ahead and close this; I think everything's working as it should be for now. The whole thing is sub-optimal, but eventually it'll get cleaned up into something a bit more sane...
Hi,
There is an issue where the localhost socket is being closed after a period of time. I haven't determined the reason yet. I'm using the B command to open files. After a while I get an error: "could not determine controlling socket name, starting new instance" "could not open socket: No such file or directory"
Looking for the default socket ~/.sam.localhost and it seems it's closed/no where to be found.
What logs/debug would help you identify the issue? Should I run through gdb?