crigler / dtach

A simple program that emulates the detach feature of screen
GNU General Public License v2.0
476 stars 50 forks source link

dtach -n always exits with -1 when the command exits #12

Closed hummerbliss closed 5 years ago

hummerbliss commented 6 years ago

When ever dtach exits because the command started by dtach finishes it always seems to exit with -1 when started in dtach -n mode.

For example,

strace -s 8192 -ff ./dtach -n /tmp/f /bin/sh -c "sleep 1"

Produces the following snipped output,

<... select resumed> )                  = 1 (in [4])
--- SIGCHLD {si_signo=SIGCHLD, si_code=CLD_EXITED, si_pid=14569, si_uid=1000, si_status=0, si_utime=0, si_stime=0} ---
rt_sigreturn({mask=[]})                 = 1
read(4, 0x7ffdb1138830, 4096)           = -1 EIO (Input/output error)
unlink("/tmp/f")                        = 0
exit_group(1)                           = ?
+++ exited with 1 +++
hummerbliss commented 6 years ago

This patch works for me. Inspired from (https://bugs.python.org/file16863/io-openpty.patch). Is something like this acceptable ?

--- a/master.c
+++ b/master.c
@@ -253,8 +253,12 @@ pty_activity(int s)
    len = read(the_pty.fd, buf, sizeof(buf));

    /* Error -> die */
-   if (len <= 0)
-       exit(1);
+   if (len <= 0) {
+    if (len == -1 && errno == EIO)
+          exit(0);
+    else
+          exit(1);
+  }
crigler commented 6 years ago

Here is an expanded version of the patch which tries to save the exit code. See if it works for you:

--- a/dtach.h
+++ b/dtach.h
@@ -71,6 +71,7 @@
 #include <sys/stat.h>
 #include <sys/socket.h>
 #include <sys/un.h>
+#include <sys/wait.h>

 #ifndef S_ISREG
 #define S_ISREG(m) (((m) & S_IFMT) == S_IFREG)
--- a/master.c
+++ b/master.c
@@ -28,6 +28,8 @@ struct pty
 #endif
    /* Process id of the child. */
    pid_t pid;
+   /* Exit code of the child. */
+   int exit_code;
    /* The terminal parameters of the pty. Old and new for comparision
    ** purposes. */
    struct termios term;
@@ -72,6 +74,14 @@ die(int sig)
    /* Well, the child died. */
    if (sig == SIGCHLD)
    {
+       int status;
+
+       if (waitpid(-1, &status, WNOHANG) > 0)
+       {
+           if (WIFEXITED(status))
+               the_pty.exit_code = WEXITSTATUS(status);
+       }
+
 #ifdef BROKEN_MASTER
        /* Damn you Solaris! */
        close(the_pty.fd);
@@ -134,7 +144,7 @@ init_pty(char **argv, int statusfd)
        printf("%s: could not execute %s: %s\r\n", progname,
               *argv, strerror(errno));
        fflush(stdout);
-       _exit(127);
+       _exit(1);
    }
    /* Parent.. Finish up and return */
 #ifdef BROKEN_MASTER
@@ -254,7 +264,17 @@ pty_activity(int s)

    /* Error -> die */
    if (len <= 0)
-       exit(1);
+   {
+#ifdef EIO
+       if (len < 0 && errno == EIO)
+           len = 0;
+#endif
+
+       if (len == 0)
+           exit(the_pty.exit_code);
+       else
+           exit(1);
+   }

 #ifdef BROKEN_MASTER
    /* Get the current terminal settings. */
dhanjit commented 5 years ago

Any updates on this issue? Results in failure in systemd units when attempting to stop the service

crigler commented 5 years ago

I never received a reply from @hummerbliss saying whether my expanded version of the patch worked for them or not.

Maybe you could try it and see if there are any issues with it?

dhanjit commented 5 years ago

@crigler I tried it.

patched version: ~/resources/dtach/dtach current version: /usr/bin/dtach.

~
❯ ~/resources/dtach/dtach -N /tmp/a1 echo "ASD" && echo "success" || echo "failed"
success

~
❯ /usr/bin/dtach -N /tmp/a1 echo "ASD" && echo "success" || echo "failed"
failed

Also tried with a complex example with rtorrent. It works.

Also the OP's example works.

~
❯ strace -s 8192 -ff ~/resources/dtach/dtach -n /tmp/f /bin/sh -c "sleep 1" |& grep 'exited with'
[pid 12626] +++ exited with 0 +++
[pid 12628] +++ exited with 0 +++
+++ exited with 0 +++

~
❯ strace -s 8192 -ff /usr/bin/dtach -n /tmp/f /bin/sh -c "sleep 1" |& grep 'exited with'
[pid 12641] +++ exited with 0 +++
[pid 12643] +++ exited with 0 +++
+++ exited with 1 +++

However, I would like to ask does this have any bearing on what the executed command returns? Actually I checked with top after I did kill -9 <toppid>

The patched version outputs:

❯ ~/resources/dtach/dtach -N /tmp/a1 top
[1]    12274 killed     ~/resources/dtach/dtach -N /tmp/a1 top

~ 7s
❯ echo $?
137

The same by the nonpatch version.

So I am assuming dtach transmits the executed commands error code.

hummerbliss commented 5 years ago

Apologies for missing the updates. Works well for me.