ksh93 / ksh

ksh 93u+m: KornShell lives! | Latest release: https://github.com/ksh93/ksh/releases
Eclipse Public License 2.0
192 stars 32 forks source link

`sleep 10 | read` hangs interactive shell after ^Z #750

Open McDutchie opened 6 months ago

McDutchie commented 6 months ago

Found on comp.unix.shell:

From: Christian Weisgerber <naddy@mips.inka.de>
Newsgroups: comp.unix.shell
Subject: ksh93: pipelines vs. job control
Date: Mon, 13 May 2024 14:35:32 -0000 (UTC)
Message-ID: <slrnv4499k.30g3.naddy@lorvorc.mips.inka.de>
User-Agent: slrn/1.0.3 (FreeBSD)

... or "how to accidentally hang your ksh93 session".

Bash has this shell option:

    lastpipe
            If set, and job control is not active, the shell runs
            the last command of a pipeline not executed in the
            background in the current shell environment.

The bit about job control sounds like a weird stipulation, but makes
sense once you think about it.  I wonder how ksh93 handles that.
Famously, the AT&T ksh executes the last command of a pipeline in
the current shell:

  $ x=foo
  $ { /bin/sleep 10; echo bar; } | read x
  $ echo $x
  bar

However, a background pipeline runs in a subshell, as documented
in the man page:

  $ x=foo
  $ { /bin/sleep 10; echo bar; } | read x &
  [1]     78726
  $ 
  [1] +  Done                    { /bin/sleep 10; echo bar; } | read x &
  $ echo $x  
  foo

But we are in job control environment.  I can start a pipeline in
the foreground, suspend it, and put it into the background.  ksh93
can't know about that in advance.

  $ { /bin/sleep 10; echo bar; } | read x  
  ^Z

... and now the pipeline is suspended...

    PID  PGID STAT   COMMAND
  71496 71496 S      - ksh93 ksh93
  62980 62980 T+     `-- ksh93 ksh93
   3778 62980 T+p      `-- /bin/sleep 10

... and you're stuck.  There's no shell prompt, so you can't bg or fg
anything, intr (^C) or quit (^\) don't help, and there's no tty
control character to continue a suspended process (group).

You need to take radical measures like hanging up, or sending a
SIGCONT or SIGKILL from a different terminal.

I can't quite tell if this behavior constitutes a bug, but it seems
to follow from the design decision to not run the last command of
a pipeline in a subshell.  And it can trap the unwary.

A more minimal example:

  $ /bin/sleep 10 | read
  ^Z

  PID  PGID STAT   COMMAND
  30796 30796 S      - ksh93 ksh93
  31981 31981 T+p    `-- /bin/sleep 10

-- 
Christian "naddy" Weisgerber                          naddy@mips.inka.de
lzaoral commented 4 months ago

Thank you for the pointers in https://github.com/ksh93/ksh/pull/763#issuecomment-2249668652, @McDutchie! I'd say it is a different bug than #750, tough. The following line seems to cause the problem:

https://github.com/ksh93/ksh/blob/9ea4e32028411e2ad2dedb89974085937d9d7980/src/cmd/ksh93/bltins/sleep.c#L143

t is a double but inf cannot be properly represented in int so the assignment of n will evaluate -1 on my machine when debugged in GDB:

...
(gdb) p t
$6 = inf
(gdb) p n
$7 = -1
...

This conversion is actually not valid in C as shown by this experiment

#include <math.h>
#include <stdio.h>

int main(void) {
    double d = INFINITY;
    printf("double = %f, int = %d\n", d, (int) d);
}
$ clang -fsanitize=undefined inf.c && ./a.out
inf.c:6:42: runtime error: inf is outside the range of representable values of type 'int'
SUMMARY: UndefinedBehaviorSanitizer: undefined-behavior inf.c:6:42
double = inf, int = 2147483647

and the C99 standard itself:

6.3.1.4 Real floating and integer

When a finite value of real floating type is converted to an integer type other than _Bool, the fractional part is discarded (i.e., the value is truncated toward zero). If the value of the integral part cannot be represented by the integer type, the behavior is undefined.50)

McDutchie commented 4 months ago

Thanks for that. Looks like sleep inf was simply unimplemented and accidentally appeared to work. Also, since the Tv_t members are of type uint32_t, n should match that.

Here is a patch that should fix undefined behaviour and implement sleep inf on C99 and up. Please test:

diff --git a/src/cmd/ksh93/bltins/sleep.c b/src/cmd/ksh93/bltins/sleep.c
index e849ef44b..a899e8279 100644
--- a/src/cmd/ksh93/bltins/sleep.c
+++ b/src/cmd/ksh93/bltins/sleep.c
@@ -29,6 +29,7 @@
 #include   <error.h>
 #include   <errno.h>
 #include   <tmx.h>
+#include   <ast_float.h>
 #include   "builtins.h"
 #include   "FEATURE/time"
 #include   "FEATURE/poll"
@@ -140,9 +141,27 @@ skip:
  */
 void sh_delay(double t, int sflag)
 {
-   int n = (int)t;
+   uint32_t n;
    Tv_t ts, tx;

+#if _lib_fpclassify
+   switch (fpclassify(t))
+   {
+   case FP_NAN:
+       errormsg(SH_DICT,ERROR_exit(1),e_number,"NaN");
+       UNREACHABLE();
+   case FP_INFINITE:
+       ts.tv_sec = 0xFFFFFFFF;  /* uint32_t max */
+       ts.tv_nsec = 0;
+       while (1)
+       {
+           tvsleep(&ts, NULL);
+           if ((sh.trapnote & (SH_SIGSET | SH_SIGTRAP)) || sflag)
+               return;
+       }
+   }
+#endif
+   n = (uint32_t)t;
    ts.tv_sec = n;
    ts.tv_nsec = 1000000000 * (t - (double)n);
    while(tvsleep(&ts, &tx) < 0)
lzaoral commented 3 months ago

Thank you, @McDutchie! I confirm that your patch fixes the issue described in https://github.com/ksh93/ksh/pull/763#issuecomment-2249668652. clock_nanosleep is now called only once and with correct values!

$ strace arch/linux.arm64-64/bin/ksh -c 'sleep inf'
...
rt_sigaction(SIGWINCH, {sa_handler=0x44ee5c, sa_mask=[], sa_flags=SA_INTERRUPT}, {sa_handler=SIG_DFL, sa_mask=[], sa_flags=0}, 8) = 0
rt_sigaction(SIGALRM, {sa_handler=0x44ee5c, sa_mask=[], sa_flags=SA_INTERRUPT}, {sa_handler=SIG_DFL, sa_mask=[], sa_flags=0}, 8) = 0
clock_nanosleep(CLOCK_REALTIME, 0, {tv_sec=4294967295, tv_nsec=0},