flux-framework / flux-core

core services for the Flux resource management framework
GNU Lesser General Public License v3.0
167 stars 50 forks source link

flux-job attach not behaving nicely when detached from debugger #2599

Closed dongahn closed 4 years ago

dongahn commented 4 years ago

I launched a job under totalview control like

totalview -oldui -verbosity error --args flux mini run -N 2 -n 2 -o stop-tasks-in-exec ./sleep_print 360 5

Once the job started running, I group-detached from the job so totalview is completely detached both from the parallel program and flux-job itself and then I exited the tool. flux job attach process continued to print out the outputs until I entered something into the terminal. But once I typed in something into the controlling terminal, the system puts flux job attach into the stop state. Probably, SIGTSTP is raised, which caused this process to stop.

USER       PID %CPU %MEM    VSZ   RSS TTY      STAT START   TIME COMMAND
ahn1     14083  0.0  0.2  58156  5080 pts/0    T    01:50   0:00 flux-job attach 250646959751168

This is problematic, though, because it turned out this process doesn't show up as a job with the jobs command in the original shell (so that I could switch it back to the foreground). Sending SIGCONT doesn't continue it either.

I didn't see this issue on srun. So there must be some tricks that it uses... Maybe handle SIGTSTP more gracefully?

dongahn commented 4 years ago

I sent an email to @jdelsign to get his opinion about this. He may have seen this misbehavior from other resource managers.

From yesterday's discussion, we may be able to get around this issue by installing a signal handler for SIGTSTP within flux-job and ignore this signal. Unlike SIGSTOP, SIGTSTP should be maskable.

But if we ignore it, one concern would be stdin. We probably don't want a background process to access stdin interfacing with the foreground process.

grondo commented 4 years ago

@dongahn, I tried to reproduce this issue simply by running a job with some output and attaching to flux job attach with gdb, then detaching. I wasn't able to reproduce the issue in this manner. Am I obviously missing something here?

I also had no problem putting flux job attach in the background and bringing it back into the foreground.

dongahn commented 4 years ago

But if we ignore it, one concern would be stdin. We probably don't want a background process to access stdin interfacing with the foreground process.

Sorry, I spoke too early. The SIGTSTP signal handler essentially is so that the background process doesn't interface with stdin...

dongahn commented 4 years ago

@dongahn, I tried to reproduce this issue simply by running a job with some output and attaching to flux job attach with gdb, then detaching. I wasn't able to reproduce the issue in this manner. Am I obviously missing something here?

Hmmm. Ok. Let me try this w/ gdb as well. Maybe looking at the difference between totalview and gdb, the problem becomes more clear.

I also had no problem putting flux job attach in the background and bringing it back into the foreground.

Yes, fg/bg of the processes spawned at the shell should work fine.

The problem for me was when flux-job was forked and exec'ed by totalview. Presumably, this flux-job process doesn't belong to the process group of the top level shell. And when I detached the controlling totalview from this process, it became orphaned with its PPID becomes 1, which appeared to be outside the purview of job control at the top shell.

grondo commented 4 years ago

it became orphaned with its PPID becomes 1, which appeared to be outside the purview of job control at the top shell.

Ah, that is the difference then. I was running flux job attach from one shell and attaching gdb from another.

dongahn commented 4 years ago

From @jdelsign:

This one I'm not sure I understand 100%. I do know that job control in the shell (to arbitrate access to stdin) is a trick business. It's been a while since I looked at it in detail, but IIRC, the shell manages a "pid" property in the tty so that the kernel knows which process(es) currently own(s) the TTY. If a process tries to read from stdin but doesn't own the TTY at that moment, it is sent a SIGTSTP.

TotalView doesn't do anything special for job control, but maybe it should. Dunno.

grondo commented 4 years ago

I used the following python script to orphan the flux job attach process and try to reproduce the original issue reported above:

import os
import sys
if not os.fork():
    os.execvp(sys.argv[1], sys.argv[1:])
$ flux mini submit -n1 sh -c 'while :; do sleep 3; date; done'
242212667392
(flux-PYQbrh) grondo@fluke108:~/git/f.git$ python3 ./fork.py flux job attach 242212667392
(flux-PYQbrh) grondo@fluke108:~/git/f.git$ Wed Jan  8 11:59:52 PST 2020
Wed Jan  8 11:59:55 PST 2020
Wed Jan  8 11:59:58 PST 2020
Wed Jan  8 12:00:01 PST 2020
Wed Jan  8 12:00:04 PST 2020
USER       PID %CPU %MEM    VSZ   RSS TTY      STAT START   TIME COMMAND
grondo   18066  0.0  0.0  68484  2888 pts/5    S    12:00   0:00 /g/g0/grondo/git/f.git/src/cmd/.libs/lt-flux-job attach 242212667392

As you can see the process wasn't stopped, and was still able to write output to the terminal, though it wasn't processing input. The flux job attach process was reparented to init and couldn't be foregrounded.

So I still don't have a good reproducer for the TV issue.

When running this test with srun, were you able to foreground the srun process after detaching from TV?

dongahn commented 4 years ago

@grondo: You are almost there. While flux-job is printing to the terminal, type any character to the terminal and enter at the top shell so that SIGTSTP will be raised.

ahn1@5bd83ef62aec:/usr/src$ flux mini submit -n1 sh -c 'while :; do sleep 3; date; done'
28957474816
ahn1@5bd83ef62aec:/usr/src$ python3 ./fork.py flux job attach 28957474816
ahn1@5bd83ef62aec:/usr/src$ MPIR_being_debugged: 0
Wed Jan  8 20:53:46 UTC 2020
Wed Jan  8 20:53:49 UTC 2020
Wed Jan  8 20:53:52 UTC 2020

ahn1@5bd83ef62aec:/usr/src$ ps x
  PID TTY      STAT   TIME COMMAND
    1 pts/0    Ss     0:00 /bin/bash
 9927 pts/0    S      0:00 start -s 2
 9928 pts/0    Sl     0:00 /usr/libexec/flux/cmd/flux-broker --setattr=rundir=/tmp/flux-9927-gsYBCE --setattr=tbon.endpoint=ipc://%B/req
 9929 pts/0    Sl     0:00 /usr/libexec/flux/cmd/flux-broker --setattr=rundir=/tmp/flux-9927-gsYBCE --setattr=tbon.endpoint=ipc://%B/req
 9977 pts/0    S      0:00 /bin/bash
 9983 pts/0    S      0:00 /usr/libexec/flux/flux-shell 28957474816
 9984 pts/0    S      0:00 sh -c while :; do sleep 3; date; done
 9991 pts/0    T      0:00 job attach 28957474816
 9995 pts/0    S      0:00 sleep 3
 9996 pts/0    R+     0:00 ps x
grondo commented 4 years ago

That is essentially what I did above.

Just a hunch, have you reproduced this behavior outside of Docker on the Mac?

(flux-LquY9G):~/git/f.git$ flux mini submit -n1 sh -c 'while :; do sleep 3; date; done'
90764738560
(flux-LquY9G):~/git/f.git$ python3 ./fork.py flux job attach 90764738560
(flux-LquY9G):~/git/f.git$ Wed Jan  8 14:02:12 PST 2020
Wed Jan  8 14:02:15 PST 2020
Wed Jan  8 14:02:18 PST 2020
Wed Jan  8 14:02:21 PST 2020
ps auxw | grep attach
grondo   26066  0.0  0.0  68484  2888 pts/5    S    14:02   0:00 /g/g0/grondo/git/f.git/src/cmd/.libs/lt-flux-job attach 90764738560
grondo   26115  0.0  0.0   9096   668 pts/5    S+   14:02   0:00 grep attach
(flux-LquY9G):~/git/f.git$ Wed Jan  8 14:02:27 PST 2020
Wed Jan  8 14:02:30 PST 2020
Wed Jan  8 14:02:33 PST 2020
kill -INT 26066
flux-job: one more ctrl-C within 2s to cancel or ctrl-Z to detach
(flux-LquY9G):~/git/f.git$ Wed Jan  8 14:02:36 PST 2020
kill -INT 26066
28.270s: job.exception type=cancel severity=0 interrupted by ctrl-C
flux-job: task(s) exited with exit code 143
dongahn commented 4 years ago

Just a hunch, have you reproduced this behavior outside of Docker on the Mac?

I haven't tried this. But this maybe it. (Either Docker on Mac or the Linux kernel level being used in Ubuntu: Linux 5bd83ef62aec 4.9.184-linuxkit #1 SMP Tue Jul 2 22:58:16 UTC 2019 x86_64 x86_64 x86_64 GNU/Linux)

grondo commented 4 years ago

I was able to reproduce this problem under Docker on both Linux and MacOS. Perhaps some difference in the way the pty works under docker run.

It turns out that flux job attach is being stopped by SIGTTIN. When you try to send SIGCONT to the stopped process, it continues, immediately tries to read from stdin and is sent SIGTTIN again.

I'm confused why the process is even trying to read from stdin -- it seems like a process in the "background" (i.e. without access to the tty) shouldn't register activity on the stdin file descriptor, but maybe I'm wrong about that.

In testing, setting SIGTTIN to SIG_IGN worked around this problem, though I also had to add some other minor fixes to stop watching stdin on the first error.

I'll try to post a proposed fix soon.

dongahn commented 4 years ago

Oh cool!

I'm confused why the process is even trying to read from stdin

Does flux-job attach have stdin forwarding?

it seems like a process in the "background" (i.e. without access to the tty) shouldn't register activity on the stdin file descriptor, but maybe I'm wrong about that.

Maybe a kernel difference or pseudo terminal semantics difference under docker run like you suspected.

setting SIGTTIN to SIG_IGN worked around this problem

How did you do this out of curiosity?

grondo commented 4 years ago

Does flux-job attach have stdin forwarding?

Yes.

How did you do this out of curiosity?

At first I tried adding a signal watcher for SIGTTIN in the hopes that we could shut down the stdin watcher from there. However, for some reason this didn't work. The handler was never called and instead the flux-job process spun on read() from stdin (getting an error from each iteration).

What worked better was to just add

signal (SIGTTIN, SIG_IGN);

In the flux-job attach code.

grondo commented 4 years ago

BTW, I learned while working on this issue that SIGTTOU is only generated when the terminal has the TOSTOP mode set (stty tostop). This is not the default. A process can also ignore SIGTTOU so that it is not stopped even when the TOSTOP terminal mode is active. Not sure if that makes sense for flux job attach.

dongahn commented 4 years ago

Not sure if that makes sense for flux job attach.

I think it makes sense to add this to make the command tostop-terminal-free.

grondo commented 4 years ago

Ok, I will propose a PR that ignores both SIGTTIN and SIGTTOU in flux-job attach, along with some other fixes I found along the way. Thanks!

dongahn commented 4 years ago

Thanks!