python / cpython

The Python programming language
https://www.python.org

test_input_tty hangs when run multiple times in the same process on macOS 10.15 #89050

Open ambv opened 3 years ago

ambv commented 3 years ago
BPO 44887
Nosy @vstinner, @ambv, @Fidget-Spinner, @akulakov

Note: these values reflect the state of the issue at the time it was migrated and might not reflect the current state.


GitHub fields:
```python
assignee = None
closed_at = None
created_at =
labels = ['3.8', '3.9', '3.10', '3.11']
title = 'test_input_tty hangs when run multiple times in the same process on macOS 10.15'
updated_at =
user = 'https://github.com/ambv'
```

bugs.python.org fields:
```python
activity =
actor = 'andrei.avk'
assignee = 'none'
closed = False
closed_date = None
closer = None
components = []
creation =
creator = 'lukasz.langa'
dependencies = []
files = []
hgrepos = []
issue_num = 44887
keywords = []
message_count = 10.0
messages = ['399380', '399382', '399384', '399385', '399396', '399398', '399412', '399413', '400631', '406262']
nosy_count = 4.0
nosy_names = ['vstinner', 'lukasz.langa', 'kj', 'andrei.avk']
pr_nums = []
priority = 'low'
resolution = None
stage = 'test needed'
status = 'open'
superseder = None
type = None
url = 'https://bugs.python.org/issue44887'
versions = ['Python 3.8', 'Python 3.9', 'Python 3.10', 'Python 3.11']
```

ambv commented 3 years ago

(I'm still investigating at the moment whether something changed in my environment.)

Running the following right now hangs on test_input_tty for me:

./python.exe -m test test_builtin test_builtin -v

This fails on all branches down to and including 3.7, so I assume it is environment-specific. The alternative would be a regression from a change that was backported all the way back to 3.7, which is out of the question as the last functional commit on 3.7 was back in June.

Things I tried so far:

The test in question uses readline if available, and sysconfig.get_config_vars()['HAVE_LIBREADLINE'] returns 1. I'll be trying to check if that works for me next.
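Both conditions can be checked from a script. This is a minimal sketch using only the standard library; the helper name `readline_build_info` is made up for illustration and is not part of the test suite.

```python
import sysconfig


def readline_build_info():
    """Report whether this build was configured with readline and can import it."""
    # The key may be missing on builds configured without the library
    # (or on other platforms), so use .get() instead of indexing.
    configured = sysconfig.get_config_vars().get("HAVE_LIBREADLINE")
    try:
        import readline  # noqa: F401  (the import itself is the check)
        importable = True
    except ImportError:
        importable = False
    return {"HAVE_LIBREADLINE": configured, "importable": importable}


print(readline_build_info())
```

On the build described above, `HAVE_LIBREADLINE` would be expected to show 1; other builds may report `None`.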

ambv commented 3 years ago

Hynek confirmed on Big Sur with Python 3.9.5 from asdf that test_input_tty also hangs when run a second time in the same process.

Moreover, readline is not it. First of all, it's libedit on macOS:

❯ ll /usr/lib/libreadline.dylib
lrwxr-xr-x 1 root wheel 15B Feb 2 2020 /usr/lib/libreadline.dylib -> libedit.3.dylib

So Python uses that by default:
>>> import readline
>>> readline._READLINE_LIBRARY_VERSION
'EditLine wrapper'
>>> readline._READLINE_RUNTIME_VERSION
1026
>>> readline._READLINE_VERSION
1026

Unless you instruct it to use readline (for example by providing "-I$(brew --prefix readline)/include" to CFLAGS and "-L$(brew --prefix readline)/lib" to LDFLAGS before running ./configure):
>>> import readline
>>> readline._READLINE_LIBRARY_VERSION
'8.1'
>>> readline._READLINE_RUNTIME_VERSION
2049
>>> readline._READLINE_VERSION
2049

The hang is the same in both cases.
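For anyone repeating this comparison, the backend can be told apart programmatically. The sketch below relies on the 'EditLine wrapper' version string shown above and on the documented convention of looking for "libedit" in `readline.__doc__`; the function name `classify_readline` is hypothetical.

```python
def classify_readline(library_version, doc):
    """Guess the line-editing backend from the readline module's metadata."""
    if library_version == "EditLine wrapper" or "libedit" in (doc or ""):
        return "libedit"
    return "GNU readline"


try:
    import readline
except ImportError:
    print("readline module not available")
else:
    # _READLINE_LIBRARY_VERSION is a private attribute, hence the getattr().
    print(classify_readline(
        getattr(readline, "_READLINE_LIBRARY_VERSION", ""),
        readline.__doc__,
    ))
```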

Next course of action: checking whether the hang is due to fork shenanigans in _run_child():

https://github.com/python/cpython/blob/1841c70f2bdab9d29c1c74a8afffa45d5555af98/Lib/test/test_builtin.py#L2001

ambv commented 3 years ago

Parent process hangs on:

where "name" in frame #7 is the "readinto" method of <_io.FileIO name=3 mode='rb' closefd=True> and "arg" is <memory at 0x102f5d090>.

Child process hangs on:

ambv commented 3 years ago

This might be a long-standing problem. I haven't encountered it before because I was always running -R: with -j and in this case the test is skipped:

test_input_tty (test.test_builtin.PtyTests) ... skipped 'stdin and stdout must be ttys'
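The skip message points at a simple precondition: both standard streams must be terminals. Under -j the tests presumably run in worker processes whose streams are not ttys, which would explain the skip. A rough sketch of that check follows (the helper names are hypothetical, and this is not the test's actual code):

```python
import io
import sys


def _is_tty(stream):
    # A missing stream (None) cannot be a terminal.
    return stream is not None and stream.isatty()


def can_run_tty_test(stdin, stdout):
    """Mirror the 'stdin and stdout must be ttys' precondition."""
    return _is_tty(stdin) and _is_tty(stdout)


# Result depends on how the script is launched (terminal vs. pipe).
print(can_run_tty_test(sys.stdin, sys.stdout))
# In-memory streams are never terminals, so this prints False.
print(can_run_tty_test(io.StringIO(), io.StringIO()))
```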

ambv commented 3 years ago

Amazingly, excluding every other test function with a bunch of -i patterns still makes it hang when run twice. On the other hand, only including the test function with -m works fine.

This is very weird. Looking further.

Semi-relatedly, I found BPO-26228, could reproduce it, and finished an open PR on it. While those are separate issues, I'm hoping to solve them both.

ambv commented 3 years ago

I found the high-level reason why test_builtin hangs: it runs doctests as well. What's the root cause? I don't know yet.

But to confirm, I can also hang the tests by running:

$ python3.9 -m test test_doctest test_builtin -v

Now to discover what it is that doctest does...

ambv commented 3 years ago

The doctest runner sets an output-redirecting debugger, which subclasses Pdb, around actually running the doctest. This action causes the hang. New finding: we can hang the test with test_pdb too:

$ python3.9 -m test test_pdb test_builtin -v

ambv commented 3 years ago

It *is* readline-related after all O_O

Commenting out this section in Pdb.__init__ makes the issue go away: https://github.com/python/cpython/blob/64a7812c170f5d46ef16a1517afddc7cd92c5240/Lib/pdb.py#L234-L239

time ./python.exe -E -Wd -m test test_builtin test_builtin
0:00:00 load avg: 2.12 Run tests sequentially
0:00:00 load avg: 2.12 [1/2] test_builtin
0:00:00 load avg: 2.12 [2/2] test_builtin

== Tests result: SUCCESS ==

All 2 tests OK.

Total duration: 1.3 sec
Tests result: SUCCESS
        1.56 real         1.42 user         0.10 sys

I'll be continuing on this tomorrow to find the root cause.
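The side effect in question can be observed without running the test suite. The sketch below assumes, based on the linked section, that creating a Pdb instance attempts to import readline; it only records whether the module is loaded before and after, and the function name is made up for illustration.

```python
import pdb
import sys


def pdb_init_loads_readline():
    """Return (loaded_before, loaded_after) around creating a Pdb instance."""
    # Note: something else in the process (including "import pdb" itself on
    # some versions) may already have imported readline.
    loaded_before = "readline" in sys.modules
    pdb.Pdb(readrc=False)  # readrc=False: do not execute any .pdbrc file
    loaded_after = "readline" in sys.modules
    return loaded_before, loaded_after


print(pdb_init_loads_readline())
```

If the second value is True on a build with readline, any test that instantiates Pdb (directly, or through doctest) leaves readline loaded for whatever runs next in the same process.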

vstinner commented 3 years ago

Is it related to https://bugs.python.org/issue41034 ?

akulakov commented 2 years ago

I've looked into this and the hang happens on this line:

https://github.com/python/cpython/blob/de3db1448b1b983eeb9f4498d07e3d2f1fb6d29d/Lib/test/test_builtin.py#L2030

So the issue is that on the second run, there's nothing to read on that fd. I've tried using os.stat to check if there's data on the fd, but it reported a size of 0 on both the 1st and 2nd runs.

However, if a small sleep is added before calling os.stat, it reports the size of the data on the 1st run and 0 on the 2nd run, meaning it's possible to avoid the hang and error out instead (is that an improvement?)
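One way to "error out instead of hanging" without a sleep is to wait for readability with a timeout. This is a different technique from the os.stat check described above: a sketch using `select.select`, with a hypothetical helper name, demonstrated on a pipe rather than on the pty from the test.

```python
import os
import select


def read_with_timeout(fd, nbytes, timeout):
    """Read up to nbytes from fd, raising TimeoutError if no data arrives."""
    ready, _, _ = select.select([fd], [], [], timeout)
    if not ready:
        raise TimeoutError(f"no data on fd {fd} after {timeout} s")
    return os.read(fd, nbytes)


# Demonstration on a pipe: the first read succeeds, the second times out.
r, w = os.pipe()
os.write(w, b"quux")
print(read_with_timeout(r, 4, 1.0))
try:
    read_with_timeout(r, 4, 0.1)
except TimeoutError as exc:
    print("timed out:", exc)
finally:
    os.close(r)
    os.close(w)
```

This would turn the hang into a test error, but it does not explain why the child writes nothing on the second run.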

This is on macOS 11.4 Big Sur, by the way.

This is my test debug branch:

https://github.com/python/cpython/compare/main...akulakov:Test-check_input_tty-FIX?expand=1

gvanrossum commented 11 months ago

Still seen on Ventura 13.5.2 in Python 3.13 (i.e., the main branch). Would it be possible to get a fix?

vstinner commented 11 months ago

test_builtin.test_input_tty() calls forkpty(), whereas recent macOS versions are known to have issues with fork().

forkpty() manual page:

The forkpty() function combines openpty(), fork(2), and login_tty() to create a new process operating in a pseudoterminal.

Maybe the test should be rewritten with openpty() and login_tty(), but replace fork() with subprocess.Popen()?
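A rough sketch of that direction, under several assumptions: it uses `pty.openpty()` and `subprocess.Popen()` only, it does not call login_tty (so the child gets terminal file descriptors but no controlling terminal), and it has not been checked against the macOS hang. `run_on_pty` and `CHILD_CODE` are hypothetical names, not code from the test suite.

```python
import os
import pty
import select
import subprocess
import sys

# Child program: report whether its stdin and stdout are terminals.
CHILD_CODE = "import sys; print(sys.stdin.isatty() and sys.stdout.isatty())"


def run_on_pty(code, timeout=10.0):
    """Run `code` in a fresh interpreter attached to a new pseudoterminal."""
    master_fd, slave_fd = pty.openpty()
    try:
        try:
            proc = subprocess.Popen(
                [sys.executable, "-c", code],
                stdin=slave_fd, stdout=slave_fd, stderr=slave_fd,
            )
        finally:
            # The parent only needs the master side.
            os.close(slave_fd)
        chunks = []
        try:
            while True:
                ready, _, _ = select.select([master_fd], [], [], timeout)
                if not ready:
                    proc.kill()
                    raise TimeoutError("no output from child")
                try:
                    data = os.read(master_fd, 1024)
                except OSError:
                    break  # Linux raises EIO once the child side is closed
                if not data:
                    break  # other platforms may report end-of-file instead
                chunks.append(data)
        finally:
            proc.wait()
        return b"".join(chunks)
    finally:
        os.close(master_fd)


print(run_on_pty(CHILD_CODE))
```

Because the child is started with Popen rather than fork(), it does not inherit the parent's in-process state such as already-imported modules, which is the property the suggestion is after.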

glibc implementation:

int
__forkpty (int *pptmx, char *name, const struct termios *termp,
           const struct winsize *winp)
{
  int ptmx, terminal, pid;

  if (openpty (&ptmx, &terminal, name, termp, winp) == -1)
    return -1;

  switch (pid = __fork ())
    {
    case -1:
      __close (ptmx);
      __close (terminal);
      return -1;

    case 0:
      /* Child.  */
      __close (ptmx);
      if (login_tty (terminal))
        _exit (1);

      return 0;

    default:
      /* Parent.  */
      *pptmx = ptmx;
      __close (terminal);

      return pid;
    }
}

sobolevn commented 5 months ago

It happens for me now every time I run ./python.exe -m test test_builtin -v

test_input_no_stdout_fileno (test.test_builtin.PtyTests.test_input_no_stdout_fileno) ... ok
test_input_tty (test.test_builtin.PtyTests.test_input_tty) ... ^C

== Tests result: INTERRUPTED ==

1 test omitted:
    test_builtin

Test suite interrupted by signal SIGINT.

Total duration: 37.5 sec
Total tests: run=0
Total test files: run=0/1
Result: INTERRUPTED

Even ./python.exe -m test test_builtin -v -m test_input_tty hangs:

» ./python.exe -m test test_builtin -v -m test_input_tty
== CPython 3.14.0a0 (heads/readydefault:4cb2d40830f, May 10 2024, 15:38:37) [Clang 15.0.0 (clang-1500.3.9.4)]
== macOS-14.4.1-arm64-arm-64bit-Mach-O little-endian
== Python build: debug
== cwd: /Users/sobolev/Desktop/cpython2/build/test_python_worker_94096æ
== CPU count: 12
== encodings: locale=UTF-8 FS=utf-8
== resources: all test resources are disabled, use -u option to unskip tests

Using random seed: 3295473582
0:00:00 load avg: 1.55 Run 1 test sequentially
0:00:00 load avg: 1.55 [1/1] test_builtin
test_input_tty (test.test_builtin.PtyTests.test_input_tty) ... 

😢

Is it related to the new REPL?