python / cpython

The Python programming language
https://www.python.org
Other
62.34k stars 29.94k forks source link

"universal newlines" subprocess support broken with select- and poll-based communicate() #56832

Open pitrou opened 13 years ago

pitrou commented 13 years ago
BPO 12623
Nosy @birkenfeld, @gpshead, @pitrou, @bitdancer, @asvetlov, @cjerdonek, @vadmium
Superseder
  • bpo-16903: subprocess.Popen.communicate with universal_newlines=True doesn't accept strings on 3.2
  • Files
  • issue12623_failing_test.diff
  • Note: these values reflect the state of the issue at the time it was migrated and might not reflect the current state.

    Show more details

    GitHub fields: ```python assignee = None closed_at = None created_at = labels = ['tests', 'type-bug', 'library'] title = '"universal newlines" subprocess support broken with select- and poll-based communicate()' updated_at = user = 'https://github.com/pitrou' ``` bugs.python.org fields: ```python activity = actor = 'BreamoreBoy' assignee = 'none' closed = False closed_date = None closer = None components = ['Library (Lib)', 'Tests'] creation = creator = 'pitrou' dependencies = [] files = ['26837'] hgrepos = [] issue_num = 12623 keywords = ['patch'] message_count = 17.0 messages = ['141014', '168317', '168320', '168322', '168325', '168328', '168329', '168330', '168331', '168333', '168335', '168336', '168340', '168557', '169046', '220743', '268346'] nosy_count = 7.0 nosy_names = ['georg.brandl', 'gregory.p.smith', 'pitrou', 'r.david.murray', 'asvetlov', 'chris.jerdonek', 'martin.panter'] pr_nums = [] priority = 'normal' resolution = 'out of date' stage = 'patch review' status = 'open' superseder = '16903' type = 'behavior' url = 'https://bugs.python.org/issue12623' versions = ['Python 3.4', 'Python 3.5'] ```

    pitrou commented 13 years ago

    The select() and poll() loop implementations of Popen.communicate() call os.write() instead of the write() method on the stdin pipe, meaning any newline translation *and* unicode-to-bytes encoding step is skipped.

    To use the write() method on the stdin pipe, we may have to set the file descriptor in non-blocking mode, especially given that _PIPE_BUF worth of characters can amount to more than _PIPE_BUF bytes on the underlying raw fd.

    See bpo-12591 for a simpler issue that was fixed.

    asvetlov commented 12 years ago

    Antoine, can you explain why subprocess support for universal_newlines is broken?

    As I can see tests for universal_newlines passed and these looks correct.

    In general I like your idea to get rid of os.write, but maybe that patch should be landed in 3.4?

    If not — I will prepare fix ASAP.

    cjerdonek commented 12 years ago

    Andrew, I'm not sure if this is the issue, but it seems like the only tests in which input is passed to communicate() with universal newlines is when stdin is the only PIPE, i.e.:

        def test_universal_newlines_communicate_stdin(self):
            # universal newlines through communicate(), with only stdin

    IIRC, the select() and poll() implementations are only called when at least two of the streams are PIPE (like in the new input_none test I added recently). Have you tried passing input to communicate() with at least two pipes (e.g. stdin and stdout)?

    pitrou commented 12 years ago

    As Chris said. Look at the POSIX version of _communicate(), nowhere is input given the newlines/encoding treatment.

    asvetlov commented 12 years ago

    I've pushed test for piping stdin, stdout and and stderr: 4af2c0687970 What other test we should to add?

    asvetlov commented 12 years ago

    Оор. I see. Pushing to communicate input with "\r" (see attached patch) produces the error.

    Will work on fixing.

    Thanks.

    asvetlov commented 12 years ago

    I like to set status for the issue to "Release blocker" if Georg Brandl agree with that.

    pitrou commented 12 years ago

    Оор. I see. Pushing to communicate input with "\r" (see attached patch) produces the error.

    Hmm, no, it's the reverse. Pushing input with "\n" should produce b"\r\n" on the other side, under Windows.

    pitrou commented 12 years ago

    Well, it's not a regression and it may be slightly delicate, so I don't think it should be a release blocker.

    asvetlov commented 12 years ago

    The main question: can be fix applied to 3.3 or it can wait for 3.4? 3.2 has the same problem BTW.

    cjerdonek commented 12 years ago

    Pushing to communicate input with "\r" (see attached patch) produces the error.

    Is this a supported use case? In universal newlines, stdin line endings are supposed to be "\n". From the subprocess documentation: "For stdin, line ending characters '\n' in the input will be converted to the default line separator os.linesep."

    asvetlov commented 12 years ago

    The real test should to put '\n' and subprocess should get '\r\n', right? Looking the code this test will fail on Windows. I cannot check it just now but looks like this is the problem.

    cjerdonek commented 12 years ago

    Looking the code this test will fail on Windows. I cannot check it just now but looks like this is the problem.

    Would it be possible to do something like the following to check this on a non-Windows machine (since the Python implementation of io respects the mutability of os.linesep but perhaps not the C version)? It seems so.

    >>> import _pyio, io, os
    >>> io.TextIOWrapper = _pyio.TextIOWrapper
    >>> os.linesep = "\r\n"
    etc.
    birkenfeld commented 12 years ago

    The main question: can be fix applied to 3.3 or it can wait for 3.4? 3.2 has the same problem BTW.

    I don't see any fix attached :)

    If it is a bug, it can be fixed before 3.3rc1, no need for release blocker status.

    asvetlov commented 12 years ago

    I like to leave fixes to 3.4. Any change can produce side-effects, which can be nightmare for upcoming release candidate. Sure, Georg will share my opinion. Though absence '\n' -> '\r\n' for input if OS is Windows and universal_newlines=True is not good.

    83d2e70e-e599-4a04-b820-3814bbdb9bef commented 10 years ago

    Can we have an update on this please, I'm assuming that the priority should now be marked as normal?

    vadmium commented 8 years ago

    . Summary: There was originally a bug, but it has been fixed. At best, we leave this open to work on including Andrew’s patch.

    Andrew’s patch adds a modified copy of test_universal_newlines_communicate_stdin(). But it does not look correct, and would fail on Windows. It needs to avoid decoding newlines in the child, like in revision 150fa296f5b9. IMO the existing test_universal_newlines_communicate_stdin() test should also be adjusted to verify that os.linesep was sent to the child process.

    Also, I don’t understand the SETBINARY business. Aren’t stdin etc set to binary mode by default in Python 3? Yes, it would be required for Python 2 compatibility, but if this patch is only for Python 3, why do we need it?

    Anyway, Antoine opened this bug specifically about the “select- and poll-based” implementation (now based on the new selectors module). That implementation is only used with multiple pipes. So I don’t see how the patch is relevant to the original issue (although it may be worthwhile updating and adding anyway).

    Regarding Antoine’s original report, we now do encode text strings to bytes. This was fixed as a side effect of revision c4a0fa6e687c in 3.3 (added timeout=... parameter), and directly in 3.2 by bpo-16903.

    As for newline translation, I’m not sure if that is really relevant. The selectors implementation is only used if sys.platform != "win32", while os.linesep translation only needs to happen when 'posix' not in sys.builtin_module_names. I suspect in all cases where the selectors implementation is used, and os.linesep is "\n", so it is not actually a bug.