python / cpython

The Python programming language
https://www.python.org

when forking, buffered output is not flushed first. #61432

Closed 8ca6f4de-1a5f-4e3f-8fb4-88e4f2f60513 closed 11 years ago

8ca6f4de-1a5f-4e3f-8fb4-88e4f2f60513 commented 11 years ago
BPO 17230
Nosy @gpshead, @vstinner, @bitdancer
Files
  • test.py
  Note: these values reflect the state of the issue at the time it was migrated and might not reflect the current state.

    GitHub fields:
    ```python
    assignee = None
    closed_at =
    created_at =
    labels = ['type-bug']
    title = 'when forking, buffered output is not flushed first.'
    updated_at =
    user = 'https://bugs.python.org/jortbloem'
    ```
    bugs.python.org fields:
    ```python
    activity =
    actor = 'gregory.p.smith'
    assignee = 'none'
    closed = True
    closed_date =
    closer = 'gregory.p.smith'
    components = ['None']
    creation =
    creator = 'jort.bloem'
    dependencies = []
    files = ['29119']
    hgrepos = []
    issue_num = 17230
    keywords = []
    message_count = 9.0
    messages = ['182346', '182347', '182349', '182350', '182351', '182353', '182354', '182363', '182366']
    nosy_count = 4.0
    nosy_names = ['gregory.p.smith', 'vstinner', 'r.david.murray', 'jort.bloem']
    pr_nums = []
    priority = 'normal'
    resolution = 'wont fix'
    stage = None
    status = 'closed'
    superseder = None
    type = 'behavior'
    url = 'https://bugs.python.org/issue17230'
    versions = ['Python 2.6', 'Python 2.7']
    ```

    8ca6f4de-1a5f-4e3f-8fb4-88e4f2f60513 commented 11 years ago

    When calling os.fork() without a tty, a process reporting the parent's pid runs code BEFORE the fork().

    When running on a tty, it behaves as expected: both parent and child continue running from the statement immediately after os.fork().

    See attached test.py, compare running it with tty vs. without.

    To run without a tty: ssh localhost `pwd`/test.py

    vstinner commented 11 years ago

    There is no attached file.

    I don't think that it's a bug that the child starts before the parent. It depends on the OS, OS version and many other things. You should use a synchronization mechanism to ensure the execution order.

    bitdancer commented 11 years ago

    Haypo: I think he's saying that a statement in the source before the os.fork call is executed in the child, which seems rather unlikely. Your suggestion may be what is happening to confuse him into thinking that, but without the sample program we can't evaluate this any further.

    8ca6f4de-1a5f-4e3f-8fb4-88e4f2f60513 commented 11 years ago

    Try attachment again.

    8ca6f4de-1a5f-4e3f-8fb4-88e4f2f60513 commented 11 years ago

    haypo: I understand that, after a fork, parent and child instructions are run in parallel; which one prints first is a matter of chance.

    However, commands BEFORE THE FORK should not be re-run.

    See test script. I would expect one "Start <pid>", followed by a "parent <pid>" and a "child <pid>". I would not expect to see (as I do) a second "Start <pid>".

    (The original program was long and complex, with numerous forks; this is the smallest program I could write to show the problem).

    bitdancer commented 11 years ago

    I only see Start printed once (on Linux). What OS are you running this on?

    vstinner commented 11 years ago

    I can reproduce the issue using "python test.py|cat". The problem is that sys.stdout is buffered when it is not a tty, so the unflushed buffer is duplicated by the fork and flushed twice: once in the parent, once in the child. Calling sys.stdout.flush() before os.fork() should fix your issue.

    I don't think that Python should flush the buffers of all streams before fork, so I propose to close this issue, unless you see something worth adding to the Python documentation.
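
The double flush described in this comment can be reproduced with a short script. The sketch below is a hypothetical stand-in, not a copy of the attached test.py, and it assumes Python 3.7+ on a Unix-like system where os.fork() is available. The helper name `count_start_lines` and the `FLUSH` switch are made up for illustration. It runs a small forking script with stdout connected to a pipe (much like "python test.py|cat") and counts how many "Start" lines come out.

```python
import os
import subprocess
import sys
import textwrap

# Hypothetical reproduction script; FLUSH is prepended by the helper below.
SCRIPT = textwrap.dedent("""
    import os
    import sys

    print("Start", os.getpid())   # stays in the buffer when stdout is a pipe
    if FLUSH:
        sys.stdout.flush()        # empty the buffer before it gets duplicated
    pid = os.fork()
    if pid == 0:
        print("child", os.getpid())
    else:
        os.waitpid(pid, 0)
        print("parent", os.getpid())
""")


def count_start_lines(flush_first):
    """Run SCRIPT with stdout on a pipe and count the 'Start' lines printed."""
    code = f"FLUSH = {flush_first!r}\n{SCRIPT}"
    # Drop PYTHONUNBUFFERED so the child interpreter really buffers stdout.
    env = {k: v for k, v in os.environ.items() if k != "PYTHONUNBUFFERED"}
    result = subprocess.run(
        [sys.executable, "-c", code],
        stdout=subprocess.PIPE,
        text=True,
        check=True,
        env=env,
    )
    return sum(line.startswith("Start") for line in result.stdout.splitlines())
```

On a typical Linux system, `count_start_lines(False)` would be expected to return 2 (the buffered line is written by both processes) and `count_start_lines(True)` to return 1.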

    8ca6f4de-1a5f-4e3f-8fb4-88e4f2f60513 commented 11 years ago

    I agree that it is reasonable NOT to flush stdout on fork().

    I don't think the outcome is reasonable.

    What about voiding all buffers after the fork for the child?

    gpshead commented 11 years ago

    os.fork() is a low level system call wrapper. Anyone using it needs to deal with flushing whatever buffers their application has before forking among many many other things. There is a reason it lives in the os module.

    It is already a dangerous system call to use from Python (i.e., your child is likely to lock up if your parent had any threads). There really is nothing we can or should do to make it better.