Large pipe reads might fail on Mac OS (platform bug)

Originally reported by: John Ricklefs (Bitbucket: jdricklefs, GitHub: jdricklefs)

In my application, I'm doing essentially:

Create a duplex pipe
Using start_process, fork, then exec a new script.
In the exec'd child, manually construct a GIPCReader and wait for data from the parent.
In the parent, pickle over some dictionaries of text, then close the pipe.

On Linux, this is working just fine. On OSX, I'm getting frequent but somewhat inconsistent crashes where the child process gets an EINVAL on read(). I believe what is happening is the same issue in this Python bug (rejected on account of being an OS bug): http://bugs.python.org/issue15896

If I understand that bug correctly, reads on nonblocking pipes on OSX occasionally fail if the requested amount is greater than the buffer size.

I've attached a simple file that demonstrates this, as well as the simplest form of the fix (again, assuming that I understand the issue correctly). On my MacBook Air w/ Python 2.7.5, I usually get the following stacktrace at around the 338000 mark (give or take a few thousand).

#!python

335000: Starting child, fds 3 4
Got 335000
Traceback (most recent call last):
  File "einvalcrash.py", line 46, in <module>
    x = reader.get()
  File "/Users/jdricklefs/envs/qexec/lib/python2.7/site-packages/gipc/gipc.py", line 680, in get
    bindata = self._recv_in_buffer(msize).getvalue()
  File "/Users/jdricklefs/envs/qexec/lib/python2.7/site-packages/gipc/gipc.py", line 629, in _recv_in_buffer
    chunk = _read_nonblocking(self._fd, remaining)
  File "/Users/jdricklefs/envs/qexec/lib/python2.7/site-packages/gevent/os.py", line 50, in nb_read
    return _read(fd, n)
OSError: [Errno 22] Invalid argument
Traceback (most recent call last):
  File "einvalcrash.py", line 55, in <module>
    writer.put("x" * i)
  File "/Users/jdricklefs/envs/qexec/lib/python2.7/site-packages/gipc/gipc.py", line 748, in put
    self._write(struct.pack("!i", len(bindata)) + bindata)
  File "/Users/jdricklefs/envs/qexec/lib/python2.7/site-packages/gipc/gipc.py", line 716, in _write
    bytes_written = _write_nonblocking(self._fd, bindata)
  File "/Users/jdricklefs/envs/qexec/lib/python2.7/site-packages/gevent/os.py", line 69, in nb_write
    return _write(fd, buf)
OSError: [Errno 32] Broken pipe

If I patch out and use the DarwinSafeGIPCReader (which simply caps the os.read() call at up to 64k at a time), I am no longer able to reproduce the crash.

Let me know if you'd like me to prepare a pull request or if I can be of other assistance. Thanks!

Bitbucket: https://bitbucket.org/jgehrcke/gipc/issue/13

Original comment by John Ricklefs (Bitbucket: jdricklefs, GitHub: jdricklefs):

That's great news! Thanks so much for investigating and fixing.

Original comment by Jan-Philip Gehrcke (Bitbucket: jgehrcke, GitHub: jgehrcke):

Attempt to read at most 65536 bytes in read(2) syscall (see issue #13).

Original comment by Jan-Philip Gehrcke (Bitbucket: jgehrcke, GitHub: jgehrcke):

Hey John,

I agree that this is quite a nasty pitfall. When I said that people should learn the hard way what a Mac can do to them I was not entirely serious. On the other hand, I was actually considering that gipc is probably less frequently run on a private Mac than on a Linux server. However, all that should not be of highest priority when it comes to a decision with this issue. I was mainly guided by the principle that we do not want to work around OS bugs if the specifics of the bug are unclear to us (there is no official statement, and no official these-versions-are-affected-list).

Today I tried to quantify whether the alignment of the maximum read size with the usual Linux pipe capacity might have an impact on the overall messaging performance. Regarding pipe capacity, let me quote from http://man7.org/linux/man-pages/man7/pipe.7.html:

"Different implementations have different limits for the pipe capacity. Applications should not rely on a particular capacity [...]. In Linux versions before 2.6.11, the capacity of a pipe was the same as the system page size (e.g., 4096 bytes on i386). Since Linux 2.6.11, the pipe capacity is 65536 bytes.

I used the benchmarking tool examples/gipc_benchmark.py (in the repo), and tested the pipe throughput performance for different message sizes, for two cases:

before: current code

after: including

        if remaining > 65536:
            chunk = _read_nonblocking(self._fd, 65536)
        else:
            chunk = _read_nonblocking(self._fd, remaining)

This is the outcome:

read_max_size_comparison_table

Surprisingly, on both tested Linux versions (2.6.32 and 3.2.0), there is a significant performance gain (about 30 %) for large messages (512 kB). The code change adds one more condition in the read loop, manifesting itself in the small message throughput which drops by about 3 % -- an insignificant change in my opinion.

Looks like the tested Linux versions quite inefficiently handle the case when we request to read more from the pipe than it may contain (according to its capacity). I have currently no clue why that is. I would like to understand it, there must be good reasons, otherwise this is a flaw in the kernel (pretty sure that this inefficiency is not contained in gevent or Python, since both simply pass n through, until it ends up with the read() system call).

With this observation at hand, it is not difficult to decide that we should include this code change in all cases -- for performance reasons. At the same time, we "unfortunately" (:D) work around a mean Mac OS X bug.

With the code change in question, gipc does not "rely on a particular capacity" (cf. http://man7.org/linux/man-pages/man7/pipe.7.html), but implements a performance improvement on many operating systems I expect it to run on.

Original comment by John Ricklefs (Bitbucket: jdricklefs, GitHub: jdricklefs):

Thank you for gipc! Saves us a lot of hassle in our app :)

Apologies for the excess detail regarding the fork/exec part of my app - the bug is definitely just in OSX's read() function.

Anyone else running into that problem should find this issue here via the search engine of his/her choice, actually feel the pain that Mac OS produces in this case, and work around this issue by themselves, as you did.

I agree that that is a reasonable decision, though it's still a pretty thorny trap for app developers using gipc. The reason I'm using gipc is to abstract out a bunch of the pitfalls of combining gevent and forking, and this OSX bug means that I still have to go under the hood to make everything work correctly.

The fix that an app developer would need to make is almost surely going to be monkey-patching gipc to limit the read size over the pipe. (The alternative would be to not use pipes, which means not using gipc). Now the app developer now has to copy/paste the whole function, tweak a couple lines, then patch it back in. When/if gipc is then updated in some other way, this patching will likely have to be re-done.

Perhaps a pragmatic solution would be to add a module attribute MAX_READ_SIZE, defaulting to 0 (no limit), which can be trivially replaced by app developers to some other value if they're also stuck using gipc on OSX?

Guess this will teach me to use a closed-source operating system.

Thanks again!

Original comment by Jan-Philip Gehrcke (Bitbucket: jgehrcke, GitHub: jgehrcke):

First of all, thanks for using gipc, and thanks for the detailed feedback (nice write-up!). Sorry it took me longish to get back to you.

According to http://bugs.python.org/issue15896 you are right, this is an issue with the Darwin platform, as it does not adhere to POSIX standard in that case. The issue is unrelated to fork, it seems to be a bug in the read() system call on Mac OS X. Quoting from the Python bug tracker:

"So it really seems that under certain conditions, a non-blocking read from an empty pipe could fail with EINVAL instead of EAGAIN. But this is definitely a bug, in such cases read() should return EAGAIN."

In the named tracker the conclusion was

"It's not Python's job to workaround stupid platform bugs, or document them: that would lead to unmanagable code or unmaintanable documentation."

Since gipc is a library and not an application, I tend to follow the same approach. In general, we do not want to add hacks for avoiding platform bugs to a library. The first question for a programmatic workaround would be which versions of Mac OS exactly are affected by this -- even finding a reliable source for this information is too much of an effort I believe.

In conclusion, I would say it is great that you have identified this issue in the context of gipc, so that people are aware of this problem. Furthermore, you have provided the workaround (although you could check for buffersize > 127 kB according to http://bugs.python.org/msg170731 (the pipe capacity should however never be larger than 64 kB, see http://unix.stackexchange.com/q/11946)). Anyone else running into that problem should find this issue here via the search engine of his/her choice, actually feel the pain that Mac OS produces in this case, and work around this issue by themselves, as you did.

Would that be a reasonable decision in your opinion?

jgehrcke / gipc

Large pipe reads might fail on Mac OS (platform bug) #13