dvdotsenko / git_http_backend.py

Port of git's "Smart HTTP" (git-http-backend) server component to Python WSGI interface.
git_http_backend_python_wsgi
GNU Lesser General Public License v2.1
72 stars 15 forks source link

"Too many open files" because of a read pipe leak in SubprocessIOChunker #19

Open martindorey opened 5 years ago

martindorey commented 5 years ago

Our RhodeCode instance dies every so often when it runs out of file descriptors. It's leaking the read end of pipes. I found that the issue has been reported and already fixed in Kallithea, a fork of RhodeCode, by Mads Kiilerich, at https://bitbucket.org/conservancy/kallithea/issues/32. Here's a test case that demonstrates the problem, with one extra leaked pipe fd per iteration.

martind@swiftboat:~/download/git_http_backend.py$ cat fd-leak.py
#!/usr/bin/python
import subprocessio
import os
import sys

for ii in range(10):
    test = subprocessio.SubprocessIOChunker("true", sys.stdin)
    test.close()
    dir = "/proc/self/fd"
    print(map(lambda file: (oct(os.lstat(file).st_mode & 0o777) + ":" + os.readlink(file)) if os.path.islink(file) else file, map(lambda entry: dir + "/" + entry, os.listdir(dir))))
    print(len(os.listdir("/proc/self/task")))
martind@swiftboat:~/download/git_http_backend.py$ ./fd-leak.py < /dev/null
['0500:/dev/null', '0700:/dev/pts/1', '0700:/dev/pts/1', '0500:pipe:[96850305]', '/proc/self/fd/4']
1
['0500:/dev/null', '0700:/dev/pts/1', '0700:/dev/pts/1', '0500:pipe:[96850305]', '0500:pipe:[96852161]', '/proc/self/fd/5']
1
['0500:/dev/null', '0700:/dev/pts/1', '0700:/dev/pts/1', '0500:pipe:[96850305]', '0500:pipe:[96852161]', '0500:pipe:[96849113]', '/proc/self/fd/6']
1
['0500:/dev/null', '0700:/dev/pts/1', '0700:/dev/pts/1', '0500:pipe:[96850305]', '0500:pipe:[96852161]', '0500:pipe:[96849113]', '0500:pipe:[96849120]', '/proc/self/fd/7']
1
['0500:/dev/null', '0700:/dev/pts/1', '0700:/dev/pts/1', '0500:pipe:[96850305]', '0500:pipe:[96852161]', '0500:pipe:[96849113]', '0500:pipe:[96849120]', '0500:pipe:[96849127]', '/proc/self/fd/8']
1
['0500:/dev/null', '0700:/dev/pts/1', '0700:/dev/pts/1', '0500:pipe:[96850305]', '0500:pipe:[96852161]', '0500:pipe:[96849113]', '0500:pipe:[96849120]', '0500:pipe:[96849127]', '0500:pipe:[96850312]', '/proc/self/fd/9']
1
['0500:/dev/null', '0700:/dev/pts/1', '0700:/dev/pts/1', '0500:pipe:[96850305]', '0500:pipe:[96852161]', '0500:pipe:[96849113]', '0500:pipe:[96849120]', '0500:pipe:[96849127]', '0500:pipe:[96850312]', '0500:pipe:[96849134]', '/proc/self/fd/10']
1
['0500:/dev/null', '0700:/dev/pts/1', '0700:/dev/pts/1', '0500:pipe:[96850305]', '0500:pipe:[96852161]', '0500:pipe:[96849113]', '0500:pipe:[96849120]', '0500:pipe:[96849127]', '0500:pipe:[96850312]', '0500:pipe:[96849134]', '0500:pipe:[96849141]', '/proc/self/fd/11']
1
['0500:/dev/null', '0700:/dev/pts/1', '0700:/dev/pts/1', '0500:pipe:[96850305]', '0500:pipe:[96852161]', '0500:pipe:[96849113]', '0500:pipe:[96849120]', '0500:pipe:[96849127]', '0500:pipe:[96850312]', '0500:pipe:[96849134]', '0500:pipe:[96849141]', '0500:pipe:[96849148]', '/proc/self/fd/12']
1
['0500:/dev/null', '0700:/dev/pts/1', '0700:/dev/pts/1', '0500:pipe:[96850305]', '0500:pipe:[96852161]', '0500:pipe:[96849113]', '0500:pipe:[96849120]', '0500:pipe:[96849127]', '0500:pipe:[96850312]', '0500:pipe:[96849134]', '0500:pipe:[96849141]', '0500:pipe:[96849148]', '0500:pipe:[96849155]', '/proc/self/fd/13']
1
martind@swiftboat:~/download/git_http_backend.py$

If I apply Mads's patch:

martind@swiftboat:~/download/git_http_backend.py$ curl <https://bitbucket.org/conservancy/kallithea/commits/2f18ebfead0cd4bb2589d2fc734487eb14a499f6/raw> | todos | patch --binary
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
100   881  100   881    0     0   1241      0 --:--:-- --:--:-- --:--:--  1242
patching file subprocessio.py
Hunk #1 succeeded at 362 (offset -24 lines).
Hunk #2 succeeded at 389 (offset -25 lines).
martind@swiftboat:~/download/git_http_backend.py$

... then repeat the test, the number of fds is stable, showing that the leak is plugged:

martind@swiftboat:~/download/git_http_backend.py$ ./fd-leak.py < /dev/null
['0500:/dev/null', '0700:/dev/pts/1', '0700:/dev/pts/1', '/proc/self/fd/3']
1
['0500:/dev/null', '0700:/dev/pts/1', '0700:/dev/pts/1', '/proc/self/fd/3']
1
['0500:/dev/null', '0700:/dev/pts/1', '0700:/dev/pts/1', '/proc/self/fd/3']
1
['0500:/dev/null', '0700:/dev/pts/1', '0700:/dev/pts/1', '/proc/self/fd/3']
1
['0500:/dev/null', '0700:/dev/pts/1', '0700:/dev/pts/1', '/proc/self/fd/3']
1
['0500:/dev/null', '0700:/dev/pts/1', '0700:/dev/pts/1', '/proc/self/fd/3']
1
['0500:/dev/null', '0700:/dev/pts/1', '0700:/dev/pts/1', '/proc/self/fd/3']
1
['0500:/dev/null', '0700:/dev/pts/1', '0700:/dev/pts/1', '/proc/self/fd/3']
1
['0500:/dev/null', '0700:/dev/pts/1', '0700:/dev/pts/1', '/proc/self/fd/3']
1
['0500:/dev/null', '0700:/dev/pts/1', '0700:/dev/pts/1', '/proc/self/fd/3']
1
martind@swiftboat:~/download/git_http_backend.py$

Perhaps I should raise it in its own issue, but my coworker, Uma Parvathappa, discovered that applying the above fix on its own seems to make it easier to expose another frailty. If we create a file larger than the buffer_size:

martind@swiftboat:~/download/git_http_backend.py$ dd if=/dev/zero of=test.txt count=65 bs=1024
65+0 records in
65+0 records out
66560 bytes (67 kB) copied, 0.000401554 s, 166 MB/s
martind@swiftboat:~/download/git_http_backend.py$

... then repeat the previous test of the code with the Kallithea patch, passing it this file as input, then we see an intermittent leak of the write end of pipes along with threads (counted by the second line output for each iteration):

martind@swiftboat:~/download/git_http_backend.py$ ./fd-leak.py < test.txt
['0500:/home/martind/download/git_http_backend.py/test.txt', '0700:/dev/pts/1', '0700:/dev/pts/1', '/proc/self/fd/3', '0300:pipe:[96886148]']
2
['0500:/home/martind/download/git_http_backend.py/test.txt', '0700:/dev/pts/1', '0700:/dev/pts/1', '/proc/self/fd/3', '0300:pipe:[96886148]', '0300:pipe:[96886163]']
3
['0500:/home/martind/download/git_http_backend.py/test.txt', '0700:/dev/pts/1', '0700:/dev/pts/1', '/proc/self/fd/3', '0300:pipe:[96886148]', '0300:pipe:[96886163]', '0300:pipe:[96889151]']
4
['0500:/home/martind/download/git_http_backend.py/test.txt', '0700:/dev/pts/1', '0700:/dev/pts/1', '/proc/self/fd/3', '0300:pipe:[96886148]', '0300:pipe:[96886163]', '0300:pipe:[96889151]', '0300:pipe:[96887169]']
5
['0500:/home/martind/download/git_http_backend.py/test.txt', '0700:/dev/pts/1', '0700:/dev/pts/1', '/proc/self/fd/3', '0300:pipe:[96886148]', '0300:pipe:[96886163]', '0300:pipe:[96889151]', '0300:pipe:[96887169]', '0300:pipe:[96887177]']
6
['0500:/home/martind/download/git_http_backend.py/test.txt', '0700:/dev/pts/1', '0700:/dev/pts/1', '0500:/usr/lib/python2.7/threading.py', '0300:pipe:[96886148]', '0300:pipe:[96886163]', '0300:pipe:[96889151]', '0300:pipe:[96887169]', '0300:pipe:[96887177]', '0300:pipe:[96887185]']
7
['0500:/home/martind/download/git_http_backend.py/test.txt', '0700:/dev/pts/1', '0700:/dev/pts/1', '/proc/self/fd/3', '0300:pipe:[96886148]', '0300:pipe:[96886163]', '0300:pipe:[96889151]', '0300:pipe:[96887169]', '0300:pipe:[96887177]', '0300:pipe:[96887185]', '0500:/usr/lib/python2.7/threading.py', '0500:/usr/lib/python2.7/threading.py', '0500:/usr/lib/python2.7/threading.py', '0500:/usr/lib/python2.7/threading.py', '0500:/usr/lib/python2.7/threading.py']
7
['0500:/home/martind/download/git_http_backend.py/test.txt', '0700:/dev/pts/1', '0700:/dev/pts/1', '/proc/self/fd/3', '0300:pipe:[96886148]', '0300:pipe:[96886163]', '0300:pipe:[96889151]', '0300:pipe:[96887169]', '0300:pipe:[96887177]', '0300:pipe:[96887185]', '0500:/usr/lib/python2.7/threading.py', '0500:/usr/lib/python2.7/threading.py', '0500:/usr/lib/python2.7/threading.py', '0500:/usr/lib/python2.7/threading.py', '0500:/usr/lib/python2.7/threading.py']
7
['0500:/home/martind/download/git_http_backend.py/test.txt', '0700:/dev/pts/1', '0700:/dev/pts/1', '/proc/self/fd/3', '0300:pipe:[96886148]', '0300:pipe:[96886163]', '0300:pipe:[96889151]', '0300:pipe:[96887169]', '0300:pipe:[96887177]', '0300:pipe:[96887185]', '0500:/usr/lib/python2.7/threading.py', '0500:/usr/lib/python2.7/threading.py', '0500:/usr/lib/python2.7/threading.py', '0500:/usr/lib/python2.7/threading.py', '0500:/usr/lib/python2.7/threading.py']
7
['0500:/home/martind/download/git_http_backend.py/test.txt', '0700:/dev/pts/1', '0700:/dev/pts/1', '/proc/self/fd/3', '0300:pipe:[96886148]', '0300:pipe:[96886163]', '0300:pipe:[96889151]', '0300:pipe:[96887169]', '0300:pipe:[96887177]', '0300:pipe:[96887185]', '0500:/usr/lib/python2.7/threading.py', '0500:/usr/lib/python2.7/threading.py', '0500:/usr/lib/python2.7/threading.py', '0500:/usr/lib/python2.7/threading.py', '0500:/usr/lib/python2.7/threading.py']
7
martind@swiftboat:~/download/git_http_backend.py$

If, however, we make StreamFeeder do its os.close(t) in a try... finally block, then we no longer leak, neither pipes nor threads:

martind@swiftboat:~/download/git_http_backend.py$ cat subprocessio.py.rej
--- subprocessio.py
+++ subprocessio.py
@@ -61,15 +61,17 @@ class StreamFeeder(threading.Thread):

     def run(self):
         t = self.writeiface
-        if self.bytes:
-            os.write(t, self.bytes)
-        else:
-            s = self.source
-            b = s.read(4096)
-            while b:
-                os.write(t, b)
+        try:
+            if self.bytes:
+                os.write(t, self.bytes)
+            else:
+                s = self.source
                 b = s.read(4096)
-        os.close(t)
+                while b:
+                    os.write(t, b)
+                    b = s.read(4096)
+        finally:
+            os.close(t)

     @property
     def output(self):
martind@swiftboat:~/download/git_http_backend.py$ todos subprocessio.py.rej
martind@swiftboat:~/download/git_http_backend.py$ patch --binary -p0 < subprocessio.py.rej
patching file subprocessio.py
martind@swiftboat:~/download/git_http_backend.py$ ./fd-leak.py < test.txt
['0500:/home/martind/download/git_http_backend.py/test.txt', '0700:/dev/pts/1', '0700:/dev/pts/1', '/proc/self/fd/3']
2
['0500:/home/martind/download/git_http_backend.py/test.txt', '0700:/dev/pts/1', '0700:/dev/pts/1', '/proc/self/fd/3', '0500:/usr/lib/python2.7/threading.py']
2
Exception in thread Thread-1:
Traceback (most recent call last):
  File "/usr/lib/python2.7/threading.py", line 810, in __bootstrap_inner
    self.run()
  File "subprocessio.py", line 71, in run
    os.write(t, b)
OSError: [Errno 32] Broken pipe

['0500:/home/martind/download/git_http_backend.py/test.txt', '0700:/dev/pts/1', '0700:/dev/pts/1', '/proc/self/fd/3']
1
['0500:/home/martind/download/git_http_backend.py/test.txt', '0700:/dev/pts/1', '0700:/dev/pts/1', '/proc/self/fd/3']
1
['0500:/home/martind/download/git_http_backend.py/test.txt', '0700:/dev/pts/1', '0700:/dev/pts/1', '/proc/self/fd/3']
1
['0500:/home/martind/download/git_http_backend.py/test.txt', '0700:/dev/pts/1', '0700:/dev/pts/1', '/proc/self/fd/3']
1
['0500:/home/martind/download/git_http_backend.py/test.txt', '0700:/dev/pts/1', '0700:/dev/pts/1', '/proc/self/fd/3']
1
['0500:/home/martind/download/git_http_backend.py/test.txt', '0700:/dev/pts/1', '0700:/dev/pts/1', '/proc/self/fd/3']
1
['0500:/home/martind/download/git_http_backend.py/test.txt', '0700:/dev/pts/1', '0700:/dev/pts/1', '/proc/self/fd/3']
1
['0500:/home/martind/download/git_http_backend.py/test.txt', '0700:/dev/pts/1', '0700:/dev/pts/1', '/proc/self/fd/3']
1
martind@swiftboat:~/download/git_http_backend.py$