apertium / apertium-apy

📦 Apertium HTTP Server in Python
https://wiki.apertium.org/wiki/Apertium-apy
GNU General Public License v3.0
`Too many open files` when individual pipelines are restarted many times #88

Open unhammer opened 6 years ago

unhammer commented 6 years ago
INFO:root:A pipe for pair sme-sme_spell has handled 200 requests, scheduling restart
INFO:root:sme-sme_spell not in pipelines of this process
INFO:root:Starting up a new pipeline for sme-sme_spell …
ERROR:tornado.application:Future <tornado.concurrent.Future object at 0x102670ef0> exception was never retrieved: Traceback (most recent call last):
  File "/opt/local/Library/Frameworks/Python.framework/Versions/3.5/lib/python3.5/site-packages/tornado/gen.py", line 1063, in run
    yielded = self.gen.throw(*exc_info)
  File "/Users/wwserver1/divvun/apertium-apy/servlet.py", line 995, in get
    reformat=False)
  File "/opt/local/Library/Frameworks/Python.framework/Versions/3.5/lib/python3.5/site-packages/tornado/gen.py", line 1055, in run
    value = future.result()
  File "/opt/local/Library/Frameworks/Python.framework/Versions/3.5/lib/python3.5/site-packages/tornado/concurrent.py", line 238, in result
    raise_exc_info(self._exc_info)
  File "<string>", line 4, in raise_exc_info
  File "/opt/local/Library/Frameworks/Python.framework/Versions/3.5/lib/python3.5/site-packages/tornado/gen.py", line 1063, in run
    yielded = self.gen.throw(*exc_info)
  File "/Users/wwserver1/divvun/apertium-apy/servlet.py", line 503, in translateAndRespond
    translated = yield pipeline.translate(toTranslate, nosplit, deformat, reformat)
  File "/opt/local/Library/Frameworks/Python.framework/Versions/3.5/lib/python3.5/site-packages/tornado/gen.py", line 1055, in run
    value = future.result()
  File "/opt/local/Library/Frameworks/Python.framework/Versions/3.5/lib/python3.5/site-packages/tornado/concurrent.py", line 238, in result
    raise_exc_info(self._exc_info)
  File "<string>", line 4, in raise_exc_info
  File "/opt/local/Library/Frameworks/Python.framework/Versions/3.5/lib/python3.5/site-packages/tornado/gen.py", line 1063, in run
    yielded = self.gen.throw(*exc_info)
  File "/Users/wwserver1/divvun/apertium-apy/translation.py", line 80, in translate
    for part in all_split]
  File "/opt/local/Library/Frameworks/Python.framework/Versions/3.5/lib/python3.5/site-packages/tornado/gen.py", line 1055, in run
    value = future.result()
  File "/opt/local/Library/Frameworks/Python.framework/Versions/3.5/lib/python3.5/site-packages/tornado/concurrent.py", line 238, in result
    raise_exc_info(self._exc_info)
  File "<string>", line 4, in raise_exc_info
  File "/opt/local/Library/Frameworks/Python.framework/Versions/3.5/lib/python3.5/site-packages/tornado/gen.py", line 828, in callback
    result_list.append(f.result())
  File "/opt/local/Library/Frameworks/Python.framework/Versions/3.5/lib/python3.5/site-packages/tornado/concurrent.py", line 238, in result
    raise_exc_info(self._exc_info)
  File "<string>", line 4, in raise_exc_info
  File "/opt/local/Library/Frameworks/Python.framework/Versions/3.5/lib/python3.5/site-packages/tornado/gen.py", line 1069, in run
    yielded = self.gen.send(value)
  File "/Users/wwserver1/divvun/apertium-apy/translation.py", line 285, in translateNULFlush
    proc_deformat = Popen(deformat, stdin=PIPE, stdout=PIPE)
  File "/opt/local/Library/Frameworks/Python.framework/Versions/3.5/lib/python3.5/subprocess.py", line 676, in __init__
    restore_signals, start_new_session)
  File "/opt/local/Library/Frameworks/Python.framework/Versions/3.5/lib/python3.5/subprocess.py", line 1185, in _execute_child
    errpipe_read, errpipe_write = os.pipe()
OSError: [Errno 24] Too many open files

This is on a server where language data is updated nightly, and pipelines are restarted after every 200 requests or after 3600 seconds of idle time.

Maybe APy isn't correctly closing files when restarting pipelines?
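
To illustrate the suspicion, here is a minimal sketch (not APy's actual code) of how piped subprocesses hold descriptors in the parent until they are explicitly closed. `shutdown_pipeline` and `open_fd_count` are hypothetical helpers, `cat` stands in for the real pipeline commands, and counting via `/dev/fd` assumes Linux or macOS.

```python
import os
from subprocess import Popen, PIPE


def open_fd_count():
    """Count descriptors open in this process.

    Assumes /dev/fd exists (Linux, macOS). The listing may include the
    descriptor used to read the directory, so compare counts rather than
    trusting the absolute number.
    """
    return len(os.listdir("/dev/fd"))


def shutdown_pipeline(procs):
    """Hypothetical helper: close our pipe ends, then reap each child."""
    for proc in procs:
        for stream in (proc.stdin, proc.stdout, proc.stderr):
            if stream is not None:
                stream.close()
        proc.wait()


before = open_fd_count()
# Stand-in for one pipeline: two long-running processes with piped stdin/stdout.
procs = [Popen(["cat"], stdin=PIPE, stdout=PIPE) for _ in range(2)]
during = open_fd_count()
shutdown_pipeline(procs)
after = open_fd_count()
print("before:", before, "during:", during, "after:", after)
```

If a restart drops its references to the old processes without doing something like `shutdown_pipeline`, the pipe ends can stay open, and with a low limit a few hundred restarts would be enough to run out.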

unhammer commented 6 years ago

It seems `ulimit -n` was 256 on that server, so raising the limit might help. But restarting APy also helped, so it's presumably holding on to file descriptors it should have released.
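
For checking this from inside a Python process, a small sketch using the standard `resource` module follows. It only reports the limit and current usage; the `/dev/fd` listing is an assumption that holds on Linux and macOS, and nothing here is part of APy.

```python
import os
import resource

# Soft limit is what `ulimit -n` reports; hard limit is the ceiling it can be raised to.
soft, hard = resource.getrlimit(resource.RLIMIT_NOFILE)

# Assumes /dev/fd exists; the count may include the descriptor used for the listing.
in_use = len(os.listdir("/dev/fd"))

print("soft limit:", soft)
print("hard limit:", hard)
print("descriptors in use:", in_use)
```

Logging `in_use` before and after each pipeline restart would show whether the count keeps climbing, independent of how high the limit is set.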

unhammer commented 6 years ago

This isn't just on the Mac; it also happens on oqaa.

unhammer commented 6 years ago

After 22c2995, a bunch of pipeline restarts would happen in a row. That shouldn't leave open files behind, so there's still a bug, but at least it's now easy to reproduce: check out 22c2995, run with -r1, and submit a bunch of requests.