myint / language-check

Python wrapper for LanguageTool grammar checker
https://pypi.python.org/pypi/language-check
GNU Lesser General Public License v3.0

languagetool-server.jar freezes randomly #16

Closed sjlnk closed 8 years ago

sjlnk commented 9 years ago

The LanguageTool server keeps freezing at random. The client sends its HTTP request but never receives a response.

This is the client's state while the server is frozen:

Traceback (most recent call last):
  File "langtool.py", line 25, in <module>
    langtool.check(txt)
  File "/home/seb/.virtualenvs/p3/lib/python3.4/site-packages/language_check/__init__.py", line 240, in check
    root = self._get_root(self._url, self._encode(text, srctext))
  File "/home/seb/.virtualenvs/p3/lib/python3.4/site-packages/language_check/__init__.py", line 299, in _get_root
    with urlopen(url, data, cls._TIMEOUT) as f:
  File "/usr/lib/python3.4/urllib/request.py", line 161, in urlopen
    return opener.open(url, data, timeout)
  File "/usr/lib/python3.4/urllib/request.py", line 463, in open
    response = self._open(req, data)
  File "/usr/lib/python3.4/urllib/request.py", line 481, in _open
    '_open', req)
  File "/usr/lib/python3.4/urllib/request.py", line 441, in _call_chain
    result = func(*args)
  File "/usr/lib/python3.4/urllib/request.py", line 1210, in http_open
    return self.do_open(http.client.HTTPConnection, req)
  File "/usr/lib/python3.4/urllib/request.py", line 1185, in do_open
    r = h.getresponse()
  File "/usr/lib/python3.4/http/client.py", line 1171, in getresponse
    response.begin()
  File "/usr/lib/python3.4/http/client.py", line 351, in begin
    version, status, reason = self._read_status()
  File "/usr/lib/python3.4/http/client.py", line 313, in _read_status
    line = str(self.fp.readline(_MAXLINE + 1), "iso-8859-1")
  File "/usr/lib/python3.4/socket.py", line 374, in readinto
    return self._sock.recv_into(b)
KeyboardInterrupt
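
The traceback ends in recv_into, reached from getresponse: the request has been sent and the client is blocked waiting for the status line of a reply. As a minimal sketch of that situation, and not the library's own code, the snippet below sends an HTTP request to a local socket that lets clients connect but never answers, and shows that a timeout turns the silent wait into a catchable error. The function probe and the 127.0.0.1 listener are hypothetical stand-ins for illustration only.

```python
import http.client
import socket
import urllib.request


def probe(url, timeout_seconds):
    """Return "ok" if the server answers in time, else "no response"."""
    # ProxyHandler({}) ignores proxy settings from the environment, so the
    # request goes straight to the given host.
    opener = urllib.request.build_opener(urllib.request.ProxyHandler({}))
    try:
        with opener.open(url, timeout=timeout_seconds) as response:
            response.read()
        return "ok"
    except (OSError, http.client.HTTPException):
        # Timeouts, refused connections and HTTP errors all land here.
        return "no response"


if __name__ == "__main__":
    # Stand-in for a frozen server: the OS completes the TCP handshake for
    # queued connections, but nothing ever reads the request or replies.
    listener = socket.socket()
    listener.bind(("127.0.0.1", 0))  # port 0 = any free port
    listener.listen(1)
    port = listener.getsockname()[1]
    print(probe("http://127.0.0.1:{}/".format(port), timeout_seconds=1.0))
    listener.close()
```

With a finite timeout the call returns once the deadline passes instead of blocking until it is interrupted by hand, which is what makes a restart-and-retry strategy possible at all.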

This sample script triggers the error:

import language_check

langtool = language_check.LanguageTool("en-US")

txt = """
Of an intermediate balance, under the circumstances, there is no
possibility. The city has its cunning wiles, no less than the
infinitely smaller and more human tempter. There are large forces
which allure with all the soulfulness of expression possible in the
most cultured human. The gleam of a thousand lights is often as
effective as the persuasive light in a wooing and fascinating eye.
Half the undoing of the unsophisticated and natural mind is
accomplished by forces wholly superhuman. A blare of sound, a roar
of life, a vast array of human hives, appeal to the astonished
senses in equivocal terms. Without a counsellor at hand to whisper
cautious interpretations, what falsehoods may not these things breathe
into the unguarded ear! Unrecognised for what they are, their
beauty, like music, too often relaxes, then weakens, then perverts the
simpler human perceptions.
"""

for i in range(100000):
    print("\r{}".format(i), flush=True, end="")
    langtool.check(txt)

It normally freezes at around the 900th to 1000th iteration.

OS: Linux Mint
Kernel: 3.13.0-24-generic
Python: 3.4.3

java -version:

java version "1.8.0_60"
Java(TM) SE Runtime Environment (build 1.8.0_60-b27)
Java HotSpot(TM) 64-Bit Server VM (build 25.60-b23, mixed mode)

sjlnk commented 9 years ago

A quick and dirty fix was to lower LanguageTool._TIMEOUT and add time.sleep(5) between the terminate and start calls, to give the server time to shut down before it is restarted.

This is the modified LanguageTool._get_root:

import time  # needed for the pause below; belongs with the module-level imports

@classmethod
def _get_root(cls, url, data=None, num_tries=2):
    for n in range(num_tries):
        try:
            with urlopen(url, data, cls._TIMEOUT) as f:
                return ElementTree.parse(f).getroot()
        except (IOError, http.client.HTTPException) as e:
            # No usable response (timeout or HTTP failure): restart the server.
            cls._terminate_server()
            time.sleep(5)  # give the old server time to terminate properly
            cls._start_server()
            if n + 1 >= num_tries:
                raise Error('{}: {}'.format(cls._url, e))

Obviously that hack is not a proper fix for the problem. Ideally, the Java server itself should be fixed. Is there any chance that the way the client communicates with the server is flawed and causes these freezes?
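
One alternative to a fixed time.sleep(5) is to wait until the old process has actually exited before starting a new one. The sketch below assumes the server is held as a subprocess.Popen object, which nothing in this thread confirms; stop_process and grace_seconds are hypothetical names, and the child process is a plain Python sleeper standing in for the Java server.

```python
import subprocess
import sys


def stop_process(proc, grace_seconds=5.0):
    """Terminate proc and block until it has really exited."""
    proc.terminate()  # polite request to stop
    try:
        proc.wait(timeout=grace_seconds)
    except subprocess.TimeoutExpired:
        # Still alive after the grace period: force it and reap it.
        proc.kill()
        proc.wait()
    return proc.returncode


if __name__ == "__main__":
    # Stand-in child that would otherwise run for a minute.
    child = subprocess.Popen(
        [sys.executable, "-c", "import time; time.sleep(60)"]
    )
    print("exit code:", stop_process(child))
```

Waiting on the process returns as soon as it is gone, so a fast shutdown costs no extra delay, while a stuck process is killed after the grace period rather than left running next to its replacement.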

myint commented 9 years ago

I can reproduce the problem using your example. On my OS X machine (with Java 1.6 and LanguageTool 2.2), I consistently see it freeze at iteration 950.

duhaime commented 8 years ago

I have also experienced this behavior, and I have found that the problem is compounded when using multiprocessing to detect the language of documents in multiple processes.