LuteOrg / lute-v3

LUTE = Learning Using Texts: learn languages through reading. Python/Flask.
MIT License

Permanent 500 error if mecab is not installed and you click the Japanese demo book #65

Open Blauelf opened 9 months ago

Blauelf commented 9 months ago

Description

I installed the app, explored the demo database briefly, and got stuck. The app no longer works: "Home" returns HTTP 500.

To Reproduce

  1. Install the app via pip (but not mecab)
  2. Run the app, open the page in your browser
  3. Click on the Japanese book
  4. Admire the error message/stack trace (see below)
  5. Click "Home" or restart the app as often as you want: it always returns 500, and there is no way to select a non-Japanese book

Screenshots

Traceback (most recent call last):
  File "/home/xxx/my_lute/myenv/lib/python3.10/site-packages/natto/mecab.py", line 401, in __parse_tonodes
    surf = self.__bytes2str(raws).strip(self._STRIP_WHITESPACE)
  File "/home/xxx/my_lute/myenv/lib/python3.10/site-packages/natto/support.py", line 17, in bytes2str
    return b.decode(enc)
UnicodeDecodeError: 'euc_jp' codec can't decode byte 0xa4 in position 4: incomplete multibyte sequence

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/home/xxx/my_lute/myenv/lib/python3.10/site-packages/flask/app.py", line 1455, in wsgi_app
    response = self.full_dispatch_request()
  File "/home/xxx/my_lute/myenv/lib/python3.10/site-packages/flask/app.py", line 869, in full_dispatch_request
    rv = self.handle_user_exception(e)
  File "/home/xxx/my_lute/myenv/lib/python3.10/site-packages/flask/app.py", line 867, in full_dispatch_request
    rv = self.dispatch_request()
  File "/home/xxx/my_lute/myenv/lib/python3.10/site-packages/flask/app.py", line 852, in dispatch_request
    return self.ensure_sync(self.view_functions[rule.endpoint])(**view_args)
  File "/home/xxx/my_lute/myenv/lib/python3.10/site-packages/lute/app_factory.py", line 121, in index
    refresh_stats()
  File "/home/xxx/my_lute/myenv/lib/python3.10/site-packages/lute/book/stats.py", line 85, in refresh_stats
    stats = _get_stats(book)
  File "/home/xxx/my_lute/myenv/lib/python3.10/site-packages/lute/book/stats.py", line 98, in _get_stats
    status_distribution = get_status_distribution(book)
  File "/home/xxx/my_lute/myenv/lib/python3.10/site-packages/lute/book/stats.py", line 25, in get_status_distribution
    paras = [
  File "/home/xxx/my_lute/myenv/lib/python3.10/site-packages/lute/book/stats.py", line 26, in <listcomp>
    get_paragraphs(t)
  File "/home/xxx/my_lute/myenv/lib/python3.10/site-packages/lute/read/service.py", line 94, in get_paragraphs
    tokens = language.get_parsed_tokens(text.text)
  File "/home/xxx/my_lute/myenv/lib/python3.10/site-packages/lute/models/language.py", line 127, in get_parsed_tokens
    return self.parser.get_parsed_tokens(s, self)
  File "/home/xxx/my_lute/myenv/lib/python3.10/site-packages/lute/parse/mecab_parser.py", line 87, in get_parsed_tokens
    for n in nm.parse(para, as_nodes=True):
  File "/home/xxx/my_lute/myenv/lib/python3.10/site-packages/natto/mecab.py", line 431, in __parse_tonodes
    raise MeCabError(self.__bytes2str(self.__ffi.string(err)))
natto.api.MeCabError: MECAB_NBEST request type is not set

Extra software info, if not already included in the Description:

Ubuntu 22.04 in WSL2
Version: 3.0.6 (installed via pip)
Python 3.10.12

Installing mecab solves the issue, but I'd prefer error handling that does not brick the whole app, or at least a way to get back to the book selection.
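One way this kind of error handling might look, as a minimal sketch rather than Lute's actual code: the traceback shows the home page failing inside `refresh_stats()`, so the per-book stats call could be wrapped so that one unparseable book does not take down the whole page. `refresh_stats_safely` and `fake_get_stats` are hypothetical names used only for illustration.

```python
from typing import Callable, Dict, Iterable, Optional


def refresh_stats_safely(
    books: Iterable[str],
    get_stats: Callable[[str], dict],
) -> Dict[str, Optional[dict]]:
    """Compute stats per book; a failing book yields None instead of raising."""
    results: Dict[str, Optional[dict]] = {}
    for book in books:
        try:
            results[book] = get_stats(book)
        except Exception as exc:  # broad catch is deliberate in this sketch
            # Report the failure and carry on with the remaining books.
            print(f"Skipping stats for {book!r}: {exc}")
            results[book] = None
    return results


def fake_get_stats(book: str) -> dict:
    # Hypothetical stand-in for the real stats computation.
    if book == "Japanese demo":
        raise RuntimeError("MECAB_NBEST request type is not set")
    return {"book": book}


stats = refresh_stats_safely(["Spanish demo", "Japanese demo"], fake_get_stats)
```

With something like this, the book list could still render, showing missing stats for the Japanese book while the other books stay selectable.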

jzohrab commented 9 months ago

Well, that's disappointing. I thought this had been handled correctly by checking for invalid parsers, and I added a check for this in GitHub CI as well. I'll have to try this out again. Thanks for the issue.

jzohrab commented 9 months ago

No idea what's happening here. Some users still see the JP parser when they don't have MeCab installed. Clearly the current method of "deactivating" the parser is not working. I guess parser availability will need to be recorded in a db table on system startup, with the appropriate tests and options hidden depending on the state of the table.
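A rough sketch of that startup check, under stated assumptions: each parser gets a probe that actually parses a short sample, and any exception counts as "unavailable". The plain dict stands in for the db table, and `probe_parsers` and `mecab_probe` are hypothetical names, not part of Lute. The only natto-py calls used are `MeCab()` as a context manager, `parse(..., as_nodes=True)`, and `is_eos()` on nodes.

```python
from typing import Callable, Dict


def probe_parsers(probes: Dict[str, Callable[[], bool]]) -> Dict[str, bool]:
    """Run each probe once; a probe that raises marks its parser unavailable."""
    availability: Dict[str, bool] = {}
    for name, probe in probes.items():
        try:
            availability[name] = bool(probe())
        except Exception:
            availability[name] = False
    return availability


def mecab_probe() -> bool:
    # Assumption: a usable setup can parse a short sample without raising.
    # Importing or constructing alone may not reveal a broken install.
    from natto import MeCab  # ImportError if natto-py is missing

    with MeCab() as nm:
        nodes = nm.parse("日本語", as_nodes=True)
        return any(not n.is_eos() for n in nodes)


def always_ok() -> bool:
    # Stand-in for a parser with no external dependency.
    return True


availability = probe_parsers({"japanese": mecab_probe, "spaces": always_ok})
```

The UI and tests could then consult `availability` (or the table it stands in for) to decide whether to show the Japanese demo book and parser options.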