jupyterlab-contrib / spellchecker

Spellchecker for JupyterLab notebook markdown cells and file editor.
BSD 3-Clause "New" or "Revised" License

spellchecker tries to use an incorrect dictionary and does not work. #94

Closed rcrehuet closed 2 years ago

rcrehuet commented 2 years ago

Description

There are several similar issues but I could not find the solution in any of them. Most of them were installation problems, but my extensions seem to be properly installed.

When starting jupyterlab with the spellchecker extension, this message is printed:

[I 2021-10-08 10:20:40.938 ServerApp] Looking for hunspell dictionaries for spellchecker in ['/home/ramon/.local/share/jupyter/dictionaries', '/home/ramon/anaconda3/share/jupyter/dictionaries', '/usr/local/share/jupyter/dictionaries', '/usr/share/jupyter/dictionaries', '/usr/share/hunspell', '/usr/share/myspell', '/usr/share/myspell/dicts']
[W 2021-10-08 10:20:41.018 ServerApp] jupyterlab_spellchecker | extension failed loading with message: [Errno 2] No such file or directory: '/home/ramon/anaconda3/lib/python3.7/site-packages/babel/locale-data/ca_ES_valencia.dat'

I live in Catalonia and I may have set ca_ES somewhere, but my locale is in English, and this is the language I want to use in jupyterlab:

(base) ramon@xoriguer:~$ locale
LANG=en_US.UTF-8
LANGUAGE=
LC_CTYPE="en_US.UTF-8"
LC_NUMERIC=en_GB.UTF-8
LC_TIME=en_GB.UTF-8
LC_COLLATE="en_US.UTF-8"
LC_MONETARY=en_GB.UTF-8
LC_MESSAGES="en_US.UTF-8"
LC_PAPER=en_GB.UTF-8
LC_NAME=en_GB.UTF-8
LC_ADDRESS=en_GB.UTF-8
LC_TELEPHONE=en_GB.UTF-8
LC_MEASUREMENT=en_GB.UTF-8
LC_IDENTIFICATION=en_GB.UTF-8
LC_ALL=

JupyterLab shows English as the only available language (but it is greyed out): [image]

However, the spellchecker does not work: it does not highlight misspelled words.

I've checked the advanced settings and English is selected there. I do not see where ca_ES_valencia comes from: [image]

Reproduce

I am using conda

  1. Start jupyterlab with: (base) ramon@xoriguer:~$ jupyter lab

Expected behavior

The English dictionary should be used, by loading the file en_US.dat, which is located at /home/ramon/anaconda3/lib/python3.7/site-packages/babel/locale-data/en_US.dat. This file came with the spellchecker installation; I did not install it manually.

Context

(base) ramon@xoriguer:~$ conda list jupyterlab-spellchecker
# packages in environment at /home/ramon/anaconda3:
#
# Name                    Version                   Build  Channel
jupyterlab-spellchecker   0.7.1              pyhd8ed1ab_0    conda-forge

Config dir: /home/ramon/anaconda3/etc/jupyter jupyterlab enabled

Config dir: /usr/local/etc/jupyter


- Operating System and its version:  Ubuntu 20.04.1 LTS
- Browser and its version: Firefox 81.0

Command Line Output

[D 2021-10-08 10:52:01.878 ServerApp] Searching ['/home/ramon', '/home/ramon/.jupyter', '/home/ramon/anaconda3/etc/jupyter', '/usr/local/etc/jupyter', '/etc/jupyter'] for config files
[D 2021-10-08 10:52:01.878 ServerApp] Looking for jupyter_config in /etc/jupyter
[D 2021-10-08 10:52:01.879 ServerApp] Looking for jupyter_config in /usr/local/etc/jupyter
[D 2021-10-08 10:52:01.879 ServerApp] Looking for jupyter_config in /home/ramon/anaconda3/etc/jupyter
[D 2021-10-08 10:52:01.879 ServerApp] Looking for jupyter_config in /home/ramon/.jupyter
[D 2021-10-08 10:52:01.879 ServerApp] Looking for jupyter_config in /home/ramon
[D 2021-10-08 10:52:01.880 ServerApp] Looking for jupyter_server_config in /etc/jupyter
[D 2021-10-08 10:52:01.880 ServerApp] Looking for jupyter_server_config in /usr/local/etc/jupyter
[D 2021-10-08 10:52:01.880 ServerApp] Looking for jupyter_server_config in /home/ramon/anaconda3/etc/jupyter
[D 2021-10-08 10:52:01.880 ServerApp] Looking for jupyter_server_config in /home/ramon/.jupyter
[D 2021-10-08 10:52:01.881 ServerApp] Loaded config file: /home/ramon/.jupyter/jupyter_server_config.py
[D 2021-10-08 10:52:01.881 ServerApp] Loaded config file: /home/ramon/.jupyter/jupyter_server_config.json
[D 2021-10-08 10:52:01.881 ServerApp] Looking for jupyter_server_config in /home/ramon
[D 2021-10-08 10:52:01.884 ServerApp] Paths used for configuration of jupyter_server_config:
    /etc/jupyter/jupyter_server_config.json
[D 2021-10-08 10:52:01.884 ServerApp] Paths used for configuration of jupyter_server_config:
    /usr/local/etc/jupyter/jupyter_server_config.json
[D 2021-10-08 10:52:01.885 ServerApp] Paths used for configuration of jupyter_server_config:
    /home/ramon/anaconda3/etc/jupyter/jupyter_server_config.d/jupyterlab.json
    /home/ramon/anaconda3/etc/jupyter/jupyter_server_config.d/jupyterlab_spellchecker.json
    /home/ramon/anaconda3/etc/jupyter/jupyter_server_config.d/nbclassic.json
    /home/ramon/anaconda3/etc/jupyter/jupyter_server_config.json
[D 2021-10-08 10:52:01.885 ServerApp] Paths used for configuration of jupyter_server_config:
    /home/ramon/.jupyter/jupyter_server_config.json
[D 2021-10-08 10:52:01.902 LabApp] Config changed: {'NotebookApp': {}, 'ServerApp': {'log_level': 'DEBUG', 'ip': '', 'password': 'sha1:eb37537254f0:e0b3a2c97c859ad619fe9cf771059a2dbbc2706e', 'jpserver_extensions': <LazyConfigValue {'update': {'jupyterlab_spellchecker': True, 'nbclassic': True}}>}, 'ExtensionApp': {'log_level': 'DEBUG'}}
[I 2021-10-08 10:52:01.903 ServerApp] jupyterlab | extension was successfully linked.
[I 2021-10-08 10:52:01.903 ServerApp] jupyterlab_spellchecker | extension was successfully linked.
[D 2021-10-08 10:52:01.914 NotebookApp] Config changed: {'NotebookApp': {'open_browser': False}, 'ServerApp': {'log_level': 'DEBUG', 'ip': '', 'password': 'sha1:eb37537254f0:e0b3a2c97c859ad619fe9cf771059a2dbbc2706e', 'jpserver_extensions': <LazyConfigValue value={'jupyterlab': True, 'jupyterlab_spellchecker': True, 'nbclassic': True}>}, 'ExtensionApp': {'log_level': 'DEBUG'}}
[D 2021-10-08 10:52:02.130 ServerApp] Paths used for configuration of jupyter_notebook_config:
    /home/ramon/.jupyter/jupyter_notebook_config.json
[D 2021-10-08 10:52:02.131 ServerApp] Paths used for configuration of jupyter_notebook_config:
    /etc/jupyter/jupyter_notebook_config.json
[D 2021-10-08 10:52:02.132 ServerApp] Paths used for configuration of jupyter_notebook_config:
    /usr/local/etc/jupyter/jupyter_notebook_config.json
[D 2021-10-08 10:52:02.133 ServerApp] Paths used for configuration of jupyter_notebook_config:
    /home/ramon/anaconda3/etc/jupyter/jupyter_notebook_config.d/jupyterlab.json
    /home/ramon/anaconda3/etc/jupyter/jupyter_notebook_config.d/jupyterlab_spellchecker.json
    /home/ramon/anaconda3/etc/jupyter/jupyter_notebook_config.json
[D 2021-10-08 10:52:02.134 ServerApp] Paths used for configuration of jupyter_notebook_config:
    /home/ramon/.jupyter/jupyter_notebook_config.json
[I 2021-10-08 10:52:02.134 ServerApp] nbclassic | extension was successfully linked.
[D 2021-10-08 10:52:02.136 ServerApp] Config changed: {'ExtensionApp': {'log_level': 'DEBUG'}, 'NotebookApp': {'open_browser': False}, 'ServerApp': {'log_level': 'DEBUG', 'ip': '*', 'password': 'sha1:eb37537254f0:e0b3a2c97c859ad619fe9cf771059a2dbbc2706e', 'jpserver_extensions': <LazyConfigValue value={'jupyterlab': True, 'jupyterlab_spellchecker': True, 'nbclassic': True}>}}
[D 2021-10-08 10:52:02.136 ServerApp] Raising open file limit: soft 1024->4096; hard 1048576->1048576
[W 2021-10-08 10:52:02.151 ServerApp] WARNING: The Jupyter server is listening on all IP addresses and not using encryption. This is not recommended.
[I 2021-10-08 10:52:02.156 ServerApp] nbclassic | extension was successfully loaded.
[I 2021-10-08 10:52:02.157 LabApp] JupyterLab extension loaded from /home/ramon/anaconda3/lib/python3.7/site-packages/jupyterlab
[I 2021-10-08 10:52:02.157 LabApp] JupyterLab application directory is /home/ramon/anaconda3/share/jupyter/lab
[I 2021-10-08 10:52:02.160 ServerApp] jupyterlab | extension was successfully loaded.
[I 2021-10-08 10:52:02.161 ServerApp] Looking for hunspell dictionaries for spellchecker in ['/home/ramon/.local/share/jupyter/dictionaries', '/home/ramon/anaconda3/share/jupyter/dictionaries', '/usr/local/share/jupyter/dictionaries', '/usr/share/jupyter/dictionaries', '/usr/share/hunspell', '/usr/share/myspell', '/usr/share/myspell/dicts']
[D 2021-10-08 10:52:02.237 ServerApp] Traceback (most recent call last):
  File "/home/ramon/anaconda3/lib/python3.7/site-packages/jupyter_server/extension/manager.py", line 356, in load_extension
    extension.load_all_points(self.serverapp)
  File "/home/ramon/anaconda3/lib/python3.7/site-packages/jupyter_server/extension/manager.py", line 236, in load_all_points
    return [self.load_point(point_name, serverapp) for point_name in self.extension_points]
  File "/home/ramon/anaconda3/lib/python3.7/site-packages/jupyter_server/extension/manager.py", line 236, in <listcomp>
    return [self.load_point(point_name, serverapp) for point_name in self.extension_points]
  File "/home/ramon/anaconda3/lib/python3.7/site-packages/jupyter_server/extension/manager.py", line 229, in load_point
    return point.load(serverapp)
  File "/home/ramon/anaconda3/lib/python3.7/site-packages/jupyter_server/extension/manager.py", line 155, in load
    return loader(serverapp)
  File "/home/ramon/anaconda3/lib/python3.7/site-packages/jupyterlab_spellchecker/__init__.py", line 34, in _load_jupyter_server_extension
    setup_handlers(server_app.web_app, url_path, server_app)
  File "/home/ramon/anaconda3/lib/python3.7/site-packages/jupyterlab_spellchecker/handlers.py", line 27, in setup_handlers
    dictionaries = discover_dictionaries(server_app)
  File "/home/ramon/anaconda3/lib/python3.7/site-packages/jupyterlab_spellchecker/dictionaries.py", line 124, in discover_dictionaries
    dictionaries.extend(_scan_for_dictionaries(path, server_app.log))
  File "/home/ramon/anaconda3/lib/python3.7/site-packages/jupyterlab_spellchecker/dictionaries.py", line 89, in _scan_for_dictionaries
    display_name = locale_data.get_display_name()
  File "/home/ramon/anaconda3/lib/python3.7/site-packages/babel/core.py", line 381, in get_display_name
    retval = locale.languages.get(self.language)
  File "/home/ramon/anaconda3/lib/python3.7/site-packages/babel/core.py", line 482, in languages
    return self._data['languages']
  File "/home/ramon/anaconda3/lib/python3.7/site-packages/babel/core.py", line 364, in _data
    self.__data = localedata.LocaleDataDict(localedata.load(str(self)))
  File "/home/ramon/anaconda3/lib/python3.7/site-packages/babel/localedata.py", line 143, in load
    with open(filename, 'rb') as fileobj:
FileNotFoundError: [Errno 2] No such file or directory: '/home/ramon/anaconda3/lib/python3.7/site-packages/babel/locale-data/ca_ES_valencia.dat'

[W 2021-10-08 10:52:02.237 ServerApp] jupyterlab_spellchecker | extension failed loading with message: [Errno 2] No such file or directory: '/home/ramon/anaconda3/lib/python3.7/site-packages/babel/locale-data/ca_ES_valencia.dat'

krassowski commented 2 years ago
  1. The entry in the file menu is not the spellchecker! It is the User Interface language, which is not related to the spellchecker at all. You can get a UI language pack from https://github.com/jupyterlab/jupyterlab-language-packs, more precisely from https://github.com/jupyterlab/language-packs/tree/master/language-packs/jupyterlab-language-pack-ca-ES, and you can contribute to translations on Crowdin: https://crowdin.com/project/jupyterlab. The spellchecker language is set from the status bar, but it likely does not show up for you because of the other issue.
  2. The message extension failed loading with message: [Errno 2] No such file or directory: '[...]/babel/locale-data/ca_ES_valencia.dat' tells us that the spellchecker failed to load because babel could not resolve locale data for ca_ES_valencia. This means that you have a dictionary for ca_ES_valencia, but babel has no locale data under that exact code. Please note that these .dat files are not dictionaries; they are just metadata about the locale.
  3. This means that you have dictionary files named ca_ES_valencia.aff and ca_ES_valencia.dic in one of the following paths:
    • /home/ramon/.local/share/jupyter/dictionaries
    • /home/ramon/anaconda3/share/jupyter/dictionaries
    • /usr/local/share/jupyter/dictionaries
    • /usr/share/jupyter/dictionaries
    • /usr/share/hunspell
    • /usr/share/myspell
    • /usr/share/myspell/dicts

     However, ca_ES_valencia is not a locale code recognised by babel. I checked which codes are available, and those include ca_ES and ca_ES_VALENCIA. This means that the variant part of the code should be uppercase, but your dictionaries seem to have it in lowercase. I don't know whether this is a bug in babel or a problem with your dictionary; please try locating the dictionary files and renaming them to the lettercase accepted by babel.
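To find the files in question, something along these lines could be used. It is only a sketch: `normalize_case` and `plan_renames` are hypothetical helpers (not part of the spellchecker or of babel), the casing rule is an assumption inferred from the codes mentioned above (ca_ES, ca_ES_VALENCIA), and nothing is actually renamed; it only prints what a rename would look like.

```python
import re
from pathlib import Path

# Stems made only of underscore-separated alphanumeric subtags, e.g. ca_ES_valencia.
_CODE = re.compile(r"[A-Za-z0-9]+(?:_[A-Za-z0-9]+)*")


def normalize_case(code: str) -> str:
    """Apply an assumed lettercase convention to a locale code.

    Language lowercase, 4-letter script title-case, everything else
    (regions and variants) uppercase. This is a guess at what babel
    accepts, not a babel API.
    """
    language, *rest = code.split("_")
    parts = [language.lower()]
    for subtag in rest:
        if len(subtag) == 4 and subtag.isalpha():
            parts.append(subtag.title())  # script, e.g. Hans
        else:
            parts.append(subtag.upper())  # region or variant
    return "_".join(parts)


def plan_renames(search_paths):
    """Return (old_path, new_path) pairs for .aff/.dic files whose case differs."""
    plan = []
    for directory in map(Path, search_paths):
        if not directory.is_dir():
            continue  # most of the search paths usually do not exist
        for path in sorted(directory.iterdir()):
            if path.suffix not in {".aff", ".dic"}:
                continue
            if not _CODE.fullmatch(path.stem):
                continue  # skip names this sketch does not understand
            fixed = normalize_case(path.stem)
            if fixed != path.stem:
                plan.append((path, path.with_name(fixed + path.suffix)))
    return plan


if __name__ == "__main__":
    # Dry run over two of the paths listed above.
    for old, new in plan_renames(["/usr/share/hunspell", "/usr/share/myspell"]):
        print(f"{old} -> {new}")
```

Files under /usr/share are typically owned by a distribution package, so copying the dictionary into a user-level directory under the corrected name may be safer than renaming it in place.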
krassowski commented 2 years ago

BCP 47 treats language tags as case-insensitive, so it looks like a bug in babel to me. Reproducer:

>>> import babel
>>> d = babel.Locale.parse('ca_ES_VALENCIA')
>>> d.get_display_name()
'català (Espanya, valencià)'
>>> import babel
>>> d = babel.Locale.parse('ca_ES_valencia')
>>> d.get_display_name()
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "[..]/site-packages/babel/core.py", line 381, in get_display_name
    retval = locale.languages.get(self.language)
  File "[..]/site-packages/babel/core.py", line 482, in languages
    return self._data['languages']
  File "[..]/site-packages/babel/core.py", line 364, in _data
    self.__data = localedata.LocaleDataDict(localedata.load(str(self)))
  File "[..]/site-packages/babel/localedata.py", line 143, in load
    with open(filename, 'rb') as fileobj:
FileNotFoundError: [Errno 2] No such file or directory: '[..]/site-packages/babel/locale-data/ca_ES_valencia.dat'

In short, it's not us, it's a babel bug: babel should either find the correct file or, if anything, raise babel.core.UnknownLocaleError; instead it raises FileNotFoundError, which we did not expect.
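For illustration, a caller could guard against this by falling back to the raw code whenever the display name cannot be resolved. This is a minimal sketch, not the spellchecker's actual code: `display_name_or_code` and `babel_display_name` are hypothetical names, and the only babel calls used are the ones from the reproducer above.

```python
from typing import Callable, Optional, Tuple, Type


def babel_display_name(code: str) -> Optional[str]:
    """Resolve a display name with babel (imported lazily, so babel stays optional)."""
    import babel

    return babel.Locale.parse(code).get_display_name()


def display_name_or_code(
    code: str,
    resolver: Callable[[str], Optional[str]] = babel_display_name,
    expected: Tuple[Type[BaseException], ...] = (FileNotFoundError, ValueError),
) -> str:
    """Return a pretty display name, or the raw code if resolving it fails.

    `expected` lists the exceptions treated as "no display name available".
    With babel installed, babel.core.UnknownLocaleError could be added to it.
    """
    try:
        name = resolver(code)
    except expected:
        return code
    # Some resolvers may return None for unknown languages; fall back as well.
    return name or code


if __name__ == "__main__":
    def missing_data(code: str) -> str:
        # Stand-in for the failure seen in the traceback above.
        raise FileNotFoundError(f"{code}.dat")

    print(display_name_or_code("ca_ES_valencia", missing_data))
```

The resolver is passed in as a parameter so that the fallback logic can be exercised without babel installed; with the default resolver the function requires babel.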

krassowski commented 2 years ago

I opened https://github.com/python-babel/babel/issues/814.

krassowski commented 2 years ago

Hi @rcrehuet, an updated version of jupyterlab-spellchecker (0.7.2) is now available on conda-forge (https://anaconda.org/conda-forge/jupyterlab-spellchecker). It works around the bug in babel by catching the exception, so I think it should solve the problem for you (although you will see the dictionary name without a pretty display name). Please upgrade with:

conda install -c conda-forge jupyterlab-spellchecker=0.7.2

and after restarting JupyterLab please let me know if you see a spellchecker item in the status bar; it should look somewhat like the GIF below:

[image: languages-opt]

rcrehuet commented 2 years ago

I'll first answer your previous message. The files ca_ES_valencia.* are indeed located in /usr/share/hunspell/. They are put there by the hunspell-ca package of the Ubuntu distribution.

Concerning the new version, it works flawlessly! Thanks a lot!

krassowski commented 2 years ago

Thanks for confirming that it works well! I will close this issue now as resolved.

For anyone interested in making the display name pretty, the way forward would be to contribute upstream to fix https://github.com/python-babel/babel/issues/814 in babel; if the babel maintainers come back saying that they cannot accept a fix (which I don't think will happen, but just in case) then we will work around it here.
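As a starting point for such a contribution, the case-insensitive matching itself is simple. The sketch below is an assumption about what a fix might involve, not babel's implementation; `match_identifier` is a hypothetical helper, and the list of available identifiers is passed in explicitly (as far as I know, `babel.localedata.locale_identifiers()` could supply it).

```python
from typing import Iterable, Optional


def match_identifier(code: str, available: Iterable[str]) -> Optional[str]:
    """Return the entry of `available` that equals `code` ignoring lettercase.

    Returns None when nothing matches, which lets the caller raise a
    dedicated error instead of failing later with FileNotFoundError.
    """
    by_folded = {identifier.casefold(): identifier for identifier in available}
    return by_folded.get(code.casefold())


if __name__ == "__main__":
    known = ["ca", "ca_ES", "ca_ES_VALENCIA", "en_US"]
    print(match_identifier("ca_ES_valencia", known))
    print(match_identifier("xx_YY", known))
```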