itscz-org closed this issue 1 year ago
The sole purpose of this error message is to make it clear that you need these three dependencies and not just tltk. No digging into the source code should be required, just reading the (hopefully) fine manual.
Did you install them as explained in INSTALL.md?
Sure I did.
pip install pykakasi
Requirement already satisfied: pykakasi in /usr/local/lib/python3.9/dist-packages (2.2.1)
Requirement already satisfied: deprecated in /usr/local/lib/python3.9/dist-packages (from pykakasi) (1.2.13)
Requirement already satisfied: jaconv in /usr/local/lib/python3.9/dist-packages (from pykakasi) (0.3)
Requirement already satisfied: wrapt<2,>=1.10 in /usr/local/lib/python3.9/dist-packages (from deprecated->pykakasi) (1.14.1)
pip install pinyin_jyutping_sentence
Requirement already satisfied: pinyin_jyutping_sentence in /usr/local/lib/python3.9/dist-packages (1.3)
Requirement already satisfied: jieba in /usr/local/lib/python3.9/dist-packages (from pinyin_jyutping_sentence) (0.42.1)
I dug into the code and found that it's the tltk case that triggers the exception.
OK, then neither tltk nor the other two are likely to be the issue.
Why are you trying to run geo-transcript-srv.py manually anyway? It should get started automatically after installing the Debian package.
See systemctl status osml10n
The above command does work fine here after running systemctl stop osml10n:
/usr/bin/geo-transcript-srv.py -s -g /usr/share/osml10n/boundaries
Loading osml10n transcription server: ready.
I started investigating because it failed to (auto) start:
● osml10n.service - OSM l10n transcription server
Loaded: loaded (/lib/systemd/system/osml10n.service; enabled; vendor preset: enabled)
Active: failed (Result: exit-code) since Tue 2022-11-15 13:52:16 CET; 51min ago
Process: 228367 ExecStart=/usr/bin/geo-transcript-srv.py -s -g /usr/share/osml10n/boundaries (code=exited, status=1/FAILURE)
Main PID: 228367 (code=exited, status=1/FAILURE)
CPU: 1.126s
Nov 15 13:52:16 xxx systemd[1]: osml10n.service: Scheduled restart job, restart counter is at 5.
Nov 15 13:52:16 xxx systemd[1]: Stopped OSM l10n transcription server.
Nov 15 13:52:16 xxx systemd[1]: osml10n.service: Consumed 1.126s CPU time.
Nov 15 13:52:16 xxx systemd[1]: osml10n.service: Start request repeated too quickly.
Nov 15 13:52:16 xxx systemd[1]: osml10n.service: Failed with result 'exit-code'.
Nov 15 13:52:16 xxx systemd[1]: Failed to start OSM l10n transcription server.
OK, so your Python modules seem to differ :(
Does this look different for you?
osml10n/ (master) > python
Python 3.9.2 (default, Feb 28 2021, 17:03:44)
[GCC 10.2.1 20210110] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import tltk
>>> import pykakasi
>>> import pinyin_jyutping_sentence
Building prefix dict from /usr/local/lib/python3.9/dist-packages/pinyin_jyutping_sentence/dict.txt.big ...
Loading model from cache /tmp/jieba.udae52a0cdc3624438ee23d21e0736dec.cache
Dumping model to file cache /tmp/jieba.udae52a0cdc3624438ee23d21e0736dec.cache
Dump cache file failed.
Traceback (most recent call last):
File "/usr/local/lib/python3.9/dist-packages/jieba/__init__.py", line 154, in initialize
_replace_file(fpath, cache_file)
PermissionError: [Errno 1] Operation not permitted: '/tmp/tmpbm3_5frg' -> '/tmp/jieba.udae52a0cdc3624438ee23d21e0736dec.cache'
Loading model cost 2.333 seconds.
Prefix dict has been built successfully.
>>>
Python 3.9.2 (default, Feb 28 2021, 17:03:44)
[GCC 10.2.1 20210110] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import tltk
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "/usr/local/lib/python3.9/dist-packages/tltk/__init__.py", line 1, in <module>
from tltk import nlp
File "/usr/local/lib/python3.9/dist-packages/tltk/nlp.py", line 31, in <module>
from sklearn.ensemble import RandomForestClassifier
ModuleNotFoundError: No module named 'sklearn'
>>> import pykakasi
>>> import pinyin_jyutping_sentence
Hm, this is what it looks like here:
~/ # pip install tltk
Requirement already satisfied: tltk in /usr/local/lib/python3.9/dist-packages (1.6)
Requirement already satisfied: sklearn-crfsuite in /usr/local/lib/python3.9/dist-packages (from tltk) (0.3.6)
Requirement already satisfied: sklearn in /usr/local/lib/python3.9/dist-packages (from tltk) (0.0.post1)
Requirement already satisfied: nltk in /usr/local/lib/python3.9/dist-packages (from tltk) (3.7)
Requirement already satisfied: gensim in /usr/local/lib/python3.9/dist-packages (from tltk) (4.1.2)
Requirement already satisfied: scipy>=0.18.1 in /usr/lib/python3/dist-packages (from gensim->tltk) (1.6.0)
Requirement already satisfied: smart-open>=1.8.1 in /usr/local/lib/python3.9/dist-packages (from gensim->tltk) (5.2.1)
Requirement already satisfied: numpy>=1.17.0 in /usr/lib/python3/dist-packages (from gensim->tltk) (1.19.5)
Requirement already satisfied: tqdm in /usr/local/lib/python3.9/dist-packages (from nltk->tltk) (4.62.3)
Requirement already satisfied: regex>=2021.8.3 in /usr/local/lib/python3.9/dist-packages (from nltk->tltk) (2022.1.18)
Requirement already satisfied: click in /usr/lib/python3/dist-packages (from nltk->tltk) (7.1.2)
Requirement already satisfied: joblib in /usr/local/lib/python3.9/dist-packages (from nltk->tltk) (1.1.0)
Requirement already satisfied: tabulate in /usr/local/lib/python3.9/dist-packages (from sklearn-crfsuite->tltk) (0.8.9)
Requirement already satisfied: python-crfsuite>=0.8.3 in /usr/local/lib/python3.9/dist-packages (from sklearn-crfsuite->tltk) (0.9.7)
Requirement already satisfied: six in /usr/lib/python3/dist-packages (from sklearn-crfsuite->tltk) (1.16.0)
This one fixed it:
python3 -m pip install scikit-learn
Server is now running. Thanks anyway.
Hm, looks like a broken dependency in tltk then. Maybe we should report it.
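The server's error message lists all three modules, so the real cause (here a missing sklearn pulled in by tltk) only showed up after importing each module by hand. A minimal sketch of that check, assuming only the three module names from the error message; the helper name check_modules is made up for illustration and is not part of osml10n:

```python
import importlib

# Module names taken from the error message quoted in this thread.
REQUIRED = ("pykakasi", "tltk", "pinyin_jyutping_sentence")


def check_modules(names=REQUIRED):
    """Import each module and map its name to None (OK) or the error text.

    check_modules is a hypothetical helper, not something osml10n ships.
    """
    results = {}
    for name in names:
        try:
            importlib.import_module(name)
            results[name] = None
        except Exception as exc:
            # Not only ImportError: a binary mismatch can surface as ValueError.
            results[name] = f"{type(exc).__name__}: {exc}"
    return results


if __name__ == "__main__":
    for name, error in check_modules().items():
        print(f"{name}: {'OK' if error is None else error}")
```

Run it with the same interpreter and as the same user as the service, since a different user or Python may see different site-packages.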
Additional info on this: on Ubuntu 22.04.1 LTS you will get the same error:
Loading osml10n transcription server:
ERROR: unable to load required python modules, please install them as follows:
pip install pykakasi
pip install tltk
pip install pinyin_jyutping_sentence
If you try to import tltk in python3, you will get:
Python 3.10.6 (main, Nov 14 2022, 16:10:14) [GCC 11.3.0] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import tltk
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "/usr/local/lib/python3.10/dist-packages/tltk/__init__.py", line 2, in <module>
from tltk import corpus
File "/usr/local/lib/python3.10/dist-packages/tltk/corpus.py", line 23, in <module>
import gensim
File "/usr/local/lib/python3.10/dist-packages/gensim/__init__.py", line 11, in <module>
from gensim import parsing, corpora, matutils, interfaces, models, similarities, utils # noqa:F401
File "/usr/local/lib/python3.10/dist-packages/gensim/corpora/__init__.py", line 6, in <module>
from .indexedcorpus import IndexedCorpus # noqa:F401 must appear before the other classes
File "/usr/local/lib/python3.10/dist-packages/gensim/corpora/indexedcorpus.py", line 14, in <module>
from gensim import interfaces, utils
File "/usr/local/lib/python3.10/dist-packages/gensim/interfaces.py", line 19, in <module>
from gensim import utils, matutils
File "/usr/local/lib/python3.10/dist-packages/gensim/matutils.py", line 1031, in <module>
from gensim._matutils import logsumexp, mean_absolute_difference, dirichlet_expectation
File "gensim/_matutils.pyx", line 1, in init gensim._matutils
ValueError: numpy.ndarray size changed, may indicate binary incompatibility. Expected 96 from C header, got 88 from PyObject
This can be fixed by running the following after installing tltk:
pip install --upgrade numpy
Resulting in:
87 tests passed, 0 tests failed.
Maybe worth a note in INSTALL.md?
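Both failures in this thread come down to which versions of tltk's dependencies are actually installed, so it may help to attach a version listing to a report. A rough sketch using the standard library's importlib.metadata (Python 3.8+); the distribution names are the ones that appear in the pip output above, and installed_versions is a hypothetical helper name:

```python
from importlib.metadata import PackageNotFoundError, version

# Distribution names as they appear in the pip output in this thread.
# "sklearn" and "scikit-learn" are listed separately because pip reports
# "sklearn" as satisfied above, while installing "scikit-learn" was the fix.
PACKAGES = ("tltk", "gensim", "numpy", "scikit-learn", "sklearn")


def installed_versions(names=PACKAGES):
    """Map each distribution name to its installed version, or None if absent."""
    result = {}
    for name in names:
        try:
            result[name] = version(name)
        except PackageNotFoundError:
            result[name] = None
    return result


if __name__ == "__main__":
    for name, ver in installed_versions().items():
        print(f"{name}: {ver if ver is not None else 'not installed'}")
```

This only reports what is installed; it does not detect a binary mismatch like the numpy one above, which still needs the import test.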
I hope this bug will not persist. I will reopen this issue and keep it open until the tltk package has been fixed.
Added a comment about python libraries
Just installed on Debian Bullseye 11.5.
From the comments in the script I figured out that this is caused by the tltk lib, but it is installed:
Any suggestions?