Open nlovell1 opened 3 years ago
Thanks so much for taking a look at this @thinkingbox12. I am currently trying to reproduce, but on the latest version of Windows there seems to be an issue with one of Spacy's dependencies, numpy, that I need to work past.
The current Numpy installation fails to pass a sanity check due to a bug in the windows runtime. See this issue for more information: https://tinyurl.com/y3dm3h86
This appears to be an issue that Microsoft is in the process of fixing. If I can work around it I will try to reproduce your issue. At first glance, the error you are getting appears to be caused by something missing on your Windows machine that sudachi needs:
ImportError: DLL load failed while importing _dartsclone: The specified module could not be found.
I will report back as soon as I figure something out.
Had a little bit of time to get you the error I was having on Ubuntu when trying to install the Japanese (large) model. Let me know if there's anything else I can do.
running build_ext
cythoning sudachipy/latticenode.pyx to sudachipy/latticenode.c
/home/.local/share/Anki2/addons21/src/user_files/packages/Cython/Compiler/Main.py:369: FutureWarning: Cython directive 'language_level' not set, using 2 for now (Py2). This will change in a later release! File: /tmp/pip-install-1ydi_wzr/sudachipy_07a9ba150165499b926687bb1b596868/sudachipy/latticenode.pxd
tree = Parsing.p_module(s, pxd, full_module_name)
cythoning sudachipy/lattice.pyx to sudachipy/lattice.c
/home/.local/share/Anki2/addons21/src/user_files/packages/Cython/Compiler/Main.py:369: FutureWarning: Cython directive 'language_level' not set, using 2 for now (Py2). This will change in a later release! File: /tmp/pip-install-1ydi_wzr/sudachipy_07a9ba150165499b926687bb1b596868/sudachipy/lattice.pxd
tree = Parsing.p_module(s, pxd, full_module_name)
cythoning sudachipy/tokenizer.pyx to sudachipy/tokenizer.c
/home/.local/share/Anki2/addons21/src/user_files/packages/Cython/Compiler/Main.py:369: FutureWarning: Cython directive 'language_level' not set, using 2 for now (Py2). This will change in a later release! File: /tmp/pip-install-1ydi_wzr/sudachipy_07a9ba150165499b926687bb1b596868/sudachipy/tokenizer.pyx
tree = Parsing.p_module(s, pxd, full_module_name)
building 'sudachipy.latticenode' extension
creating build/temp.linux-x86_64-3.8
creating build/temp.linux-x86_64-3.8/sudachipy
gcc -pthread -Wno-unused-result -Wsign-compare -DNDEBUG -g -fwrapv -O3 -Wall -fPIC -I/usr/local/share/anki/bin/include/python3.8 -c sudachipy/latticenode.c -o build/temp.linux-x86_64-3.8/sudachipy/latticenode.o
ERROR: Exception:
Traceback (most recent call last):
File "/home/.local/share/Anki2/addons21/src/_vendor/distutils/unixccompiler.py", line 117, in _compile
self.spawn(compiler_so + cc_args + [src, '-o', obj] +
File "distutils/ccompiler.py", line 910, in spawn
File "distutils/spawn.py", line 36, in spawn
File "distutils/spawn.py", line 157, in _spawn_posix
distutils.errors.DistutilsExecError: command 'gcc' failed with exit status 1
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/home/.local/share/Anki2/addons21/src/_vendor/distutils/core.py", line 148, in setup
dist.run_commands()
File "/home/.local/share/Anki2/addons21/src/_vendor/distutils/dist.py", line 966, in run_commands
self.run_command(cmd)
File "/home/.local/share/Anki2/addons21/src/_vendor/distutils/dist.py", line 985, in run_command
cmd_obj.run()
File "/home/.local/share/Anki2/addons21/src/_vendor/setuptools/command/install.py", line 61, in run
return orig.install.run(self)
File "/home/.local/share/Anki2/addons21/src/_vendor/distutils/command/install.py", line 545, in run
self.run_command('build')
File "/home/.local/share/Anki2/addons21/src/_vendor/distutils/cmd.py", line 313, in run_command
self.distribution.run_command(command)
File "/home/.local/share/Anki2/addons21/src/_vendor/distutils/dist.py", line 985, in run_command
cmd_obj.run()
File "/home/.local/share/Anki2/addons21/src/_vendor/distutils/command/build.py", line 135, in run
self.run_command(cmd_name)
File "/home/local/share/Anki2/addons21/src/_vendor/distutils/cmd.py", line 313, in run_command
self.distribution.run_command(command)
File "/home/.local/share/Anki2/addons21/src/_vendor/distutils/dist.py", line 985, in run_command
cmd_obj.run()
File "/home/.local/share/Anki2/addons21/src/_vendor/setuptools/command/build_ext.py", line 79, in run
_build_ext.run(self)
File "/home/.local/share/Anki2/addons21/src/user_files/packages/Cython/Distutils/old_build_ext.py", line 186, in run
_build_ext.build_ext.run(self)
File "/home/.local/share/Anki2/addons21/src/_vendor/distutils/command/build_ext.py", line 340, in run
self.build_extensions()
File "/home/.local/share/Anki2/addons21/src/user_files/packages/Cython/Distutils/old_build_ext.py", line 195, in build_extensions
_build_ext.build_ext.build_extensions(self)
File "/home/.local/share/Anki2/addons21/src/_vendor/distutils/command/build_ext.py", line 449, in build_extensions
self._build_extensions_serial()
File "/home/.local/share/Anki2/addons21/src/_vendor/distutils/command/build_ext.py", line 474, in _build_extensions_serial
self.build_extension(ext)
File "/home/local/share/Anki2/addons21/src/_vendor/setuptools/command/build_ext.py", line 196, in build_extension
_build_ext.build_extension(self, ext)
File "/home/.local/share/Anki2/addons21/src/_vendor/distutils/command/build_ext.py", line 528, in build_extension
objects = self.compiler.compile(sources,
File "distutils/ccompiler.py", line 574, in compile
File "/home/.local/share/Anki2/addons21/src/_vendor/distutils/unixccompiler.py", line 120, in _compile
raise CompileError(msg)
distutils.errors.CompileError: command 'gcc' failed with exit status 1
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/home/local/share/Anki2/addons21/src/_vendor/pip/_internal/cli/base_command.py", line 224, in _main
status = self.run(options, args)
File "/home/.local/share/Anki2/addons21/src/_vendor/pip/_internal/cli/req_command.py", line 180, in wrapper
return func(self, options, args)
File "/home/.local/share/Anki2/addons21/src/_vendor/pip/_internal/commands/install.py", line 394, in run
installed = install_given_reqs(
File "/home/.local/share/Anki2/addons21/src/_vendor/pip/_internal/req/__init__.py", line 82, in install_given_reqs
requirement.install(
File "/home/.local/share/Anki2/addons21/src/_vendor/pip/_internal/req/req_install.py", line 840, in install
success = install_legacy(
File "/home/.local/share/Anki2/addons21/src/_vendor/pip/_internal/operations/install/legacy.py", line 95, in install
exec(theargs, globals(), globals())
File "<string>", line 1, in <module>
File "/tmp/pip-install-1ydi_wzr/sudachipy_07a9ba150165499b926687bb1b596868/setup.py", line 25, in <module>
setup(name="SudachiPy",
File "/home/.local/share/Anki2/addons21/src/_vendor/setuptools/__init__.py", line 153, in setup
return distutils.core.setup(**attrs)
File "/home/.local/share/Anki2/addons21/src/_vendor/distutils/core.py", line 163, in setup
raise SystemExit("error: " + str(msg))
SystemExit: error: command 'gcc' failed with exit status 1
Oh, that looks fun. So gcc is failing to compile sudachi on Linux. Both of these issues seem to be OS-related problems rather than problems with this Anki addon itself. It would still be good to figure out how to work around them and document the workarounds.
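The log above does not include gcc's own error message, so the actual cause is unknown. Two things that may be worth ruling out first (an assumption, not a diagnosis) are whether a compiler is on the PATH and whether the Python headers exist in the include directory being passed to gcc. A minimal sketch of such a check, where `build_prerequisites` is a hypothetical helper name:

```python
import os
import shutil
import sysconfig


def build_prerequisites(compiler="gcc", include_dir=None):
    """Report whether a C compiler and Python.h can be found.

    include_dir defaults to the include path of the running interpreter;
    pass the -I directory from the failing gcc command to check that one.
    """
    include_dir = include_dir or sysconfig.get_paths()["include"]
    return {
        "compiler_on_path": shutil.which(compiler) is not None,
        "python_h_present": os.path.isfile(os.path.join(include_dir, "Python.h")),
        "include_dir": include_dir,
    }
```

For the build above, the directory to check would be the one after `-I` in the gcc command line, for example `build_prerequisites(include_dir="/usr/local/share/anki/bin/include/python3.8")`.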
I'd love to help. Where do I begin? Can also test on a Mac tomorrow.
Ok. I have seen three problems so far and here are the workarounds
Windows only: There is an issue with spacy's dependency numpy and Windows. After installing a spacy model on Windows (if you have the Windows October Update 2004) you may see
The current Numpy installation ('<some_path_to_numpy_init_file>') fails to pass a sanity check due to a bug in the windows runtime. See this issue for more information: https://tinyurl.com/y3dm3h86
To fix this, open the some_path_to_numpy_init_file seen in the error. Replace the lines
    if sys.platform == "win32" and sys.maxsize > 2**32:
        _win_os_check()
with
    if sys.platform == "win32" and sys.maxsize > 2**32:
        # _win_os_check()
        pass
Once https://developercommunity.visualstudio.com/content/problem/1207405/fmod-after-an-update-to-windows-2004-is-causing-a.html is resolved this step should be unnecessary.
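If you would rather not edit the file by hand, the same change can be sketched as a small text transformation. This is only an illustration of the manual step above: `disable_win_os_check` is a hypothetical helper, and it assumes the call sits on a line by itself, as in the snippet quoted here.

```python
import re

# Matches a line that contains only the call, capturing its indentation.
_CALL = re.compile(r"^(?P<indent>[ \t]*)_win_os_check\(\)[ \t]*$", re.MULTILINE)


def disable_win_os_check(source):
    """Comment out the bare `_win_os_check()` call and leave `pass` in its place."""
    def _replace(match):
        indent = match.group("indent")
        return f"{indent}# _win_os_check()\n{indent}pass"

    return _CALL.sub(_replace, source)
```

To apply it, read the numpy `__init__.py` named in the error, pass its text through `disable_win_os_check`, and write the result back (keep a backup copy). Running it twice is harmless because the commented-out line no longer matches.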
Windows Only: You may see the error
ImportError: Japanese support requires SudachiPy and SudachiDict-core (https://github.com/WorksApplications/SudachiPy). Install with `pip install sudachipy sudachidict_core` or install spaCy with `pip install spacy[ja]`.
And further up in the error message you see
ImportError: DLL load failed while importing _dartsclone: The specified module could not be found.
This can be fixed by installing the Microsoft Visual C++ Redistributable.
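To tell this case apart from a package that simply is not installed, a quick check along these lines may help. It is a sketch only: `diagnose_import` is a hypothetical helper, and matching on the text "DLL load failed" is based on the message shown above rather than on any documented API.

```python
import importlib


def diagnose_import(module_name):
    """Try to import a module and classify the outcome as a short string."""
    try:
        importlib.import_module(module_name)
    except ModuleNotFoundError:
        # The module (or something it imports) is not on sys.path at all.
        return "not-installed"
    except ImportError as exc:
        # On Windows a missing runtime DLL surfaces as a plain ImportError.
        if "DLL load failed" in str(exc):
            return "dll-load-failed"
        return "import-error"
    return "ok"
```

Calling `diagnose_import("dartsclone")` from the same Python environment that the addon uses would be the relevant test here; a result of `"dll-load-failed"` points at the missing redistributable rather than at the pip install.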
Windows only: When running Morphman recalc you see the following error
OSError: [WinError 1314] A required privilege is not held by the client: 'C:\\workspace\\AnkiSpacy\\src\\user_files\\packages\\sudachidict_core' -> 'C:\\workspace\\AnkiSpacy\\src\\user_files\\packages\\sudachidict'
Further up in the error you will also see
File "C:\workspace\AnkiSpacy\src\user_files\packages\sudachipy\config.py", line 56, in create_default_link_for_sudachidict_core
dict_path = Path(import_module('sudachidict').__file__).parent
File "importlib\__init__.py", line 127, in import_module
File "<frozen importlib._bootstrap>", line 1014, in _gcd_import
File "<frozen importlib._bootstrap>", line 991, in _find_and_load
File "<frozen importlib._bootstrap>", line 973, in _find_and_load_unlocked
ModuleNotFoundError: No module named 'sudachidict'
This seems to be sudachi trying to create a symlink to the installed sudachidict (in this case, by default, sudachidict_core). I thought restarting Anki as administrator would fix the problem, but Morphman then disappeared from my Anki tools. I will try to figure that out tomorrow. In the meantime you can create the symlink yourself with
mklink /D <anki_spacy_addon_path>\packages\sudachidict <anki_spacy_addon_path>\packages\sudachidict_core
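The same link can also be created from Python. The sketch below is an illustration, not what sudachipy itself does: `link_sudachidict` is a hypothetical helper, and on Windows it still needs either an elevated process or Developer Mode, otherwise it fails with the same WinError 1314.

```python
from pathlib import Path


def link_sudachidict(packages_dir, source="sudachidict_core", link_name="sudachidict"):
    """Create packages_dir/link_name as a directory symlink to packages_dir/source."""
    packages = Path(packages_dir)
    target = packages / source
    link = packages / link_name
    if not target.is_dir():
        raise FileNotFoundError(f"{target} does not exist")
    if link.exists() or link.is_symlink():
        return link  # already present, nothing to do
    link.symlink_to(target, target_is_directory=True)
    return link
```

Here `packages_dir` would be the addon's `packages` folder, the same one that appears in the error message.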
Tomorrow I will look into the linux compiler issue.
I'd love to help. Where do I begin? Can also test on a Mac tomorrow.
See if you can get the Japanese working with the workarounds here. After that whatever things you can test out are appreciated. Chinese may be good because I believe it also requires extra 3rd party dependencies like Japanese.
Did not mean to close this. Reopen.
Japanese small model (windows) worked after these fixes (install and recalc). Went to install medium model right after. Got this exception
Traceback (most recent call last):
File "C:\Users\AppData\Roaming\Anki2\addons21\src\_vendor\pip\_internal\cli\base_command.py", line 224, in _main
status = self.run(options, args)
File "C:\Users\AppData\Roaming\Anki2\addons21\src\_vendor\pip\_internal\cli\req_command.py", line 180, in wrapper
return func(self, options, args)
File "C:\Users\AppData\Roaming\Anki2\addons21\src\_vendor\pip\_internal\commands\install.py", line 452, in run
self._handle_target_dir(
File "C:\Users\AppData\Roaming\Anki2\addons21\src\_vendor\pip\_internal\commands\install.py", line 505, in _handle_target_dir
shutil.rmtree(target_item_dir)
File "shutil.py", line 730, in rmtree
File "shutil.py", line 608, in _rmtree_unsafe
File "shutil.py", line 606, in _rmtree_unsafe
PermissionError: [WinError 5] Access is denied: 'C:\\Users\\AppData\\Roaming\\Anki2\\addons21\\src\\user_files\\packages\\dartsclone\\_dartsclone.cp38-win_amd64.pyd'
I noticed it reinstalls all the packages for each model (cython, sortedcontainers, dartsclone, sudachipy). Just something to note
An immediate recalc (large model) resulted in this error:
Anki 2.1.35 (84dcaa86) Python 3.8.0 Qt 5.14.2 PyQt 5.14.2
Platform: Windows 10
Flags: frz=True ao=True sv=1
Add-ons, last update check: 2021-01-04 22:26:33
Caught exception:
Traceback (most recent call last):
File "C:\Users\AppData\Roaming\Anki2\addons21\MorphMan\__init__.py", line 20, in onMorphManRecalc
main.main()
File "C:\Users\AppData\Roaming\Anki2\addons21\MorphMan\morph\main.py", line 573, in main
allDb = mkAllDb(cur)
File "C:\Users\AppData\Roaming\Anki2\addons21\MorphMan\morph\main.py", line 195, in mkAllDb
ms = getMorphemes(morphemizer, fieldValue, ts)
File "C:\Users\AppData\Roaming\Anki2\addons21\MorphMan\morph\morphemes.py", line 166, in getMorphemes
ms = morphemizer.getMorphemesFromExpr(expression)
File "C:\Users\AppData\Roaming\Anki2\addons21\MorphMan\morph\morphemizer.py", line 52, in getMorphemesFromExpr
morphs = self._getMorphemesFromExpr(expression)
File "C:\Users\AppData\Roaming\Anki2\addons21\MorphMan\morph\deps\spacy\morphemizer.py", line 21, in _getMorphemesFromExpr
self.nlp = spacy.load(self.model_path)
File "C:\Users\AppData\Roaming\Anki2\addons21\src\user_files\packages\spacy\__init__.py", line 30, in load
return util.load_model(name, **overrides)
File "C:\Users\AppData\Roaming\Anki2\addons21\src\user_files\packages\spacy\util.py", line 172, in load_model
return load_model_from_path(Path(name), **overrides)
File "C:\Users\AppData\Roaming\Anki2\addons21\src\user_files\packages\spacy\util.py", line 220, in load_model_from_path
component = nlp.create_pipe(factory, config=config)
File "C:\Users\AppData\Roaming\Anki2\addons21\src\user_files\packages\spacy\language.py", line 310, in create_pipe
raise KeyError(Errors.E002.format(name=name))
KeyError: "[E002] Can't find factory for 'parser'. This usually happens when spaCy calls `nlp.create_pipe` with a component name that's not built in - for example, when constructing the pipeline from a model's meta.json. If you're using a custom component, you can write to `Language.factories['parser']` or remove it from the model meta and add it via `nlp.add_pipe` instead."
However, after a restart of Anki, I could again recalc using the large model. Why is this behaviour occurring?
PermissionError: [WinError 5] Access is denied: 'C:\Users\AppData\Roaming\Anki2\addons21\src\user_files\packages\dartsclone\_dartsclone.cp38-win_amd64.pyd'
Do you get this every time you repeat the same steps? I am wondering if it is similar to the permissions problem for symlinks. If you can reproduce this, can you enable Developer Mode in Windows and see if that fixes it?
Settings -> Update & Security -> For Developers -> Developer Mode
I noticed it reinstalls all the packages for each model (cython, sortedcontainers, dartsclone, sudachipy). Just something to note
This is known. The code to install via pip uses the -t and --upgrade options. In this case it always tries to install the packages even if they are already there. It is unfortunate but I could not find a better way to do it.
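For reference, the behaviour described above can be sketched as follows, together with one possible way to skip packages that are already in the target directory. This is an assumption-laden illustration, not the addon's code: both function names are hypothetical, the addon calls its vendored pip rather than `sys.executable -m pip`, and the check ignores versions entirely.

```python
import sys
from importlib import metadata


def pip_install_args(package, target_dir):
    """Command line equivalent to `pip install -t <target_dir> --upgrade <package>`."""
    return [sys.executable, "-m", "pip", "install",
            "--target", str(target_dir), "--upgrade", package]


def installed_in_target(package, target_dir):
    """Return True if target_dir already holds metadata for the named distribution."""
    wanted = package.lower().replace("_", "-")
    for dist in metadata.distributions(path=[str(target_dir)]):
        name = dist.metadata.get("Name", "") or ""
        if name.lower().replace("_", "-") == wanted:
            return True
    return False
```

A caller could test `installed_in_target("dartsclone", packages_dir)` before invoking pip for a dependency. Whether that is safe depends on version requirements between models, which this sketch does not look at.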
KeyError: "[E002] Can't find factory for 'parser'. This usually happens when spaCy calls `nlp.create_pipe` with a component name that's not built in - for example, when constructing the pipeline from a model's meta.json. If you're using a custom component, you can write to `Language.factories['parser']` or remove it from the model meta and add it via `nlp.add_pipe` instead."
I am not sure what is happening here. I will try to reproduce this.
PermissionError: [WinError 5] Access is denied: 'C:\Users\AppData\Roaming\Anki2\addons21\src\user_files\packages\dartsclone\_dartsclone.cp38-win_amd64.pyd'
I have reproduced this. If you install ja_core_news_md, recalc, and then try to install any other model that depends on dartsclone, you will see this error. I believe it is because the file listed in the error is still being held open by the Anki process. If you restart Anki and try to install again, you won't see this error. I believe, though, that even when you see the error the model has been installed properly; you just need to restart. This is a byproduct of what you astutely observed: dependencies are reinstalled for each model. It is unfortunate. I will try to think of a fix, but at this point I am not sure how to resolve this without restarting Anki.
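One way to at least detect this situation before calling pip is to look at which already-imported modules live under the packages directory. The sketch below assumes that a module imported from that directory is what keeps the `.pyd` locked; `modules_loaded_from` is a hypothetical helper.

```python
import os
import sys


def modules_loaded_from(directory):
    """Names of imported modules whose file lives under the given directory."""
    root = os.path.normcase(os.path.realpath(directory))
    loaded = []
    for name, module in list(sys.modules.items()):
        path = getattr(module, "__file__", None)
        if not path:
            continue  # built-in or namespace module, no file on disk
        real = os.path.normcase(os.path.realpath(path))
        if real.startswith(root + os.sep):
            loaded.append(name)
    return sorted(loaded)
```

If the returned list is non-empty, the installer could tell the user to restart Anki first instead of letting pip fail halfway through.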
I continue to run into problems, all of them around Chinese and Japanese and most of them on Windows. For example, it becomes impossible to remove some models or spacy if Morphman has already loaded the packages. This is because Windows will not allow you to delete files that are in use, and Anki has already loaded some DLLs from the packages. My intent all along was to create an easy-to-use package manager that did not require the user to install Python. But trying to install packages and load them while Anki is already running is turning out to be a large pain. I am partially considering ditching the GUI package manager and instead just giving users instructions on how to install packages via pip.
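If the GUI package manager stays, the removal failure could at least be reported cleanly instead of surfacing as a traceback. A minimal sketch, where `try_remove_package` is a hypothetical helper and the returned strings are made up for illustration:

```python
import shutil


def try_remove_package(path):
    """Delete a package directory and report what happened as a short string."""
    try:
        shutil.rmtree(path)
    except FileNotFoundError:
        return "absent"
    except PermissionError:
        # On Windows this is what a DLL still loaded by the process produces.
        return "in-use"
    return "removed"
```

Note that a result of `"in-use"` can leave the directory partially deleted, so the caller would still need to ask for a restart and retry afterwards.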
I don't think it would be all that bad. I would imagine the users that would benefit from the more precise Japanese parsing wouldn't mind running cmd.
I'd like to start investigating the other issue I posted. I am a beginner to troubleshooting/testing, what exactly does setting up a dev environment consist of and how is it different from just running Anki as a normal user?
The simplest thing to do is to copy the contents of <this_repo>/src to <your_anki_home>/addons21/AnkiSpacy. You can then just modify the code and run Anki as normal to test. Alternatively you can symlink from your addons21 directory to wherever src is. Let me know if you have any questions.
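Both options can be scripted. The sketch below is just one way to do it: `install_dev_addon` is a hypothetical helper, and the two paths are whatever `<this_repo>/src` and `<your_anki_home>/addons21` are on your machine.

```python
import shutil
from pathlib import Path


def install_dev_addon(repo_src, addons_dir, name="AnkiSpacy", symlink=False):
    """Copy (or symlink) the repo's src folder into addons_dir/name."""
    src = Path(repo_src)
    dest = Path(addons_dir) / name
    if not src.is_dir():
        raise FileNotFoundError(f"{src} is not a directory")
    if dest.exists() or dest.is_symlink():
        raise FileExistsError(f"{dest} already exists")
    if symlink:
        # Edits in the repo show up in Anki without copying again.
        dest.symlink_to(src.resolve(), target_is_directory=True)
    else:
        shutil.copytree(src, dest)
    return dest
```

With `symlink=True` on Windows the same privilege caveat applies as for the sudachidict link discussed earlier in this thread.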
Okay. I think I've zeroed in on a problem I was having. For whatever reason, when recalculating cards with any Spacy-based morphemizer (after generating a frequency list from a study plan in the Readability Analyzer), it is not reading from the frequency.txt file, and therefore no cards are tagged with the 'mm_FrequencyList' tag. It seems that this error occurs regardless of the morphemizer I use, as it fails with both MeCab and Spacy. Moreover, MeCab has no trouble reading a frequency.txt generated by Spacy. So I have no idea why this is occurring: if Spacy were broken it wouldn't recalc properly at all. It just seems there's a bug with reading frequency.txt and tagging cards or reordering their priority accordingly. What do you think?
Ok. It seems the problem is, when using Spacy, the master_freq counter in frequency.txt is not updating. Still unsure as to why.
Update #2. The new models seem to be working when the frequency.txt is generated with the readability tool WITH THE MODEL that you want to read it with. This means previously generated frequency lists (typically made with MeCab) are incompatible. I'm not sure why this is. It probably has something to do with how spacy lists results and classifies parses, but I haven't figured it out further.
Despite this, morphman does not tag whether a card is on your frequency list or not.
Additionally, it feels that the cards aren't adhering to the study plan as well as I might have remembered (this might totally be me though, I wasn't particularly familiar with the scoring algorithm before I looked at the code, so maybe it's working the same way as before).
See this reply: https://github.com/kaegi/MorphMan/pull/221#issuecomment-754379723