mortii / anki-morphs

A MorphMan fork rebuilt from the ground up with a focus on simplicity, performance, and a codebase with minimal technical debt.
https://mortii.github.io/anki-morphs/
Mozilla Public License 2.0

Add spaCy morphemizers #62

Closed mortii closed 8 months ago

mortii commented 10 months ago

Discussed in https://github.com/mortii/anki-morphs/discussions/13

Originally posted by **Vilhelm-Ian** September 23, 2023

For languages with spaces, like German, there is no dictionary, so K and V are the same. So MorphMan gives me the same word with different endings. It would be nice if MorphMan used spaCy.

Vilhelm-Ian commented 10 months ago

Doing this with spaCy is easy: all you have to do is read `token.lemma_` to get the lemma.
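As a rough illustration of the lemma lookup mentioned above, here is a minimal sketch. In spaCy, `token.lemma_` is the string form of the lemma, while `token.lemma` is its integer hash. The function names `lemmatize` and `load_pipeline` are hypothetical, and the default model name `de_core_news_sm` is only an assumption about which model a German learner might have installed.

```python
def lemmatize(nlp, text):
    """Return (surface form, lemma) pairs for the word tokens in `text`.

    `nlp` is any callable that yields spaCy-style tokens, e.g. a loaded
    spaCy pipeline. Whitespace and punctuation tokens are skipped.
    """
    pairs = []
    for token in nlp(text):
        if token.is_space or token.is_punct:
            continue
        # lemma_ (with underscore) is the string; lemma is an integer hash.
        pairs.append((token.text, token.lemma_))
    return pairs


def load_pipeline(model_name="de_core_news_sm"):
    """Load a spaCy pipeline by name (assumes spaCy and the model are installed)."""
    import spacy  # imported lazily so this module loads even without spaCy

    return spacy.load(model_name)
```

With spaCy and a model available, `lemmatize(load_pipeline(), "Die Häuser sind alt.")` would return one pair per word, so inflected forms of the same word could be grouped under a shared lemma.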

The hard part that I can't get around is packaging spaCy so that its C++ dependencies, like numpy, work.

For the models themselves, we could go one of two ways.

  1. Have the user download the model, drag and drop its folder into the addon folder, and then select the model from within the addon.

  2. Or do something like this from the addon: `subprocess.Popen('pip install -t ./path model', shell=True)`

Here is someone who tried to make it into a separate addon: https://github.com/rteabeault/AnkiSpacy
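The second option could be sketched roughly as follows. This is not how the addon necessarily does it: `build_install_command` and `install_model` are hypothetical helpers, and the sketch assumes that `sys.executable` points at a Python interpreter with pip available, which may not hold inside a packaged Anki build. It passes the command as a list instead of using `shell=True`, so paths with spaces do not need quoting.

```python
import subprocess
import sys
from pathlib import Path


def build_install_command(package_spec, target_dir):
    """Build a `pip install --target` command as an argument list."""
    return [
        sys.executable, "-m", "pip", "install",
        "--target", str(target_dir),  # same as pip's -t flag
        package_spec,
    ]


def install_model(package_spec, target_dir):
    """Install `package_spec` into `target_dir` and make it importable.

    Assumption: `package_spec` is something pip can resolve, such as a
    wheel URL or local file for a spaCy model.
    """
    target_dir = Path(target_dir)
    target_dir.mkdir(parents=True, exist_ok=True)
    # check=True raises CalledProcessError if pip exits with an error.
    subprocess.run(build_install_command(package_spec, target_dir), check=True)
    # Packages installed with --target are only importable once the
    # directory is on sys.path.
    if str(target_dir) not in sys.path:
        sys.path.insert(0, str(target_dir))
```

Keeping the command construction separate from the call makes it possible to log or inspect the exact pip invocation before running it.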

mortii commented 10 months ago

OMG! That add-on looks perfect for our use case (at least at first glance). That could potentially take care of a lot of work for us. We should probably try to leverage that add-on first; if that doesn't work, then maybe try something like your 1st suggestion?

Do you want to experiment with this, @Vilhelm-Ian? There are things I want to do first before I really start working on this, and if you already have some trial-and-error experience, then we could potentially get it working faster.

mortii commented 10 months ago

Or maybe first just use any way possible to import spaCy, just to test whether it produces our desired output; then we can worry about the importing later?

Vilhelm-Ian commented 10 months ago

Of course, I was just throwing out ideas for how it could be done.

mortii commented 10 months ago

Brainstorming is great. We should probably start using #13 for any further discussion and trial-and-error results. This issue can just be a placeholder for the project roadmap view.

mortii commented 8 months ago

Released in v0.11.