Closed: mortii closed this issue 7 months ago
I am really excited about this! I can't believe how fast you are able to add new features to Ankimorphs.
It should be identical to morphman, but I'm not able to test it extensively because:
So any feedback would be welcome!
I've done some spot checking, and I can confirm that the morphs identified are the same ones that Morphman finds. Before doing anything, I deleted my ankimorphs.db to ensure that everything it was populating was new.
But Ankimorphs says that I know a lot more morphs than Morphman does. Trying to track this down, I have a card with the text: 我没想瞒着任何人的
am-highlighted looks like this:
<span morph-status="known">我</span><span morph-status="known">没</span><span morph-status="known">想</span><span morph-status="unknown">瞒</span><span morph-status="known">着</span><span morph-status="known">任何人</span><span morph-status="known">的</span>
The word 任何人 shows up in Ankimorphs as known, but it is unknown with Morphman. I can confirm with an Anki search that there are no cards with that morph that are not new.
I then changed back to spaCy. In order to get am-highlighted to update, I had to delete the ankimorphs.db again. Here is what it looks like with spaCy:
<span morph-status="known">我</span><span morph-status="known">没</span><span morph-status="known">想</span><span morph-status="unknown">瞒</span><span morph-status="known">着</span><span morph-status="known">任何</span><span morph-status="known">人</span><span morph-status="known">的</span>
As you can see, spaCy has it separated into two morphs. Both of those individual morphs are on known cards: 遇到任何困难 and just 人. I'm not sure if it is coincidence, or if there is some remaining pointer to spaCy or there is another database that I need to delete?
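For anyone who wants to compare the two am-highlighted outputs above mechanically rather than by eye, here is a minimal sketch. It assumes the field contains only flat `<span morph-status="...">` elements like the ones pasted above; the helper names (`parse_highlighted`, `segmentation_diff`) are made up for illustration and are not part of AnkiMorphs.

```python
import re

# Matches the flat spans shown in the am-highlighted field above.
SPAN_RE = re.compile(r'<span morph-status="([^"]+)">([^<]*)</span>')


def parse_highlighted(html: str) -> list[tuple[str, str]]:
    """Return (morph, status) pairs in order of appearance."""
    return [(text, status) for status, text in SPAN_RE.findall(html)]


def segmentation_diff(a, b):
    """Return the morphs found only in `a` and the morphs found only in `b`."""
    morphs_a = {morph for morph, _ in a}
    morphs_b = {morph for morph, _ in b}
    return morphs_a - morphs_b, morphs_b - morphs_a


# Output with the "AnkiMorphs: Chinese" morphemizer (copied from above).
chinese_html = (
    '<span morph-status="known">我</span><span morph-status="known">没</span>'
    '<span morph-status="known">想</span><span morph-status="unknown">瞒</span>'
    '<span morph-status="known">着</span><span morph-status="known">任何人</span>'
    '<span morph-status="known">的</span>'
)

# Output with the spaCy morphemizer (copied from above).
spacy_html = (
    '<span morph-status="known">我</span><span morph-status="known">没</span>'
    '<span morph-status="known">想</span><span morph-status="unknown">瞒</span>'
    '<span morph-status="known">着</span><span morph-status="known">任何</span>'
    '<span morph-status="known">人</span><span morph-status="known">的</span>'
)

only_chinese, only_spacy = segmentation_diff(
    parse_highlighted(chinese_html), parse_highlighted(spacy_html)
)
print("only in AnkiMorphs: Chinese:", only_chinese)
print("only in spaCy:", only_spacy)
```

On this card the only difference is the 任何人 versus 任何 + 人 split, which matches what I saw by eye.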
> The word 任何人 shows up in Ankimorphs as known, but it is unknown with Morphman. I can confirm with an Anki search that there are no cards with that morph that are not new.
I'm very tired, so parsing multiple negations is hard at the moment, but you are saying that AnkiMorphs: Chinese correctly identifies 任何人 as known, but MorphMan does not?
> I'm not sure if it is coincidence, or if there is some remaining pointer to spaCy or there is another database that I need to delete?
No, only ankimorphs.db. However, speaking from experience, switching between morphemizers can lead to some cards being marked as known with one morphemizer, and the known tag sticks even if you switch to another morphemizer, so you can get a weird pattern where previously unknown cards/morphs become known. Removing all am-known-automatically tags from all cards should be safe, and should hopefully fix that kind of problem.
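If you would rather do the tag removal from Anki's debug console than from the browser, something along these lines should work. This is only a sketch: it assumes a recent Anki version where the collection exposes `find_notes` and `tags.bulk_remove`, and the function name `remove_auto_known_tags` is made up for this example, not something AnkiMorphs provides.

```python
def remove_auto_known_tags(col, tag="am-known-automatically"):
    """Strip `tag` from every note that carries it; return the number of notes found.

    `col` is expected to behave like Anki's collection object (assumption:
    it provides `find_notes(query)` and `tags.bulk_remove(note_ids, tags)`).
    """
    note_ids = list(col.find_notes(f"tag:{tag}"))
    if note_ids:
        col.tags.bulk_remove(note_ids, tag)
    return len(note_ids)


# In the Anki debug console, where `mw` is already available:
# print(remove_auto_known_tags(mw.col))
```

Back up the collection first, and run a Recalc afterwards so the morph statuses are rebuilt without the stale tags.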
> I'm very tired, so parsing multiple negations is hard at the moment, but you are saying that AnkiMorphs: Chinese correctly identifies 任何人 as known, but MorphMan does not?
No, it should have been unknown but AnkiMorphs said it was known.
> However, speaking from experience, switching between morphemizers can lead to some cards being marked as known with one morphemizer, and the known tag sticks even if you switch to another morphemizer, so you can get a weird pattern where previously unknown cards/morphs become known
This was the problem. Removing the tags fixed the problem. Thanks!
> No, it should have been unknown but AnkiMorphs said it was known.
If you can send me these files from your anki profile folder:
.apkg file)
then I can help debug this if you want.
Sorry I wasn't clear. Removing all of the tags & doing a recalc fixed all my issues. Thanks!
Originally posted by @xofm31 in https://github.com/mortii/anki-morphs/issues/174#issuecomment-1983442509
Originally posted by @mortii in https://github.com/mortii/anki-morphs/issues/174#issuecomment-1983640570
Todo list:
- .addon file
- morphemizer.py
- jieba_wrapper.py