mortii / anki-morphs

A MorphMan fork rebuilt from the ground up with a focus on simplicity, performance, and a codebase with minimal technical debt.
https://mortii.github.io/anki-morphs/
Mozilla Public License 2.0

Spacy to treat cards with an already seen base as known #120

Closed Vilhelm-Ian closed 7 months ago

Vilhelm-Ian commented 8 months ago

[image] This is the current card, shown as new.

These are the card's morphs: [image]

[image]

Here you can see that I already have a card that has already been learned and contains the same morph.

In the readability-report-generator you mentioned why it currently acts like that. But the issue I have is that I far too often see cards whose target morph I already know.

Could there be a toggle for this?

mortii commented 8 months ago

I use 'set known and skip' for situations like this. Is that not doable here?

Vilhelm-Ian commented 8 months ago

I just press K when it happens, but it happens too often.

mortii commented 8 months ago

How often is too often?

I think that would be the ultimate foot shooting option to be honest. It would work well for 'book' and 'books', but then you would completely miss 'gone' and 'went', which would be a disaster.
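The trade-off can be sketched in a few lines. This is only an illustration: the `LEMMAS` table is a hand-written stand-in for what a lemmatizer such as spaCy might return, and `treated_as_known` is a hypothetical helper, not an AnkiMorphs function.

```python
# Hand-written stand-in for lemmatizer output (assumption, not real spaCy data).
LEMMAS = {"book": "book", "books": "book", "go": "go", "went": "go", "gone": "go"}


def treated_as_known(inflection: str, studied: set[str]) -> bool:
    """Hypothetical 'ignore inflections' rule: a form counts as known
    as soon as any studied form shares its lemma."""
    studied_lemmas = {LEMMAS[form] for form in studied}
    return LEMMAS[inflection] in studied_lemmas


studied = {"book", "go"}
for form in ("books", "went", "gone"):
    # Every form prints True, including the irregular ones never studied.
    print(form, treated_as_known(form, studied))
```

Under this rule 'books' is skipped, which is the desired effect, but 'went' and 'gone' are skipped too, even though only 'go' was ever studied.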

Vilhelm-Ian commented 8 months ago

Multiple times every single day. And that's not the only issue: I am also missing out on a bunch of cards that have multiple morphs whose base form I have already seen.

mortii commented 8 months ago

> And that's not the only issue: I am also missing out on a bunch of cards that have multiple morphs whose base form I have already seen.

@Vilhelm-Ian I'm not sure what you mean by that, could you give an example?

Vilhelm-Ian commented 8 months ago
  1. First issue: I often see words I already know.
  2. Second issue: I won't see some words often enough. For example, let's say I know the words cat and mouse but don't know the word eat. I won't see the sentence "Cats often eat mice" for a very long time, because it contains two words that I haven't encountered yet, even though I have encountered their base forms.
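The second point can be made concrete with a small sketch. It is not AnkiMorphs code: the lemma table is hand-written, `unknown_morphs` is a hypothetical helper, and "often" is assumed to be known so that the example stays focused on the three content words.

```python
# Hand-written lemma table standing in for a real lemmatizer (assumption).
LEMMAS = {"cats": "cat", "cat": "cat", "mice": "mouse", "mouse": "mouse",
          "eat": "eat", "often": "often"}


def unknown_morphs(sentence: str, known: set[str], use_lemmas: bool) -> list[str]:
    """Return the tokens of `sentence` that count as unknown.

    With use_lemmas=False a token must itself be in `known`;
    with use_lemmas=True it is enough that its lemma is in `known`.
    """
    unknowns = []
    for token in sentence.lower().split():
        key = LEMMAS[token] if use_lemmas else token
        if key not in known:
            unknowns.append(token)
    return unknowns


known = {"cat", "mouse", "often"}  # "often" is assumed known for the example
sentence = "Cats often eat mice"
print(unknown_morphs(sentence, known, use_lemmas=False))  # ['cats', 'eat', 'mice']
print(unknown_morphs(sentence, known, use_lemmas=True))   # ['eat']
```

Counting by inflection gives three unknowns, so the card is pushed far back; counting by lemma leaves only 'eat', which would make it a good candidate for the next new card.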

I have immersed myself in German a lot and have gone through a lot of cards using AnkiMorphs/MorphMan. I am at a point where I have seen most lemmatized forms of the words I have already seen in Anki.

mortii commented 7 months ago

> For example, let's say I know the words cat and mouse but don't know the word eat. I won't see the sentence "Cats often eat mice" for a very long time, because it contains two words that I haven't encountered yet, even though I have encountered their base forms.

Perfect example, thank you.

> I have immersed myself in German a lot and have gone through a lot of cards using AnkiMorphs/MorphMan. I am at a point where I have seen most lemmatized forms of the words I have already seen in Anki.

So you are basically saying that you know German grammar well enough that seeing lemmas has just become noise, and you would rather see new words instead?

An 'ignore inflections' option would be very hard to implement I think, because it would require having two different versions of the algorithm that we would switch between during runtime. Such an undertaking would only be worth it if the current algorithm is completely unusable in my opinion.

I'm not sure I can come up with a scenario where the current algorithm would be unusable; if you are trying to learn 10 new cards a day and you have to press 'K' 100 times before you achieve that because you already understand the cards you see, I still would not consider that unusable--you would just have to shift your mindset from: "it's annoying to see so many words I already understand" to "I'm understanding so many words, I'm great at this language!"

Thoughts, @Vilhelm-Ian?

Vilhelm-Ian commented 7 months ago

> So you are basically saying that you know German grammar well enough that seeing lemmas has just become noise, and you would rather see new words instead?

Yes

> An 'ignore inflections' option would be very hard to implement

I'm not saying that this is the solution; I am just asking if I can use it as a starting point for my investigation.

def _get_morph_collection_priority(am_db: AnkiMorphsDB) -> dict[str, int]:

can we change it to

def _get_morph_collection_priority(am_db: AnkiMorphsDB) -> dict[tuple[str, str], int]:

mortii commented 7 months ago

> can we change it to [...]

We would have to rewrite the database structure to store the occurrence of both the base and inflected versions of a word, the caching would have to take that into account, and frequency files would have to have a completely different format.

Vilhelm-Ian commented 7 months ago

Is the frequency file format currently a single column? Because it used to be two columns.

These things are breaking changes. But if we made them, would adding the option still "require having two different versions of the algorithm"? I am guessing that in such a case we could just change which value from the tuple it treats as the morph.
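A rough sketch of that idea, assuming priorities are keyed by a hypothetical (lemma, inflection) tuple. The numbers are made up, `collapse_to_lemma` is not an AnkiMorphs function, and treating a lower number as a better priority is an assumption of this sketch.

```python
# Hypothetical priorities keyed by (lemma, inflection); the numbers are made up.
priorities: dict[tuple[str, str], int] = {
    ("go", "went"): 5,
    ("go", "gone"): 12,
    ("book", "books"): 7,
}


def collapse_to_lemma(by_pair: dict[tuple[str, str], int]) -> dict[str, int]:
    """Keep only the lemma from each key, using the best (lowest) priority
    among its inflections. 'Lower is better' is an assumption of this sketch."""
    by_lemma: dict[str, int] = {}
    for (lemma, _inflection), priority in by_pair.items():
        by_lemma[lemma] = min(priority, by_lemma.get(lemma, priority))
    return by_lemma


print(collapse_to_lemma(priorities))  # {'go': 5, 'book': 7}
```

Choosing which element of the tuple acts as "the morph" then becomes a single lookup decision, although the database, cache, and frequency files would still need to store both values.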

mortii commented 7 months ago

> Is the frequency file format currently a single column? Because it used to be two columns.

No, it should still be two columns. If you only see one, then please create a bug report.

Unfortunately, I think this option falls into the same category as the multiple input fields option; valid, but has too many downsides. Sorry brother :pray:

mortii commented 6 months ago

@Vilhelm-Ian in #155 I mentioned this:

> Alternatively we could maybe add an option in the 'recalc' settings to switch between two distinct algorithms

And I remembered that one of the main reasons I turned down this feature request was that it would be too unwieldy to have multiple difficulty algorithms. But thinking about it some more, I realized that they could potentially just be separate functions, and the function we want to use could just be an input parameter to _update_cards_and_notes().

So if you are able to implement a function that produces the difficulty score that solves this problem, then maybe we can proceed with adding it to AnkiMorphs.

So to clarify: if you can remake _get_card_difficulty_and_unknowns_and_learning_status() to produce your desired results, then we can maybe add that algorithm as a separate option people can choose instead of the current algorithm.
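The general shape of that proposal, with the scoring function passed in as a parameter, might look like the sketch below. Everything in it is hypothetical: `update_cards` only stands in for _update_cards_and_notes(), the two scorers are toy versions that count unknown morphs, and the real signatures in AnkiMorphs are not reproduced here.

```python
from typing import Callable

Morph = tuple[str, str]  # hypothetical (lemma, inflection) pair
DifficultyFn = Callable[[list[Morph], set[Morph]], int]


def score_by_inflection(morphs: list[Morph], studied: set[Morph]) -> int:
    """Toy scorer: a morph is unknown unless this exact inflection was studied."""
    return sum(1 for morph in morphs if morph not in studied)


def score_by_lemma(morphs: list[Morph], studied: set[Morph]) -> int:
    """Toy scorer: a morph is unknown only if no studied form shares its lemma."""
    studied_lemmas = {lemma for lemma, _inflection in studied}
    return sum(1 for lemma, _inflection in morphs if lemma not in studied_lemmas)


def update_cards(
    cards: dict[str, list[Morph]],
    studied: set[Morph],
    difficulty_fn: DifficultyFn,
) -> dict[str, int]:
    """Stand-in for the recalc step: apply whichever scorer was chosen."""
    return {card_id: difficulty_fn(morphs, studied) for card_id, morphs in cards.items()}


cards = {"card1": [("cat", "cats"), ("often", "often"), ("eat", "eat"), ("mouse", "mice")]}
studied = {("cat", "cat"), ("mouse", "mouse"), ("often", "often")}
print(update_cards(cards, studied, score_by_inflection))  # {'card1': 3}
print(update_cards(cards, studied, score_by_lemma))       # {'card1': 1}
```

With this layout a setting in 'recalc' would only need to select which function is handed over, and the rest of the update loop would stay unchanged.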
