mortii / anki-morphs

A MorphMan fork rebuilt from the ground up with a focus on simplicity, performance, and a codebase with minimal technical debt.
https://mortii.github.io/anki-morphs/
Mozilla Public License 2.0

Adjust the readability report generator to be more in line with the original morphman #130

Closed alexnigelsmith closed 7 months ago

alexnigelsmith commented 7 months ago

Could you potentially update the readability report generator to be more in line with the original morphman?

See image below: [image]

Main features:

  1. You can set a target percentage of known words to reach. The add-on will then create a study plan to get you there in the most efficient way possible. For example, if you only have 1 word to learn in episode 1, and the sentence is i+2 in Anki, the add-on will attempt to find a sentence to learn in order for that sentence in episode 1 to become i+1.
  2. You can import a frequency list e.g. Netflix frequency list to limit your words to top 5k words etc. If you don't import a frequency list, it will instead use the show's frequency list.
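
The first feature can be illustrated with a rough sketch. This is not MorphMan's or AnkiMorphs' actual algorithm; it only assumes that "known percentage" means the share of word occurrences in the media that are already known, that whitespace-separated words stand in for morphs, and that the plan simply picks the most frequent unknown words first. The names `coverage`, `study_plan`, and `unknown_count` are hypothetical.

```python
from collections import Counter


def coverage(tokens, known):
    """Fraction of token occurrences whose word is already known."""
    if not tokens:
        return 0.0
    return sum(1 for t in tokens if t in known) / len(tokens)


def study_plan(tokens, known, target):
    """Greedily pick unknown words, most frequent first, until coverage >= target.

    Assumption: learning a word makes every occurrence of it count as known.
    """
    counts = Counter(t for t in tokens if t not in known)
    covered = len(tokens) - sum(counts.values())
    plan = []
    for word, count in counts.most_common():
        if covered / len(tokens) >= target:
            break
        plan.append(word)
        covered += count
    return plan


def unknown_count(sentence_tokens, known):
    """The N in "i+N": number of distinct unknown words in a sentence."""
    return len({t for t in sentence_tokens if t not in known})
```

Under these assumptions, a sentence is i+1 when `unknown_count` returns 1, and learning a word from the plan can turn an i+2 sentence into an i+1 sentence.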

Not sure if these have been implemented yet, but they were the main reason I was using the original morphman!

See videos below for more detail: https://youtu.be/W9e-V6Q1AeU?si=h1DAGgFckjrKif7_ https://youtu.be/ey2nsLTH1jM?si=K7P0Bwii_IMpcooY

Vilhelm-Ian commented 7 months ago

  1. The first one we have discussed with morty. He found a better alternative.
  2. Can you clarify the second one? I think we already do that, but if you clarify, I will see what we are missing.

mortii commented 7 months ago

@alexnigelsmith Thanks for the suggestions!

First I just want to say that I think the morphman readability analyzer was waaay too complicated, and my goal was to simplify it.

> You can set a target percentage of known words to reach.

This option in morphman is functionally incomprehensible imo. I'm amazed anyone figured out what it was and how to use it. I'm not going to add it to ankimorphs, at least not in the same way.

@Vilhelm-Ian and I discussed an alternative in #114. I think having a percentage of how much of the text you know in the margin could be cool.

> The add-on will then create a study plan to get you there in the most efficient way possible. For example, if you only have 1 word to learn in episode 1, and the sentence is i+2 in Anki, the add-on will attempt to find a sentence to learn in order for that sentence in episode 1 to become i+1.

Creating a frequency file from the media will do the same thing, I think. No need for a study plan.
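
As a rough illustration of why a frequency file built from the media can stand in for a study plan, here is a minimal sketch. It is not how AnkiMorphs generates its frequency files: it assumes plain text input and uses a simple regex word split in place of real morph parsing. The function names are hypothetical.

```python
import re
from collections import Counter


def build_frequency_list(text):
    """Return (word, count) pairs from the text, most frequent first."""
    # Assumption: a lowercase regex word split stands in for a morphemizer.
    words = re.findall(r"\w+", text.lower())
    return Counter(words).most_common()


def unknown_by_frequency(text, known):
    """Unknown words from the text, ordered by how often they occur in it."""
    return [word for word, _ in build_frequency_list(text) if word not in known]
```

Studying the words returned by `unknown_by_frequency` in order targets the words that appear most often in that media first.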

> You can import a frequency list e.g. Netflix frequency list to limit your words to top 5k words etc. If you don't import a frequency list, it will instead use the show's frequency list.

Are you saying you limit the output of the readability report based on an imported frequency list?

alexnigelsmith commented 7 months ago

> @alexnigelsmith Thanks for the suggestions!

No problem! I loved the original version of Morphman and would like to switch over to using anki-morphs!

> First I just want to say that I think the morphman readability analyzer was waaay too complicated, and my goal was to simplify it.

> > You can set a target percentage of known words to reach.

> This option in morphman is functionally incomprehensible imo. I'm amazed anyone figured out what it was and how to use it. I'm not going to add it to ankimorphs, at least not in the same way.

> @Vilhelm-Ian and I discussed an alternative in #114. I think having a percentage of how much of the text you know in the margin could be cool.

I've just read number 4 in https://github.com/mortii/anki-morphs/discussions/114 and that's basically exactly what I meant. The percentage is the target comprehension you would like to reach in your chosen media. This can also work really well with the minimum frequency option. An example:

  1. Up until around 4k known words, it's probably best to use the media's own frequency list, as this is what will increase your comprehension the most. The target percentage would be set to around 96%.
  2. From 4k+, set the "Master Frequency List" to Netflix (or any other frequency list) and "Minimum Master Frequency" to 0, and raise the Target % from 96% to 98%. This will give a frequency list based on the media but weighted by the Netflix order.
  3. Once you get a show to 98%, switch "Minimum Master Frequency" to 250 or so, and set the Target % to 100%. This is intended to pick up any remaining somewhat useful words in the show.

This was quite a common strategy used by people who used the original morphman and it worked extremely well. I hope that made sense!

> > The add-on will then create a study plan to get you there in the most efficient way possible. For example, if you only have 1 word to learn in episode 1, and the sentence is i+2 in Anki, the add-on will attempt to find a sentence to learn in order for that sentence in episode 1 to become i+1.

> Creating a frequency file from the media will do the same thing, I think. No need for a study plan.

I agree to some extent! I just think it's quite useful to see which words from each episode of a show you still need to learn. Possibly just a quality of life thing, as I can imagine generating a frequency list is exactly the same.

> > You can import a frequency list e.g. Netflix frequency list to limit your words to top 5k words etc. If you don't import a frequency list, it will instead use the show's frequency list.

> Are you saying you limit the output of the readability report based on an imported frequency list?

The imported frequency list will just change the weighting of the words to learn. The media's frequency list will still be generated, but the order of the study-plan will change depending on the frequency list used. This can be used in various ways such as the example above.
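
One possible reading of this weighting, sketched under stated assumptions: the media's unknown words are still collected, but their study order follows the imported list, and a minimum-frequency cutoff drops words that are rare in that list. This is an interpretation of the description above, not MorphMan's actual implementation; `prioritize` and its parameters are hypothetical names.

```python
def prioritize(media_counts, known, master_counts=None, min_master_frequency=0):
    """Order the media's unknown words for study.

    Without a master list, words are ranked by their count in the media.
    With one, words are ranked by their count in the master list, and words
    below min_master_frequency are dropped (an assumed interpretation).
    """
    unknown = [w for w in media_counts if w not in known]
    if master_counts is None:
        return sorted(unknown, key=lambda w: -media_counts[w])
    kept = [w for w in unknown if master_counts.get(w, 0) >= min_master_frequency]
    # Ties in the master list fall back to the media's own counts.
    return sorted(kept, key=lambda w: (-master_counts.get(w, 0), -media_counts[w]))
```

With a cutoff of 0, every unknown word from the media is kept and only the order changes; raising the cutoff filters out show-specific words that the master list considers rare.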

mortii commented 7 months ago

> I've just read number 4 in #114 and that's basically exactly what I meant. The percentage is the target comprehension you would like to reach in your chosen media. This can also work really well with the minimum frequency option. An example:
>
>   1. Up until around 4k known words, it's probably best to use the media's own frequency list, as this is what will increase your comprehension the most. The target percentage would be set to around 96%.
>   2. From 4k+, set the "Master Frequency List" to Netflix (or any other frequency list) and "Minimum Master Frequency" to 0, and raise the Target % from 96% to 98%. This will give a frequency list based on the media but weighted by the Netflix order.
>   3. Once you get a show to 98%, switch "Minimum Master Frequency" to 250 or so, and set the Target % to 100%. This is intended to pick up any remaining somewhat useful words in the show.

[...]

> The imported frequency list will just change the weighting of the words to learn. The media's frequency list will still be generated, but the order of the study-plan will change depending on the frequency list used. This can be used in various ways such as the example above.

Sorry, but I think these are unnecessarily complicated. This classifies as overfitting imo: a protocol that has been overly optimized around a bad system/tools. I don't want to add features to retrofit that approach; instead, I want to make something that will fundamentally work better, in a way that makes protocols like that unnecessary.

> I agree to some extent! I just think it's quite useful to see which words from each episode of a show you still need to learn. Possibly just a quality of life thing, as I can imagine generating a frequency list is exactly the same.

We'll work on making more detailed reports for the individual items (like showing specific morphs, etc.); hopefully that will tick off some of your boxes.

I'll start contributing to #114 soon, and if you ever have any feedback on the work there, I'd love to hear it! :pray:
