paploo / goi

My personal project to manage my 日本語 flashcards in Anki
MIT License

Grammar Rules Deck #14

Closed paploo closed 1 year ago

paploo commented 1 year ago

Need a grammar rules deck.

Each rule should have one or more examples. See JLPT先生 for some structure ideas.

The necessary cards are still TBD.

paploo commented 1 year ago

Let's get some specification together:

Using https://jlptsensei.com/jlpt-n5-grammar-list/ as a guide, a grammar rule is mainly:

  1. Title (preferred, phonetic, and kanji spellings)
  2. Meaning
  3. How to Use (a list of templates)
  4. Examples (one-to-many): Japanese (preferred, phonetic, furigana) and English
  5. UUIDs
  6. Types (RULE vs EXAMPLE; and maybe subtypes for kinds of rules in the future?)
  7. Sources (use existing tables)
  8. jlpt_level
  9. Tags (e.g. TE_RULE or NEGATIVE_FORM for things like `ーて + もいいです` and `全然 + negative`)
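As a rough sketch of how the fields above might hang together, here is one possible record shape. Python is used only for illustration; the class names (`Spelling`, `Example`, `GrammarRule`), the plain-string `sources`, and the sample rule and sentence are all assumptions, not the project's actual schema or data.

```python
from dataclasses import dataclass, field
from typing import List, Optional
from uuid import UUID, uuid4


@dataclass
class Spelling:
    """Hypothetical one-off spelling fields (no separate spellings table)."""
    preferred: str
    phonetic: str
    kanji: Optional[str] = None


@dataclass
class Example:
    """One example sentence: Japanese in three forms, plus English."""
    preferred: str
    phonetic: str
    furigana: str
    english: str
    id: UUID = field(default_factory=uuid4)


@dataclass
class GrammarRule:
    title: Spelling
    meaning: str
    how_to_use: List[str]          # list of usage templates
    examples: List[Example]        # one-to-many
    jlpt_level: Optional[int] = None
    sources: List[str] = field(default_factory=list)  # stand-in for the existing source tables
    tags: List[str] = field(default_factory=list)
    id: UUID = field(default_factory=uuid4)

    def __post_init__(self) -> None:
        # The issue asks for one or more examples per rule.
        if not self.examples:
            raise ValueError("a grammar rule needs at least one example")


# Illustrative data only.
rule = GrammarRule(
    title=Spelling(preferred="てもいいです", phonetic="てもいいです"),
    meaning="is okay to; may",
    how_to_use=["Verb (て form) + もいいです"],
    examples=[
        Example(
            preferred="ここに座ってもいいですか",
            phonetic="ここにすわってもいいですか",
            furigana="ここに{座|すわ}ってもいいですか",
            english="May I sit here?",
        )
    ],
    jlpt_level=5,
    tags=["TE_RULE"],
)
```

The RULE vs EXAMPLE type distinction is carried here by the two classes themselves; an explicit type field could be added if subtypes are needed later.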

A few thoughts:

  1. Do we want the overkill of the spellings table for these, or just make them one-off fields since this is all we really need? The only downside to not using a spellings table is it will be harder to incorporate into
  2. Is there any way we can encode furigana properly on examples? e.g. HTML markup, which can be rendered in Anki, or a simpler markup; how does this affect phonetic playback audio? Do we care about playback at this point? Maybe for example flashcards.

paploo commented 1 year ago

For furigana, maintaining/reading a sheet might be easier with something like: `{行|い}きました` or `{神|じん}{社|じゃ}`.

Alternatively, I could have each separate with per-character tokenization: `行きました` and `い|き|ま|し|た`, or `神社` and `じん|じゃ`. To fit in one field, we could combine, like: `{神社}{じん|じゃ}`

The first:

The second:

The conclusion might be to combine these: `神社に行きました` could become: `{神社|じん・じゃ}に{行|い}きました`

This would produce: `<ruby>神<rt>じん</rt></ruby><ruby>社<rt>じゃ</rt></ruby><ruby>に</ruby><ruby>行<rt>い</rt></ruby><ruby>きました</ruby>`

which renders as 神社に行きました with じん, じゃ, and い shown as furigana above 神, 社, and 行.

This might be better done as an improvement towards the end, and not scoped in the initial work.
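A minimal sketch of that conversion, assuming the combined `{base|reading}` syntax with `・` separating per-character readings. The function names `to_ruby` and `to_phonetic` are hypothetical, and Python is used only for illustration. Plain runs are wrapped in a bare `<ruby>` element to match the output shown above.

```python
import re
from html import escape

# Either a {base|reading} group or a run of plain text without braces.
TOKEN = re.compile(r"\{([^{}|]+)\|([^{}|]+)\}|([^{}]+)")
GROUP = re.compile(r"\{[^{}|]+\|([^{}|]+)\}")
SEPARATOR = "・"


def to_ruby(text: str) -> str:
    """Convert {base|reading} markup into HTML ruby elements."""
    parts = []
    pos = 0
    for match in TOKEN.finditer(text):
        if match.start() != pos:
            raise ValueError(f"malformed markup at offset {pos}")
        pos = match.end()
        base, reading, plain = match.groups()
        if plain is not None:
            parts.append(f"<ruby>{escape(plain)}</ruby>")
            continue
        readings = reading.split(SEPARATOR)
        if len(readings) == 1:
            pairs = [(base, reading)]            # one reading for the whole base
        elif len(readings) == len(base):
            pairs = list(zip(base, readings))    # one reading per character
        else:
            raise ValueError(f"reading count does not match base: {match.group(0)}")
        for b, r in pairs:
            parts.append(f"<ruby>{escape(b)}<rt>{escape(r)}</rt></ruby>")
    if pos != len(text):
        raise ValueError(f"malformed markup at offset {pos}")
    return "".join(parts)


def to_phonetic(text: str) -> str:
    """Replace each group with its reading, e.g. for a phonetic field."""
    return GROUP.sub(lambda m: m.group(1).replace(SEPARATOR, ""), text)


print(to_ruby("{神社|じん・じゃ}に{行|い}きました"))
print(to_phonetic("{神社|じん・じゃ}に{行|い}きました"))  # じんじゃにいきました
```

Deriving the phonetic string from the same field would mean the sheet only needs one column per example, which may also answer the playback question, since audio could be keyed off the derived reading.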

paploo commented 1 year ago

Note that a chunk of work associated with this is:

  1. Rearranging the importers/exporters/transformers into Base, Vocabulary and Grammar namespaces.
  2. Formalizing the idea of a pipeline that combines an importer, transform chain, and multiple exporters to define standard workflows, and then pre-building the standard chain.
  3. Retooling (or replacing) the transcoder binaries to use these pipelines.
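
The pipeline idea in item 2 could look something like the sketch below. Everything here is hypothetical: the `Pipeline` class, the callable-based importer/transformer/exporter types, and the dictionary records are illustrative assumptions in Python, not the project's actual classes.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, Iterable, List

Record = Dict[str, str]
Importer = Callable[[], Iterable[Record]]
Transformer = Callable[[List[Record]], List[Record]]
Exporter = Callable[[List[Record]], None]


@dataclass
class Pipeline:
    """One importer, an ordered transform chain, and any number of exporters."""
    importer: Importer
    transformers: List[Transformer] = field(default_factory=list)
    exporters: List[Exporter] = field(default_factory=list)

    def run(self) -> List[Record]:
        records = list(self.importer())
        for transform in self.transformers:
            records = transform(records)      # each step feeds the next
        for export in self.exporters:
            export(records)                   # every exporter sees the same final records
        return records


# Illustrative wiring with in-memory stand-ins for real importers/exporters.
def import_rules() -> Iterable[Record]:
    return [{"title": "てもいいです", "meaning": "is okay to; may"}]


def tag_as_grammar(records: List[Record]) -> List[Record]:
    return [{**r, "type": "RULE"} for r in records]


anki_out: List[Record] = []
sheet_out: List[Record] = []

grammar_pipeline = Pipeline(
    importer=import_rules,
    transformers=[tag_as_grammar],
    exporters=[anki_out.extend, sheet_out.extend],
)
result = grammar_pipeline.run()
```

With this shape, a transcoder binary would only need to pick a pre-built pipeline and call `run()`.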

paploo commented 1 year ago

The PR for the branch wasn't the final one resolving the ticket.