Grammar Rules Deck - GitHub Issues

paploo commented 1 year ago

Need a grammar rules deck.

Each rule should have one or more examples. See JLPT Sensei (JLPT先生) for some structure ideas.

The necessary cards are still TBD.

paploo commented 1 year ago

Let's get some specification together:

Using https://jlptsensei.com/jlpt-n5-grammar-list/ as a guide, a grammar rule is mainly:

Title (preferred, phonetic, and kanji spellings)
Meaning
How to Use (a list of templates)
Examples (one-to-many): Japanese (preferred, phonetic, furigana) and English
UUIDs
Types (RULE vs EXAMPLE; and maybe subtypes for kinds of rules in the future?)
Sources (use existing tables)
jlpt_level
Tags (e.g. TE_RULE or NEGATIVE_FORM for things like 'ーて + もいいです' and '全然 + negative')
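
As a rough illustration of the field list above, here is a minimal sketch of how a rule and its examples might be modeled. Every class and field name here (`Spelling`, `Example`, `GrammarRule`, `EntryType`) is an assumption made for illustration, not goi's actual schema, and the sample rule is placeholder data.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import List, Optional
from uuid import UUID, uuid4


class EntryType(Enum):
    """RULE vs EXAMPLE, as listed under Types."""
    RULE = "RULE"
    EXAMPLE = "EXAMPLE"


@dataclass
class Spelling:
    """Hypothetical one-off spelling fields (no shared spellings table)."""
    preferred: str
    phonetic: str
    kanji: Optional[str] = None     # used by titles
    furigana: Optional[str] = None  # used by examples


@dataclass
class Example:
    japanese: Spelling
    english: str
    id: UUID = field(default_factory=uuid4)
    type: EntryType = EntryType.EXAMPLE


@dataclass
class GrammarRule:
    title: Spelling
    meaning: str
    how_to_use: List[str]            # list of templates
    examples: List[Example]          # one-to-many
    jlpt_level: Optional[int] = None
    tags: List[str] = field(default_factory=list)
    sources: List[str] = field(default_factory=list)  # placeholder for source-table keys
    id: UUID = field(default_factory=uuid4)
    type: EntryType = EntryType.RULE


# Illustrative data only.
rule = GrammarRule(
    title=Spelling(preferred="てもいいです", phonetic="てもいいです"),
    meaning="is okay to; may",
    how_to_use=["Verb (て form) + もいいです"],
    examples=[
        Example(
            japanese=Spelling(
                preferred="写真を撮ってもいいですか",
                phonetic="しゃしんをとってもいいですか",
            ),
            english="May I take a picture?",
        )
    ],
    jlpt_level=5,
    tags=["TE_RULE"],
)
```

Giving rules and examples their own UUIDs keeps the option open of generating separate flashcards for examples later.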

A few thoughts:

Do we want the overkill of the spellings table for these, or just make them one-off fields since this is all we really need? The only downside to not using a spellings table is it will be harder to incorporate into
Is there any way we can encode furigana properly on examples? E.g. HTML markup, which can be rendered in Anki, or a simpler markup. How does this affect phonetic playback audio? Do we care about playback at this point? Maybe for example flashcards.

paploo commented 1 year ago

For furigana, maintaining/reading a sheet might be easier with something like: {行|い}きました or {神|じん}{社|じゃ}.

Alternatively, I could have each separate with per-character tokenization: `行きました` and `い|き|ま|し|た`, or `神社` and `じん|じゃ`. To fit in one field, we could combine, like: `{神社}{じん|じゃ}`

The first:

Can be easily mapped to HTML using regex replacement, and
Can fit into one field, but
It is harder to author because it's interlaced.
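
A minimal sketch of the regex mapping mentioned for the first format, assuming each `{base|reading}` group contains no nested braces or extra pipes. The names `FURIGANA` and `to_ruby` are made up for this example.

```python
import re

# Matches one {base|reading} group, e.g. {行|い}.
FURIGANA = re.compile(r"\{([^{}|]+)\|([^{}|]+)\}")


def to_ruby(text: str) -> str:
    """Replace every {base|reading} group with an HTML ruby element."""
    return FURIGANA.sub(r"<ruby>\1<rt>\2</rt></ruby>", text)


print(to_ruby("{行|い}きました"))
print(to_ruby("{神|じん}{社|じゃ}"))
```

Text outside the braces passes through unchanged, so plain kana needs no markup at all.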

The second:

Is easier to read and hence maintain,
Is easy to parse because you can split on | and match by index, but
Takes more space (both width due to repeated kana and more fields).
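
The split-and-match-by-index parsing of the second format could look roughly like this. It is only a sketch: `pair_readings` and `render_pairs` are hypothetical names, and it assumes exactly one reading token per character of the text.

```python
from typing import List, Tuple


def pair_readings(text: str, readings: str) -> List[Tuple[str, str]]:
    """Split readings on '|' and pair them with the characters of text by index."""
    parts = readings.split("|")
    if len(parts) != len(text):
        raise ValueError(
            f"expected {len(text)} readings for {text!r}, got {len(parts)}"
        )
    return list(zip(text, parts))


def render_pairs(pairs: List[Tuple[str, str]]) -> str:
    """Emit ruby markup only where the reading differs from the character."""
    return "".join(
        f"<ruby>{char}<rt>{reading}</rt></ruby>" if char != reading else char
        for char, reading in pairs
    )


pairs = pair_readings("行きました", "い|き|ま|し|た")
print(render_pairs(pairs))
```

The length check catches the most likely authoring mistake with this format, a reading list that drifts out of step with the text.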

The conclusion might be to combine these: 神社に行きました could become:

{神社|じん・じゃ}に{行|い}きました

This would produce:

<ruby>神<rt>じん</rt></ruby><ruby>社<rt>じゃ</rt></ruby><ruby>に</ruby><ruby>行<rt>い</rt></ruby><ruby>きました</ruby>

which renders as: 神じん社じゃに行いきました

This might be better done as an improvement towards the end, and not scoped in the initial work.
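
A sketch of a converter for the combined format, written to reproduce the HTML quoted above (including wrapping plain kana runs in their own `<ruby>` element). The function name `combined_to_ruby` and the error handling are assumptions. It treats `・` as the per-character reading separator and accepts either one reading for the whole group or one reading per character.

```python
import re

# Matches one {base|readings} group; readings may be split per character by "・".
GROUP = re.compile(r"\{([^{}|]+)\|([^{}|]+)\}")


def combined_to_ruby(text: str) -> str:
    out = []
    pos = 0
    for match in GROUP.finditer(text):
        # Plain text between groups gets its own ruby element without <rt>.
        if match.start() > pos:
            out.append(f"<ruby>{text[pos:match.start()]}</ruby>")
        base = match.group(1)
        readings = match.group(2).split("・")
        if len(readings) == 1:
            # One reading for the whole base.
            out.append(f"<ruby>{base}<rt>{readings[0]}</rt></ruby>")
        elif len(readings) == len(base):
            # One reading per character, matched by index.
            out.extend(
                f"<ruby>{char}<rt>{reading}</rt></ruby>"
                for char, reading in zip(base, readings)
            )
        else:
            raise ValueError(f"cannot align {readings!r} with {base!r}")
        pos = match.end()
    if pos < len(text):
        out.append(f"<ruby>{text[pos:]}</ruby>")
    return "".join(out)


print(combined_to_ruby("{神社|じん・じゃ}に{行|い}きました"))
```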

paploo commented 1 year ago

Note that a chunk of work associated with this is:

Rearranging the importers/exporters/transformers into Base, Vocabulary and Grammar namespaces.
Formalizing the idea of a pipeline that combines an importer, a transform chain, and multiple exporters to define standard workflows, and then pre-building the standard chain.
Retooling (or replacing) the transcoder binaries to use these pipelines.
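
To make the pipeline idea concrete, here is a minimal sketch of one importer feeding a transform chain and then several exporters. None of these names (`Pipeline`, `Importer`, `Transformer`, `Exporter`, `Record`) come from goi. They are placeholders, and the real project may be structured quite differently.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, Iterable, List

Record = Dict[str, str]
Importer = Callable[[], Iterable[Record]]    # produces raw records
Transformer = Callable[[Record], Record]     # maps one record to another
Exporter = Callable[[List[Record]], None]    # consumes the finished records


@dataclass
class Pipeline:
    importer: Importer
    transformers: List[Transformer] = field(default_factory=list)
    exporters: List[Exporter] = field(default_factory=list)

    def run(self) -> List[Record]:
        records = []
        for record in self.importer():
            # Apply the transform chain in order.
            for transform in self.transformers:
                record = transform(record)
            records.append(record)
        # Every exporter receives the same transformed records.
        for export in self.exporters:
            export(records)
        return records


# Illustrative wiring with in-memory stand-ins for real importers/exporters.
collected: List[Record] = []
pipeline = Pipeline(
    importer=lambda: [{"title": "てもいいです"}],
    transformers=[lambda r: {**r, "type": "RULE"}],
    exporters=[collected.extend],
)
result = pipeline.run()
```

A "standard" pipeline would then just be a pre-built `Pipeline` value that a transcoder binary looks up and runs.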

paploo commented 1 year ago

The PR for the branch wasn't the final one resolving the ticket.

paploo / goi

Grammar Rules Deck #14