Vermoot / Pluvier

French steno theory/dictionary for Plover
GNU General Public License v3.0

Some words from the book are not in `tao_la_salle.json` #2

Open TomT-homas opened 2 years ago

TomT-homas commented 2 years ago

Hello, /S*ES is "zest" instead of "c'est cette" (as explained in "la tao cropped").

EDIT: 1) /TR is "terre", /T-R is "interest". /TR should be "intérêt", and /T-R should be "terre".

2) /SR is "serre", /SR should be "sur" but: /SUR is "sûr", and /S*UR is "sur". I think /SUR should be "sur", /S*UR should be "sûr", and we should keep /SR for "serre". What do you think?

3) I discovered that after a verb or a noun, /-S adds a final "s". That's great. But in a phrase like /TU/AU/-S, that gives "tu a eus". Not great. It should be "tu as eu". (For now, I haven't started on multi-syllable words, so I don't really know how they work. I'd like to propose that /AU/-S be "as eu", but I'll do that once I get there.)

{ "TR": "intérêt", "T-R": "terre", "SUR": "sur", "S*UR": "sûr" }

EDIT FROM VERMOOT: I took the liberty of editing your comment to add backticks in some places where they were needed for readability. As a general rule, try to write any steno outlines between backticks ;)

Vermoot commented 2 years ago

Alright, so this seems to be because the word isn't in @morinted's dictionary, in which he compiled the book's outlines. If there are any more missing words, this is the issue where we should track them.

Vermoot commented 2 years ago

Here are the words we've found so far. I'll compile every missing word here, and cross out the ones we've added:

{
"S*ES": "c'est cette",
"ET": "été",
"WALT": "va-t-il",
"KOUT": "coûte",
"KOUF": "couve",
"-BLGZ": "quel",
"TK-DZ": "de dire",
"TPO": "faut",
"TPRO*": "ferons",
"-PBZ": "nos",
"S-Z": "c'est sa",
"PA*EU": "paie"
}

Some misspelled entries as well:

-"ORPL": "homme",
+"OEPL": "homme",

-"PWAR/R-R": "can",
+"KAB/R-R": "can",

Vermoot commented 2 years ago

1) /TR is "terre", /T-R is "interest". /TR should be "intérêt", and /T-R should be "terre".

This is a different issue, those translations are not missing from tao_la_salle.json, but they're being overwritten by generated translations. I started issue #4 to keep track of this.
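The overwriting can be pictured as a merge-order problem. The sketch below is not the project's generator: it assumes two plain outline-to-translation mappings (`generated` and `curated`, both hypothetical names) and only shows how merging them in the wrong order lets a generated entry replace a book entry.

```python
def merge(low_priority: dict[str, str], high_priority: dict[str, str]) -> dict[str, str]:
    """Merge two dictionaries; entries in high_priority win on conflicting outlines."""
    merged = dict(low_priority)
    merged.update(high_priority)
    return merged


# Sample entries taken from this thread. Treating them as the output of two
# separate sources that get merged is an assumption made for illustration.
generated = {"TR": "terre", "SR": "serre"}
curated = {"TR": "intérêt", "SR": "sur"}

overwritten = merge(curated, generated)  # generated entries win: the reported behaviour
preserved = merge(generated, curated)    # book entries win

print(overwritten["TR"])
print(preserved["TR"])
```

With the curated entries applied last, the briefs from the book survive and the generated translations only fill in outlines the book does not define.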

/SR is "serre", /SR should be "sur" but: /SUR is "sûr", and /S*UR is "sur". I think /SUR should be "sur", /S*UR should be "sûr", and we should keep /SR for "serre". What do you think?

SR being "serre" right now is the same issue, the brief for "sur" is being overwritten by the generated translation.
As for your proposed outlines for "sur", "sûr" and "serre", I think the initial briefs for "sur" and "sûr" are good, and for now, at least while we're still so early in dictionary generation, I think we should keep them as they are. "serre" should be SAEUR in my opinion, which would be theory-consistent. Right now SAEUR gives "cher", which is also SHAEUR (and that's actually the outline I'd have gone for).

@stl74, any idea why "cher" is being generated as SAEUR? I can't find the rule for S- -> "ch" again.

I discovered that after a verb or a noun, /-S adds a final "s". That's great. But in a phrase like /TU/AU/-S, that gives "tu a eus". Not great. It should be "tu as eu". (For now, I haven't started on multi-syllable words, so I don't really know how they work. I'd like to propose that /AU/-S be "as eu", but I'll do that once I get there.)

That's also outside the scope of this issue. I kinda like your proposal about "as eu", but I think this would be difficult to implement as a rule during dictionary generation, since we don't really have a way of knowing where the pluralization should happen in cases like this. I've created issue #5 about proposed briefs so we can add this kind of brief that people think of, in addition to the ones found in tao_la_salle.json.
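To illustrate why an explicit brief is easier than a generation rule here, the sketch below models translation as a greedy longest-match lookup over stroke sequences. This is a deliberately simplified model, not Plover's actual translator. The entries `"AU": "a eu"` and `"-S": "{^s}"` are inferred from the output quoted above, and `translate` is a hypothetical helper.

```python
def translate(strokes: list[str], dictionary: dict[str, str], max_len: int = 3) -> str:
    """Greedy longest-match translation of a stroke sequence (simplified model)."""
    words: list[str] = []
    i = 0
    while i < len(strokes):
        # Try the longest outline first, then progressively shorter ones.
        for n in range(min(max_len, len(strokes) - i), 0, -1):
            outline = "/".join(strokes[i:i + n])
            if outline in dictionary:
                text = dictionary[outline]
                if text.startswith("{^") and text.endswith("}") and words:
                    # Suffix entry: attach to the previous word without a space.
                    words[-1] += text[2:-1]
                else:
                    words.append(text)
                i += n
                break
        else:
            # No entry found: emit the raw stroke.
            words.append(strokes[i])
            i += 1
    return " ".join(words)


# Assumed entries, reconstructed from the behaviour described in this thread.
base = {"TU": "tu", "AU": "a eu", "-S": "{^s}"}
with_brief = {**base, "AU/-S": "as eu"}

print(translate(["TU", "AU", "-S"], base))
print(translate(["TU", "AU", "-S"], with_brief))
```

In this model the two-stroke entry `AU/-S` is found before the single-stroke suffix gets a chance to attach, so a hand-written brief sidesteps the question of where the pluralization should happen.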

TomT-homas commented 2 years ago

Hello, I continued my lessons this week, and here is what I’ve found so far:

{
"ha,": "/HAFPLT",
"can": "/KAB/R-R",
".": "/-FLPT",
",": "/-RBGS",
"va-t-il": "/WALT",
"elle": "/HR-",
"plusieurs": "/PHR-",
"loue": "/HRO*U",
"cloue": "/KHRO*U",
"Proulx": "/PRO*U",
"coûte": "/KOUT",
"couve": "/KOUV",
"court": "/KO*UR"
}

I’m following the lessons with a critical eye. For example, she tells us that names take an *, and then there are names without one: "Hugues"="UG", and "Liz"="/LIZ". It’s okay if there are exceptions, but she doesn’t tell us. Same for some verbs.

This system is really a hack (bidouillage) between phonetics and abbreviation, which is great. I’m going to write an explanation of the strokes in French.

TomT-homas commented 2 years ago

Hello, I continued my lessons this past week, and here is what I’ve found so far:

{
"banque": "/PW",
"nombre": "/PWR",
"combien": "/KPW",
"bout": "/PWO*U",
"homme": "/OEPL",
"elle": "/HR-",
"loue": "/PHO*U",
"cloue": "/KPHO*U",
"verre": "/V-R",
"banquière": "/BA*ER",
"qui": "/KR",
"ce qui": "/SKR",
"quel": "/-BLG",
"brûle": "/PWRUL",
"de dire": "/TK-DZ",
"en": "/TPH",
"faut": "/TPO",
"feront": "/TPRO*",
"nos": "/PBZ",
"c’est sa": "/S-Z",
"fête": "/TPET",
"paie": "/PA*EU"
}

I’m starting to do words of more than one stroke!

{
"étain": "/ET/EUPB",
"éteint": "/ET/*EUPB",
"Élyse": "/EL/*EUZ",
"formidable": "/TPORPL/PWABL"
}

And then there’s the problem with numbers: the ones I’ve tried so far come out as figures, not spelled out in letters…

{
"deux": "TKAO",
"trois": "/TROEUZ",
"quatre": "/KATS",
"huit": "/AUT",
"douze": "/TKOUZ",
"quatorze": "/KORZ",
"seize": "SAEUZ",
"vingt": "VR-"
}

See you next week!

Vermoot commented 2 years ago

I've updated my comment above with what you've found. I told you on Discord, but I'll say it here again for the record: lots of those words were in tao_la_salle.json already, so I haven't re-compiled them. Some were just a problem with priority (see #4, fixed in 4d071000a183e382d630a187f4bb6a8ced0a3370).
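Sorting reports like these into "already there" and "still missing" could be scripted. The following is a minimal sketch, assuming the dictionary is a flat outline-to-translation mapping as in the snippets in this thread. `to_plover_entries` and `split_known` are hypothetical helpers, and stripping the leading `/` is an assumption about how outlines are stored.

```python
def to_plover_entries(reported: dict[str, str]) -> dict[str, str]:
    """Flip a word -> outline report into outline -> word, dropping the leading '/'."""
    return {outline.lstrip("/"): word for word, outline in reported.items()}


def split_known(entries: dict[str, str], dictionary: dict[str, str]):
    """Separate entries the dictionary already has from those still to be added."""
    known = {o: w for o, w in entries.items() if dictionary.get(o) == w}
    new = {o: w for o, w in entries.items() if dictionary.get(o) != w}
    return known, new


# A few reports from this thread; the existing dictionary content is made up.
reported = {"va-t-il": "/WALT", "homme": "/OEPL", "faut": "/TPO"}
existing = {"OEPL": "homme"}

known, new = split_known(to_plover_entries(reported), existing)
print(known)
print(new)
```

Only the entries in `new` would then need to be added to the list of missing words.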

Vermoot commented 2 years ago

I’m following the lessons with a critical eye. For example, she tells us that names take an *, and then there are names without one: "Hugues"="UG", and "Liz"="/LIZ". It’s okay if there are exceptions, but she doesn’t tell us. Same for some verbs.

I think there's bound to be plenty of exceptions and inconsistencies in there. I guess we might as well try and keep true to the rules as much as we can, but maybe at some point we'll understand why they did that.
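One way to get an overview of such exceptions would be to list the entries that break the rule. The sketch below assumes the rule is simply "a translation starting with a capital letter should have `*` somewhere in its outline". Both that reading of the rule and the helper name `names_without_asterisk` are assumptions.

```python
def names_without_asterisk(dictionary: dict[str, str]) -> dict[str, str]:
    """Return entries whose translation is capitalised but whose outline has no '*'."""
    return {
        outline: word
        for outline, word in dictionary.items()
        if word[:1].isupper() and "*" not in outline
    }


# Entries quoted in this thread, stored here as outline -> translation.
sample = {"UG": "Hugues", "LIZ": "Liz", "PRO*U": "Proulx", "KOUT": "coûte"}

print(names_without_asterisk(sample))
```

A list like this would at least make the inconsistencies visible, whether or not the reason behind them is ever found.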

Vermoot commented 2 years ago

And then there’s the problem with numbers: the ones I’ve tried so far come out as figures, not spelled out in letters…

{
"deux": "TKAO",
"trois": "/TROEUZ",
"quatre": "/KATS",
"huit": "/AUT",
"douze": "/TKOUZ",
"quatorze": "/KORZ",
"seize": "SAEUZ",
"vingt": "VR-"
}

See #6