Open GoogleCodeExporter opened 9 years ago
I found a solution:
define Etoee e -> é || _ "^" [ \0 & \k ] ; # \0: ending is not empty; \k: kor & ként excluded
and
define HarmRuleC C -> á // BackVowel \Vowel* _ %^ [ \0 & \k ] .o. # ként excluded
C -> é // FrontVowel \Vowel* _ %^ [ \0 & \k ] .o.
C -> a // BackVowel \Vowel* _ %^ [ 0 ] .o.
C -> e // FrontVowel \Vowel* _ %^ [ 0 ] ;
That works, because both special cases start with 'k'.
Is there no way to say:
I want to exclude case '+For' and case '+Tem' from a rule?
Original comment by eleonor...@gmx.net
on 4 Jan 2012 at 11:29
A brief comment: usually, if a rule is phonologically conditioned, it's a good
idea to capture it with a rewrite rule, like you've done.
On the other hand, if you're dealing with an exception, sometimes it's easier
to mark it so in the lexicon, and have the rules bypass the exception. For
example, in this instance, you could have marked those words where e does not
alternate with é as, say, E in the lexicon. That is, something like regE
instead of rege. Then the rule won't affect that word, and you can place a rule
like `E -> e` after the other rules. Note that generally, it's most convenient
to place the `E` only on the lower side (because you want the original form on
the lexical side), so the entry should read something like:
{{{
rege:regE
}}}
Minor detail, in `[ \0 & \k ]` the `\0`-part is redundant.
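The lexicon-marking idea can be sketched as a small, self-contained foma script. This is an illustration only: the toy `Lexicon` and the rule name `RestoreE` are assumptions made for the example, and `regE` simply mirrors the hypothetical marked entry mentioned above.

```foma
# Toy lexicon standing in for the real lexc file (assumption):
# one regular entry, and one marked entry with E on the lower side only.
define Lexicon [ {rege} | [ {rege} .x. {regE} ] ] "^" t ;

# Lengthening rule as posted in this thread; it never sees the placeholder E.
define Etoee e -> é || _ "^" \k ;

# Applied after all other rules: turn the placeholder back into a plain e.
define RestoreE E -> e ;

regex Lexicon .o. Etoee .o. RestoreE ;
lower-words
```

With this ordering the unmarked entry should undergo e -> é, while the marked entry passes through Etoee untouched and only gets its plain e back at the end.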
Original comment by mans.hul...@gmail.com
on 4 Jan 2012 at 2:18
The word rege is NOT an exception, but completely regular. The two endings
(+Tem and +For) are the exceptions.
#regét- Acc
#regéhez- All
# and so on...
BUT
#regekor- Tem
#regeként - For
#rege - Nom
I cannot find out how to say this in proper regular expression form:
no ending, or the ending "ként" or "kor", does not need e->é; all
other endings do need it.
This is ok:
define Etoee e -> é || _ "^" [ \k ] ; # \k: kor & ként excluded
but all attempts to expand \k into "not ként" and "not kor", such as:
define Etoee e -> é || _ "^" [ \k \é \n \t | \k \o \r] ;
define Etoee e -> é || _ "^" \[ k é n t | k o r ] ;
fail, because
#apply down> rege+Noun+Acc
#reget
comes out wrong (it should be regét).
How can I say: if there is no ending, or the ending is ként or kor, do not
apply the e->é rule; otherwise apply it?
I am worried that \k is a bit too imprecise.
Original comment by eleonor...@gmx.net
on 5 Jan 2012 at 9:28
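One way the restriction asked for here might be written: in foma, `\X` is a term complement, i.e. "any single symbol not in X", so `\[ k é n t | k o r ]` cannot mean "any ending other than ként or kor". Whole strings can instead be excluded with subtraction (`-`) and anchored with the word boundary `.#.`. The sketch assumes that the ending is everything between "^" and the end of the word, with no tags after it; `NoLengthening` and `Toy` are names made up for the example.

```foma
# The endings that must NOT trigger e -> é (an empty ending is excluded by ?+).
define NoLengthening [ {ként} | {kor} ] ;

# ?+ is any non-empty ending; subtracting the two exceptional endings
# leaves exactly the endings that should trigger the rule.
define Etoee e -> é || _ "^" [ ?+ - NoLengthening ] .#. ;

# Toy input in place of the real lexicon: bare stem, Acc, For, Tem.
define Toy {rege} "^" [ 0 | G t | {ként} | {kor} ] ;

regex Toy .o. Etoee ;
lower-words
```

Under these assumptions only the form with the `Gt` ending should come out with é; unlike `\k`, the rule would still apply to other endings that merely begin with k.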
I also tried adding the tags +Abl, +Acc ... +Tem to each word and then
triggering the rule on +Abl, etc., without success.
Lexc:
LEXICON Case
+Abl:^tUl+Abl #;
+Acc:^Gt+Acc #;
...
.foma:
define Grammar Lexicon .o.
Etoee ; #.o. Here I stop
Etoee looks like this:
define Etoee e -> é || .#. \"^"+ _ "^" ?* [ "+" A b l | "+" A c c |
"+" {Ade} | "+" {All} | "+" {Cau} | "+" {Dat} | "+" {Del} | "+" {Ela} |
"+" {Fac} | "+" {For} | "+" {Ill} | "+" {Ine} | "+" {Ins} | "+" {Nom} |
"+" {Sub} | "+" {Sup} | "+" {Ter} ] ?* ;
I tried both the A b l form and the {Ade} form; neither works.
Results:
foma[1]: down
apply down> rege+Noun+Abl
rege^tUl+Abl
apply down> rege+Noun+Acc
rege^Gt+Acc
apply down> rege+Noun+Ade
rege^nDl+Ade
apply down>
No e->é anywhere :-(
foma[1]: lower-words
rege^ig+Ter
rege^Pn+Sup
rege^rF+Sub
rege^+Nom
rege^VFl+Ins
rege^bFn+Ine
rege^bF+Ill
rege^ként+For
rege^VD+Fac
rege^bUl+Ela
rege^rUl+Del
rege^nFk+Dat
rege^ért+Cau
rege^hIz+All
rege^nDl+Ade
rege^Gt+Acc
rege^tUl+Abl
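A possible explanation, offered as an assumption because the lexc file itself is not shown in the thread: if the case tags are declared under Multichar_Symbols, then `+Abl` is one single symbol on the lower side, whereas `"+" A b l` and `"+" {Ade}` each describe a sequence of four separate symbols and therefore never match. In that case the tags would have to be quoted as whole symbols. The sketch also leaves +Nom, +For and +Tem out of the triggering set, since those are the forms that should keep a short e; `LengthenTag`, `Toy` and `CleanupTags` are names made up for the example.

```foma
# Each quoted tag is ONE symbol, matching a lexc Multichar_Symbols declaration.
define LengthenTag [ "+Abl" | "+Acc" | "+Ade" | "+All" | "+Cau" | "+Dat" |
                     "+Del" | "+Ela" | "+Fac" | "+Ill" | "+Ine" | "+Ins" |
                     "+Sub" | "+Sup" | "+Ter" ] ;

# Lengthen stem-final e only when the word ends in a triggering tag.
define Etoee e -> é || _ "^" ?* LengthenTag .#. ;

# Remove the helper tags from the lower side once they have done their job.
define CleanupTags [ LengthenTag | "+Nom" | "+For" | "+Tem" ] -> 0 ;

# Toy input in place of the real lexicon: Acc, For and Nom forms.
define Toy {rege} "^" [ G t "+Acc" | {ként} "+For" | "+Nom" ] ;

regex Toy .o. Etoee .o. CleanupTags ;
lower-words
```

If the tags are not declared as multicharacter symbols, the quoted form would not match either, so the declaration and the rule have to agree on how a tag is spelled.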
The strange thing is that I made the same modification to the English lexc/foma
files earlier:
in lexc:
LEXICON Vinf
+V+PresPart:^ing+PP #;
in foma:
define ConsonantDoubling g -> g g || .#. \"^"+ _ "^" ?* [ "+" {PP} | e d ] ?*;
...
define CleanupPP [ "+" {PP} ] -> 0;
define Grammar Lexicon .o.
ConsonantDoubling .o.
...
CleanupPP .o.
Cleanup;
regex Grammar;
That works perfectly well:
lower-words
beg
begs
begging
begged
begged
I attach both the English and the Hungarian files here.
Original comment by eleonor...@gmx.net
on 5 Jan 2012 at 7:56
Attachments:
I have found a solution that looks quite good. I modified the English file
step by step until it handled the Hungarian nouns as it should. We can close
this issue.
Original comment by eleonor...@gmx.net
on 6 Jan 2012 at 8:41
Attachments:
Original issue reported on code.google.com by
eleonor...@gmx.net
on 3 Jan 2012 at 5:57
Attachments: