GoogleCodeExporter closed this issue 8 years ago
The same problem occurs with "rechn-" and "ebn-".
Original comment by wuerz...@gmail.com
on 30 Aug 2011 at 11:35
I disagree with this. Dictionary forms should be displayed in any case, and
Ordnung is a dictionary form. Ordn is not a dictionary form anyway, so the
expected output makes no sense to me.
Original comment by glukri...@gmx.de
on 30 Aug 2011 at 12:31
A question for @wuerz:
What is your goal with morphisto? What are you using it for?
Original comment by eleonor...@gmx.net
on 30 Aug 2011 at 12:34
Argh, of course you're right. The expected output should look like this:
Expected:
> Ordnung
o:Ordne:<>n:<><V>:<>ung<SUFF>:<><+NN>:<><Fem>:<><Nom>:<><Sg>:<>
o:Ordne:<>n:<><V>:<>ung<SUFF>:<><+NN>:<><Fem>:<><Gen>:<><Sg>:<>
o:Ordne:<>n:<><V>:<>ung<SUFF>:<><+NN>:<><Fem>:<><Dat>:<><Sg>:<>
o:Ordne:<>n:<><V>:<>ung<SUFF>:<><+NN>:<><Fem>:<><Akk>:<><Sg>:<>
or without "-b": ordnen<V>ung<SUFF><+NN>.
Nonetheless, "Ordnung" is a transparent derivation of "ordnen" and therefore
potentially superfluous in the lexicon. It will be kept in the trunk's
dictionary, though. My purpose with morphisto is to reduce its lexicon to the
morphologically simple entries by removing all derivations and compounds, so
that it can perform real morphological segmentation. Of course I am only doing
this in the branch "kmw", and in a transparent, reconstructable way. All of the
fixes you are committing will be included in the main development branch.
On the other hand, I really doubt that the lexicon of any morphological analysis
tool should be blown up with words like "Abschiebegewahrsam". I can think of no
application for morphisto where one would need this entry, since morphisto does
not come with any word semantics.
Original comment by wuerz...@gmail.com
on 30 Aug 2011 at 12:46
Thanks for the explanation. If such changes only go into a special branch, I
have no problem with that.
For any translation project, Ordnung, like all other dictionary words, is a
must-have.
I agree with you: Abschiebegewahrsam is a true compound that can be in
morphisto as the sum of two words, as it is now.
Exceptions are words like Hammelsprung, which may mean something very different
from the sum of the words it is built from.
Original comment by eleonor...@gmx.net
on 30 Aug 2011 at 2:04
@CWRSimon Could you please have a look at this issue? I browsed through
phon.fst but did not find an appropriate rule.
Original comment by wuerz...@gmail.com
on 31 Aug 2011 at 9:03
The problem does not exist for "atm-". It seems to be specific to verbal stems
ending in "-n".
Original comment by wuerz...@gmail.com
on 31 Aug 2011 at 11:23
The evil rule lives in defaults.fst:
$R$ = ([bdgptkfs] | ch) n <=> <en> (<V>)
It is thus possible to derive "Ordenung":
> Ordenung
ordnen<V>ung<SUFF><+NN><Fem><Nom><Sg>
ordnen<V>ung<SUFF><+NN><Fem><Gen><Sg>
ordnen<V>ung<SUFF><+NN><Fem><Dat><Sg>
ordnen<V>ung<SUFF><+NN><Fem><Akk><Sg>
Any ideas?
Original comment by wuerz...@gmail.com
on 31 Aug 2011 at 6:49
> ordentlich
ordentlich<+ADJ><Pos><Adv>
ordentlich<+ADJ><Pos><Pred>
> ordenlich
ordnen<V>lich<SUFF><+ADJ><Pos><Adv>
ordnen<V>lich<SUFF><+ADJ><Pos><Pred>
Original comment by wuerz...@gmail.com
on 8 Sep 2011 at 1:30
> ordenlich
ord<>:ene:<>n:<><V>:<>lich<SUFF>:<><+ADJ>:<><Pos>:<><Adv>:<>
ord<>:ene:<>n:<><V>:<>lich<SUFF>:<><+ADJ>:<><Pos>:<><Pred>:<>
> Kettung
k:Kette:<>n:<><V>:<>ung<SUFF>:<><+NN>:<><Fem>:<><Nom>:<><Sg>:<>
k:Kette:<>n:<><V>:<>ung<SUFF>:<><+NN>:<><Fem>:<><Gen>:<><Sg>:<>
k:Kette:<>n:<><V>:<>ung<SUFF>:<><+NN>:<><Fem>:<><Dat>:<><Sg>:<>
k:Kette:<>n:<><V>:<>ung<SUFF>:<><+NN>:<><Fem>:<><Akk>:<><Sg>:<>
Original comment by wuerz...@gmail.com
on 8 Sep 2011 at 1:32
Added the following rule to phon.fst:
$R11b$ = ([bdgptkfs] | ch) <e> <=> <> ([n] (<CB>|$Bound$) [aeiou])
This rule deletes the previously inserted "<e>" (which is to be understood as a
request for e-insertion in phon.fst) under certain circumstances (i.e. if the
following morph starts with a vowel). Now we are able to distinguish
"orden-*b*ar" from "Ordn-*ung*", and the non-existent word "Ordenung" can no
longer be derived.
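For readers who do not work with SFST, the effect of the rule can be approximated with a small Python sketch. This is only a string-level illustration, not the transducer itself: the function names, the "+" boundary symbol (standing in for <CB> / $Bound$) and the use of regular expressions are assumptions made for this example and do not appear in phon.fst or defaults.fst.

```python
import re

E_REQUEST = "<e>"   # marker for a requested e-insertion, as in the thread
BOUNDARY = "+"      # hypothetical stand-in for <CB> / $Bound$


def request_e(stem: str) -> str:
    """Place an e-request between a stem-final obstruent and "n".

    Only stems ending in ([bdgptkfs] | ch) + "n" are affected,
    e.g. "ordn", "rechn", "ebn"; "atm" is left untouched.
    """
    return re.sub(r"((?:[bdgptkfs]|ch))n$", r"\1" + E_REQUEST + "n", stem)


def resolve_e(form: str) -> str:
    """Delete or realise the e-request, depending on the next morph."""
    # Delete "<e>" if "n", a boundary and a vowel follow.
    form = re.sub(re.escape(E_REQUEST) + r"(?=n\+[aeiou])", "", form)
    # Any remaining request surfaces as a plain "e".
    return form.replace(E_REQUEST, "e").replace(BOUNDARY, "")


def combine(stem: str, suffix: str) -> str:
    """Join a verbal stem and a suffix under the sketched rule."""
    return resolve_e(request_e(stem) + BOUNDARY + suffix)


if __name__ == "__main__":
    for stem, suffix in [("ordn", "ung"), ("ordn", "bar"),
                         ("rechn", "ung"), ("atm", "ung")]:
        print(stem, suffix, combine(stem, suffix))
```

In this sketch, combine("ordn", "ung") yields "ordnung" and combine("ordn", "bar") yields "ordenbar", which mirrors the distinction described above. The output is lowercase because capitalisation is not modelled here.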
Original comment by wuerz...@gmail.com
on 9 Sep 2011 at 11:02
Issue 17 has been merged into this issue.
Original comment by wuerz...@gmail.com
on 9 Sep 2011 at 2:22
Original issue reported on code.google.com by
wuerz...@gmail.com
on 30 Aug 2011 at 11:13