Open ftyers opened 4 years ago
In addition to #78, it would be great to have a tool, let's call it `lt-segment`, that would calculate a segment vocabulary from a `.dix` file. E.g.
...
    <pardef n="cat__n">
      <e><p><l></l><r><s n="n"/><s n="sg"/></r></p></e>
      <e><p><l>s</l><r><s n="n"/><s n="pl"/></r></p></e>
    </pardef>
    <pardef n="m/ouse__n">
      <e><p><l>ouse</l><r>ouse<s n="n"/><s n="sg"/></r></p></e>
      <e><p><l>ice</l><r>ouse<s n="n"/><s n="pl"/></r></p></e>
    </pardef>
    <pardef n="happ/y__adj">
      <e><p><l>y</l><r>y<s n="adj"/></r></p></e>
      <e><p><l>ier</l><r>y<s n="adj"/><s n="comp"/></r></p></e>
      <e><p><l>iest</l><r>y<s n="adj"/><s n="comp"/></r></p></e>
    </pardef>

    <e><i>cat</i><par n="cat__n"/></e>
    <e><i>bat</i><par n="cat__n"/></e>
    <e><i>happ</i><par n="happ/y__adj"/></e>
    <e><i>eas</i><par n="happ/y__adj"/></e>
    <e><i>m</i><par n="m/ouse__n"/></e>
    <e><i>l</i><par n="m/ouse__n"/></e>
This would produce something like:
cat bat happ eas m l @s @ouse @ice @y @ier @iest
It could also be good to have the frequency of each segment.
Hmm, with the addition of morpheme boundaries (#89), this should probably just calculate the segments with <m/>, or it could have a "heuristic" mode too that adds them based on paradigm breaks.
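To make the request concrete, here is a rough Python sketch of the paradigm-break heuristic. It is not an existing lttoolbox tool: the function name `segment_vocabulary` is hypothetical, the `@` prefix is copied from the example output above, and "frequency" is assumed to mean the number of lexical entries that use a segment. It ignores `<m/>`, nested paradigms, and entries written with `<p>` pairs outside a `<pardef>`.

```python
import xml.etree.ElementTree as ET
from collections import Counter

# The example from this issue, wrapped in a root element so it parses.
SAMPLE = """<dictionary>
<pardef n="cat__n">
  <e><p><l></l><r><s n="n"/><s n="sg"/></r></p></e>
  <e><p><l>s</l><r><s n="n"/><s n="pl"/></r></p></e>
</pardef>
<pardef n="m/ouse__n">
  <e><p><l>ouse</l><r>ouse<s n="n"/><s n="sg"/></r></p></e>
  <e><p><l>ice</l><r>ouse<s n="n"/><s n="pl"/></r></p></e>
</pardef>
<pardef n="happ/y__adj">
  <e><p><l>y</l><r>y<s n="adj"/></r></p></e>
  <e><p><l>ier</l><r>y<s n="adj"/><s n="comp"/></r></p></e>
  <e><p><l>iest</l><r>y<s n="adj"/><s n="comp"/></r></p></e>
</pardef>
<e><i>cat</i><par n="cat__n"/></e>
<e><i>bat</i><par n="cat__n"/></e>
<e><i>happ</i><par n="happ/y__adj"/></e>
<e><i>eas</i><par n="happ/y__adj"/></e>
<e><i>m</i><par n="m/ouse__n"/></e>
<e><i>l</i><par n="m/ouse__n"/></e>
</dictionary>"""


def segment_vocabulary(dix_xml):
    """Return (stems, suffixes) as Counters for a .dix document string."""
    root = ET.fromstring(dix_xml)
    pardefs = list(root.iter("pardef"))
    # <e> elements inside a <pardef> are paradigm rows, not lexical entries.
    inside_pardef = {e for p in pardefs for e in p.iter("e")}

    stems, par_uses = Counter(), Counter()
    for e in root.iter("e"):
        if e in inside_pardef:
            continue
        stem, par = e.find("i"), e.find("par")
        if stem is None or par is None:
            continue  # only the simple <i>stem</i><par/> shape is handled
        text = "".join(stem.itertext())
        if text:
            stems[text] += 1
        par_uses[par.get("n")] += 1

    # Each non-empty <l> in a used paradigm is a suffix segment; it is
    # counted once per lexical entry that refers to that paradigm.
    suffixes = Counter()
    for p in pardefs:
        uses = par_uses[p.get("n")]
        for left in p.iter("l"):
            text = "".join(left.itertext())
            if text and uses:
                suffixes["@" + text] += uses
    return stems, suffixes


stems, suffixes = segment_vocabulary(SAMPLE)
print(" ".join([*stems, *suffixes]))
# cat bat happ eas m l @s @ouse @ice @y @ier @iest
```

Stems and suffixes are kept in separate counters so that the suffix list follows paradigm order, as in the example output. On the sample, every stem has frequency 1 and every suffix has frequency 2.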