Closed by jarodtang 4 years ago
The specifications for the data formats can be found in the online documentation: https://morfessor.readthedocs.io/en/latest/filetypes.html
It seems like you have written an annotation file (although I don't see the point of the repeated identical segmentation alternatives). Annotation files (specified with --annotations) are additional data used for semi-supervised training. The main training data file is a corpus or a word count list specified with --traindata.
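To make the difference between the two file types concrete, here is a minimal sketch in plain Python that parses one of each. It does not use the Morfessor library, and the line layouts are assumptions based on my reading of the file type documentation linked above: a word count list has one "count word" pair per line, and an annotation file has a word followed by one or more alternative segmentations separated by commas. The sample words, counts, and the function names parse_word_counts and parse_annotations are made up for illustration.

```python
# Hypothetical sample contents; words and counts are invented for illustration.
# Assumed word count list layout: "<count> <word>" per line (for --traindata).
WORD_COUNT_LIST = """\
10 kahvikakku
5 kahvikilon
"""

# Assumed annotation layout: "<word> <morphs>, <morphs>, ..." per line
# (for --annotations); each comma-separated part is one alternative segmentation.
ANNOTATIONS = """\
kahvikakku kahvi kakku, kahvi kak ku
kahvikilon kahvi kilon
"""


def parse_word_counts(text):
    """Parse '<count> <word>' lines into a {word: count} dict."""
    counts = {}
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue  # skip blank lines
        count, word = line.split(None, 1)
        counts[word] = int(count)
    return counts


def parse_annotations(text):
    """Parse annotation lines into {word: [[morph, ...], ...]}."""
    annotations = {}
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue  # skip blank lines
        word, analyses = line.split(None, 1)
        # Each alternative is a whitespace-separated list of morphs.
        annotations[word] = [alt.split() for alt in analyses.split(",")]
    return annotations
```

The point of the sketch is only that the two files carry different information: the word count list says how often each word occurs, while the annotation file says how some words should be split. A file in the second layout passed as the main training data would not be read the way you intended.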
Hi there,
I tried to craft some simple training data like
for a test list of
and got the result
whereas the expected result is
My question is
-R Jarod