Closed (tobleu closed this issue 3 years ago)
the problem lies in the char argument in combination with a very short text. if you set char=FALSE you should be fine.
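a minimal sketch of that workaround, assuming `tagged.text` is the tagged object you already pass to lex.div. the wrapper name `lex_div_no_char` is made up for this example and is not part of koRpus; the remaining arguments are the ones mentioned in the original report:

```r
# hypothetical helper, not part of koRpus: runs lex.div() with the
# characteristics calculation switched off
lex_div_no_char <- function(tagged.text, measure = "TTR") {
  koRpus::lex.div(
    tagged.text,
    measure = measure,   # only the measures you actually need
    char = FALSE,        # skip characteristics, as suggested above
    keep.tokens = TRUE,  # options taken from the original report
    type.index = TRUE
  )
}
```

calling `lex_div_no_char(tagged.text)` for each text should then avoid the characteristics step altogether; texts this short will probably still trigger the general warning about short texts.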
the text is way too short to properly calculate the data for characteristic plots. the first general warning about texts shorter than 100 tokens is there for a reason.
i think it's perhaps time for me to drop this calculation from the defaults, as most users shouldn't need it anyway.
fixed with unDocUMeantIt/koRpus@3e6f04b599617b459f26b9bf8cda7dc5650c621b
I want to calculate lexical diversity with koRpus' lex.div function for different texts. I am using the options "keep.tokens=TRUE, type.index=TRUE"; the texts are relatively short (10-150 words). From time to time I get error messages of this kind:
The affected file is here: uF04.txt. It was tagged with TreeTagger before feeding the tagging results into lex.div.
While trying to avoid these errors, I reran the same analysis on the failed calculations with different parameters, like these:
which results in a different error (even with the window and segment sizes additionally reduced to 20):
Reducing the set of measures to a minimum (even just "TTR") still gives the same error messages, and still shows progress bars for all the measures that should not be included.
Unfortunately I can't trace the error, so I need your help. Thanks a lot in advance!