koRpus::treetag() truncates the last character of its input object when TT.tknz is FALSE (see "dog" truncated to "do" in the last row of the output table). The issue does not occur when TT.tknz is set to TRUE.
example
doc <- "The quick brown fox jumped over the lazy dog"
# output before the bug fix in R/treetag.R
koRpus::treetag(doc, treetagger = "manual", format = "obj",
encoding = "UTF-8", lang = "en", TT.tknz = FALSE,
TT.options = list(path = "/u/application/TreeTagger", preset = "en"))
# token tag lemma lttr wclass desc stop stem
# 1 The DT the 3 determiner Determiner NA NA
# 2 quick JJ quick 5 adjective Adjective NA NA
# 3 brown JJ brown 5 adjective Adjective NA NA
# 4 fox NN fox 3 noun Noun, singular or mass NA NA
# 5 jumped VBD jump 6 verb Verb, past tense of "to be" NA NA
# 6 over IN over 4 preposition Preposition or subordinating conjunction NA NA
# 7 the DT the 3 determiner Determiner NA NA
# 8 lazy JJ lazy 4 adjective Adjective NA NA
# 9 do VBP do 2 verb Verb, non-3rd person singular present of "to be" NA NA
sessionInfo()
# R version 3.3.3 (2017-03-06)
# Platform: x86_64-redhat-linux-gnu (64-bit)
# Running under: Red Hat Enterprise Linux Server 7.3 (Maipo)
packageVersion("koRpus")
# [1] ‘0.10.2’
The issue seems to stem from this line in R/treetag.R: the contents of the input are written out with cat() without a trailing newline. Adjusting the function to cat a newline after the user's input object appears to fix the issue. I will submit a pull request with my fix for review.
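To illustrate the suspected cause, here is a minimal sketch in base R. It is not the actual code from R/treetag.R; the temporary files and the helper `last_char()` are hypothetical names used only for this demonstration. It shows that cat() writes no trailing newline unless one is added explicitly.

```r
# Illustration only: this does not reproduce the code in R/treetag.R.
doc <- "The quick brown fox jumped over the lazy dog"

# Hypothetical temporary files standing in for the file handed to TreeTagger.
tmp_without <- tempfile(fileext = ".txt")
tmp_with <- tempfile(fileext = ".txt")

# cat() writes the text as-is, with no newline at the end.
cat(doc, file = tmp_without)

# Appending "\n" explicitly terminates the last line.
cat(doc, "\n", file = tmp_with, sep = "")

# Helper (hypothetical): return the final character of a file.
last_char <- function(path) {
  txt <- readChar(path, nchars = file.size(path), useBytes = TRUE)
  substr(txt, nchar(txt), nchar(txt))
}

# Check whether each file ends with a newline.
ends_with_newline <- c(
  without = identical(last_char(tmp_without), "\n"),
  with = identical(last_char(tmp_with), "\n")
)
print(ends_with_newline)
```

If the truncation does come from the missing line terminator, the second form is presumably the kind of change the fix needs, though the exact change in the pull request may differ.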