unDocUMeantIt / koRpus

An R Package for Text Analysis
GNU General Public License v3.0

TT.tokenizer not found #37

SimonWulp closed this issue 2 years ago

SimonWulp commented 2 years ago

When running the following code:

set.kRp.env(TT.cmd="manual", TT.options=list(path="c://treetagger", preset="nl"), lang="nl")
res <- treetag(
  file=words,
  treetagger="kRp.env",
  format="obj",
  debug = TRUE
)

I get the following error:

Error in preset.definition[["preset"]](TT.cmd = TT.cmd, TT.bin = TT.bin, : 
object 'TT.tokenizer' not found

The words variable is a vector containing Dutch words that are to be lemmatized. I get the same result on both the stable and the development versions. The above error occurs on Windows 10. Perhaps it has something to do with Windows, since the same code does work on Linux. When I run the same code with words set to a vector of English words, and preset and lang set to "en", no error is given.

I can't seem to find the cause myself, so hopefully you can help. Thanks in advance!

unDocUMeantIt commented 2 years ago

Thanks for reporting this!

I believe this was a bug in the Dutch language package (koRpus.lang.nl). Could you please try

devtools::install_github("unDocUMeantIt/koRpus.lang.nl", ref="develop")

in a fresh R session, to check whether I successfully fixed it?

SimonWulp commented 2 years ago

That did the trick, thanks a lot!

unDocUMeantIt commented 2 years ago

Thanks for the feedback. The fixed package is now officially released as koRpus.lang.nl 0.1-6.
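
For anyone landing here with the same error, a quick way to see whether the installed language package already includes the fix might look like the sketch below. Note that has_min_version is a hypothetical helper written for this example, not a koRpus function; it relies only on utils::packageVersion and package_version from base R, and it treats a package that is not installed as not meeting the minimum.

```r
# Hypothetical helper (not part of koRpus): returns TRUE if `pkg` is installed
# in at least version `minimum`, FALSE if it is older or not installed at all.
has_min_version <- function(pkg, minimum) {
  installed <- tryCatch(
    utils::packageVersion(pkg),
    error = function(e) NULL  # packageVersion() errors for missing packages
  )
  !is.null(installed) && installed >= package_version(minimum)
}

# 0.1-6 is the release named in this thread as containing the fix.
if (!has_min_version("koRpus.lang.nl", "0.1-6")) {
  message("koRpus.lang.nl is missing or older than 0.1-6; consider updating it.")
}
```

If the message appears, updating koRpus.lang.nl and restarting R before calling treetag() again is probably the first thing to try.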