statsmaths / cleanNLP

R package providing annotators and a normalized data model for natural language processing
GNU Lesser General Public License v2.1

Fix cnlp_annotate when given tif input. #36

reisner closed this issue 5 years ago

reisner commented 6 years ago

cnlp_annotate works when called with as_strings = TRUE, but fails when given tif input (a data frame with doc_id and text columns). Running the example code from the README gives this error:

library(cleanNLP)
text <- c("It is better to be looked over than overlooked.",
         "Real stupidity beats artificial intelligence every time.",
         "The secret of getting ahead is getting started.")
tif_input <- data.frame(doc_id = c("West", "Pratchett", "Twain"),
                        text = text,
                        stringsAsFactors = FALSE)
cnlp_init_udpipe()
obj <- cnlp_annotate(tif_input)
  ...
Error in if (length(non_meta_cols) < ncol(input)) { :
  argument is of length zero

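This is because input <- input[[text_var]] runs before input is inspected for metadata columns. By that point input is a character vector rather than a data frame, so ncol(input) returns NULL, the comparison length(non_meta_cols) < ncol(input) evaluates to logical(0), and if() fails with "argument is of length zero". Moving the assignment below the metadata processing fixes the issue.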
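
To illustrate the ordering problem, here is a minimal sketch in base R. It is not the cleanNLP source: the lang metadata column and the way non_meta_cols is computed are assumptions made for the example. Only text_var, non_meta_cols, and the failing condition come from the error above.

```r
# Hypothetical stand-in for tif input with one extra metadata column ("lang").
input <- data.frame(doc_id = c("West", "Twain"),
                    text = c("It is better to be looked over than overlooked.",
                             "The secret of getting ahead is getting started."),
                    lang = c("en", "en"),
                    stringsAsFactors = FALSE)
text_var <- "text"

# Assumed definition: positions of the columns that are not metadata.
non_meta_cols <- which(names(input) %in% c("doc_id", text_var))

# Buggy order: the text is extracted first, leaving a character vector.
extracted <- input[[text_var]]
# ncol() of a plain vector is NULL, so the comparison has length zero.
cond <- length(non_meta_cols) < ncol(extracted)
# Capture the error message that if() raises on a zero-length condition.
buggy <- tryCatch(if (cond) "has metadata" else "no metadata",
                  error = function(e) conditionMessage(e))

# Fixed order: inspect the data frame for metadata first, then extract the text.
has_meta <- length(non_meta_cols) < ncol(input)
meta <- input[, -non_meta_cols, drop = FALSE]
text <- input[[text_var]]
```

In the buggy order, buggy holds the error message instead of a result. In the fixed order, has_meta is a single logical value and meta keeps the extra column, because the column check runs while input is still a data frame.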