statsmaths / cleanNLP

R package providing annotators and a normalized data model for natural language processing
GNU Lesser General Public License v2.1

Error with cnlp_annotate #73

Closed noahtolsen closed 4 years ago

noahtolsen commented 4 years ago

I'm trying to use cleanNLP with the spacy backend and the en_core_web_sm model, but even a small test example like this one fails:

library(cleanNLP)
reticulate::use_condaenv("miniconda3", required = T)
cleanNLP::cnlp_init_spacy("en_core_web_sm")
data <- cleanNLP::un
annotated <- cleanNLP::cnlp_annotate(data, text_name = 'text', doc_name = 'doc_id')

I'm getting the error message:

Error in py_call_impl(callable, dots$args, dots$keywords) :
  TypeError: '>=' not supported between instances of 'int' and 'str'

Am I doing something wrong or is this a bug?

statsmaths commented 4 years ago

No, nothing wrong on your end. It appears to be a bug in the newest version of the package, but it would only appear if you manually specify the model name for spacy. I just pushed a fix and you should be able to solve the problem by installing from GitHub:

remotes::install_github("statsmaths/cleanNLP")

Please let me know whether that solves your problem.
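For anyone checking the fix after reinstalling, here is a minimal sketch that reruns the example from the report and returns the error message instead of stopping. `try_annotate` is a hypothetical helper written for this thread, not part of cleanNLP, and the sketch assumes the Python environment is already selected as in the original report (for example with `reticulate::use_condaenv`).

```r
# Hypothetical helper (not part of cleanNLP): rerun the reported example and
# return either the annotation object or the error message as a string.
try_annotate <- function(model_name = "en_core_web_sm") {
  tryCatch(
    {
      # Same calls as in the original report, with the model named explicitly.
      cleanNLP::cnlp_init_spacy(model_name)
      cleanNLP::cnlp_annotate(cleanNLP::un, text_name = "text", doc_name = "doc_id")
    },
    # Any failure (missing package, missing model, or the original TypeError)
    # is returned as its message rather than aborting the session.
    error = function(e) conditionMessage(e)
  )
}

# Run this in a fresh R session after installing from GitHub.
result <- try_annotate()

if (is.character(result)) {
  message("Annotation still fails: ", result)
} else {
  # On success, list the components of the returned annotation object.
  print(names(result))
}
```

If the message still mentions `'>=' not supported between instances of 'int' and 'str'`, the old package version is probably still the one being loaded, so restarting R after the reinstall is worth trying first.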

noahtolsen commented 4 years ago

Works great, really appreciate the quick fix!