bnosac / udpipe

R package for Tokenization, Parts of Speech Tagging, Lemmatization and Dependency Parsing Based on the UDPipe Natural Language Processing Toolkit
https://bnosac.github.io/udpipe/en
Mozilla Public License 2.0

Error in udpipe_annotate: "external pointer is not valid" #33

Closed eastsea17 closed 5 years ago

eastsea17 commented 5 years ago

Problem description: I am trying to annotate text using the R example code from the vignette below (https://cran.r-project.org/web/packages/udpipe/vignettes/udpipe-annotation.html#udpipe_the_c++_library). A week ago the code ran fine, without any errors.

But suddenly an error appears in the R console. I don't know why the code stops working. I have reinstalled R and RStudio several times, deleting hidden files etc. in many directories.

Sometimes the RStudio console even stops displaying any output after executing udpipe_annotate. No error output is produced in the R console.

Other R code runs fine. Only the udpipe code doesn't work. How can I solve the problem? (A week ago I used R 3.3.2; I'm now using R 3.5.1.)

First error: error output appears in the R console

library(udpipe)
dl <- udpipe_download_model(language = "dutch")

Downloading udpipe model from https://raw.githubusercontent.com/jwijffels/udpipe.models.ud.2.0/master/inst/udpipe-ud-2.0-170801/dutch-ud-2.0-170801.udpipe to C:/Analysis/R_Analysis/2018년/20181101_udpipe/dutch-ud-2.0-170801.udpipe
trying URL 'https://raw.githubusercontent.com/jwijffels/udpipe.models.ud.2.0/master/inst/udpipe-ud-2.0-170801/dutch-ud-2.0-170801.udpipe'
Content type 'application/octet-stream' length 19992491 bytes (19.1 MB)
downloaded 19.1 MB

str(dl)

'data.frame': 1 obs. of 3 variables:
 $ language  : chr "dutch"
 $ file_model: chr "./20181101_udpipe/dutch-ud-2.0-170801.udpipe"
 $ url       : chr "https://raw.githubusercontent.com/jwijffels/udpipe.models.ud.2.0/master/inst/udpipe-ud-2.0-170801/dutch-ud-2.0-170801.udpipe"

# Either give a file in the current working directory
udmodel_dutch <- udpipe_load_model(file = "dutch-ud-2.0-170801.udpipe")
# Or give the full path to the file
udmodel_dutch <- udpipe_load_model(file = dl$file_model)
dl$file_model
[1] "./20181101_udpipe/dutch-ud-2.0-170801.udpipe"

txt <- c("Ik ben de weg kwijt, kunt u me zeggen waar de Lange Wapper ligt? Jazeker meneer",
         "Het gaat vooruit, het gaat verbazend goed vooruit")
x <- udpipe_annotate(udmodel_dutch, x = txt)
Error in udp_tokenise_tag_parse(object$model, x, doc_id, tokenizer, tagger, : external pointer is not valid

  1. stop(structure(list(message = "external pointer is not valid", call = udp_tokenise_tag_parse(object$model, x, doc_id, tokenizer, tagger, parser, log_every, log_now), cppstack = structure(list( file = "", line = -1L, stack = "C++ stack not available on this system"), class = "Rcpp_stack_trace")), class = c("Rcpp::exception", ...
  2. udp_tokenise_tag_parse(object$model, x, doc_id, tokenizer, tagger, parser, log_every, log_now)
  3. udpipe_annotate(udmodel_dutch, x = txt)

Second error: the RStudio console stops displaying any output after executing udpipe_annotate

ud_model <- udpipe_load_model(file = "english-ud-2.0-170801.udpipe")
x <- udpipe_annotate(ud_model, x = comments$comments)
dd
dd
dd

The expected (normal) output is shown below.

ddd
Error: object 'ddd' not found
dd
Error: object 'dd' not found
dd
Error: object 'dd' not found

jwijffels commented 5 years ago

Looks to me like you are not giving the full path to the model when you call udpipe_load_model, or your model is somewhere other than your current working directory. So provide the full path to that model when you call udpipe_load_model.
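A minimal sketch of that check, using only base R. The helper name resolve_model_path is hypothetical and not part of udpipe; it simply stops with a readable message if the file is missing, rather than leaving the problem to surface later in udpipe_annotate. The commented usage lines assume the udpipe package is installed and the Dutch model has been downloaded.

```r
# Hypothetical helper (not part of udpipe): resolve a model path and
# fail early with a clear message if the file does not exist.
resolve_model_path <- function(file) {
  if (!file.exists(file)) {
    stop("Model file not found: ", file,
         " (working directory: ", getwd(), ")")
  }
  # Turn a relative path into an absolute one
  normalizePath(file, winslash = "/", mustWork = TRUE)
}

# Usage, assuming udpipe is installed and the model file is present:
# model_path <- resolve_model_path("dutch-ud-2.0-170801.udpipe")
# udmodel_dutch <- udpipe::udpipe_load_model(file = model_path)
```

Passing the resolved absolute path means the load no longer depends on what getwd() happens to be at that moment.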

eastsea17 commented 5 years ago

I have already executed udpipe_load_model with the correct full path, as in the example below.

# Either give a file in the current working directory
udmodel_dutch <- udpipe_load_model(file = "dutch-ud-2.0-170801.udpipe")
# Or give the full path to the file
udmodel_dutch <- udpipe_load_model(file = dl$file_model)
udmodel_dutch <- udpipe_load_model(file = "C:/...")

I still get the error "Error in udp_tokenise_tag_parse(object$model, x, doc_id, tokenizer, tagger, : external pointer is not valid". More traceback messages are below.

  1. stop(structure(list(message = "external pointer is not valid", call = udp_tokenise_tag_parse(object$model, x, doc_id, tokenizer, tagger, parser, log_every, log_now), cppstack = structure(list( file = "", line = -1L, stack = "C++ stack not available on this system"), class = "Rcpp_stack_trace")), class = c("Rcpp::exception", ...
  2. udp_tokenise_tag_parse(object$model, x, doc_id, tokenizer, tagger, parser, log_every, log_now)
  3. udpipe_annotate(ud_model, x = comments$words)

When I click "Rerun with Debug", a code window opens as shown below.

function (udmodel, x, docid, annotation_tokenizer, annotation_tagger, annotation_parser, log_every, current_time)
{
    .Call("_udpipe_udp_tokenise_tag_parse", PACKAGE = "udpipe", udmodel, x, docid, annotation_tokenizer, annotation_tagger, annotation_parser, log_every, current_time)
}

I tried various pieces of code, but I couldn't get proper output like the example on the official udpipe website.

jwijffels commented 5 years ago

Can you copy-paste what this gives you?

getwd()
list.files(full.names = TRUE)
udmodel_dutch <- udpipe_load_model(file = "dutch-ud-2.0-170801.udpipe")
udpipe::udpipe_annotate(udmodel_dutch, "Het gaat vooruit, het gaat verbazend goed vooruit")

eastsea17 commented 5 years ago

It works well. I think my RStudio has some problems. Sometimes the RStudio console freezes. Thank you so much.

SubhasreeUC commented 5 years ago

I am using the English model (english-ewt-ud-2.3-181115.udpipe) to do the same thing in RStudio and I can't make it work! I have tried all the options mentioned above but I keep getting the same error: "Error in udp_tokenise_tag_parse(object$model, x, doc_id, tokenizer, tagger, : external pointer is not valid"

jwijffels commented 5 years ago

@SubhasreeUC What's your code?

SubhasreeUC commented 5 years ago

library(udpipe)
model <- udpipe_download_model(language = "english")
udmodel_english <- udpipe_load_model(file = '/Users/chattes4/Documents/english-ewt-ud-2.3-181115.udpipe')
txt <- as.character(data$feedback)
s <- udpipe::udpipe_annotate(udmodel_english, txt)

jwijffels commented 5 years ago

How big is your english-ewt-ud-2.3-181115.udpipe file in megabytes? It should be 16.7 MB.

library(udpipe)
model <- udpipe_download_model(language = "english")
udmodel_english <- udpipe_load_model(file = model$file_model)
txt <- as.character(data$feedback)
s <- udpipe::udpipe_annotate(udmodel_english, txt)

SubhasreeUC commented 5 years ago

It is 15 MB.

jwijffels commented 5 years ago

It should be 16.7 MB. Run udpipe_download_model(language = "english") again and make sure you have an internet connection. It downloads the model from the internet.

SubhasreeUC commented 5 years ago

While downloading, R shows that it has downloaded 16.7 MB. But when I check in the application it shows 15 MB. Is that the problem?

jwijffels commented 5 years ago

What do you think? The model is downloaded from here https://github.com/jwijffels/udpipe.models.ud.2.3/tree/master/inst/udpipe-ud-2.3-181115. If your internet connection is crappy, just download it manually.

SubhasreeUC commented 5 years ago

Thanks! That helped! Now the application shows a file size of 17.5 MB, but the functions work.
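A note on the sizes, as a sketch rather than a confirmed diagnosis: R's download log counts megabytes in units of 1024^2 bytes (the Dutch log earlier in the thread reports 19992491 bytes as 19.1 MB), while many file managers count 1000^2 bytes per MB. Assuming the application here uses decimal megabytes, 16.7 MB in R's units is about 17.5 MB on disk, so the two figures may describe the same complete file. The helper size_report below is hypothetical and uses only base R.

```r
# Hypothetical helper: express a byte count in both unit conventions.
size_report <- function(bytes) {
  c(MiB = bytes / 1024^2,  # binary megabytes, as in R's download log
    MB  = bytes / 1000^2)  # decimal megabytes, as in many file managers
}

# Byte count taken from the Dutch model's download log earlier in the thread
round(size_report(19992491), 1)

# 16.7 MB in R's units, converted to decimal megabytes
round(size_report(16.7 * 1024^2), 1)

# For a model file on disk (the path is a placeholder):
# round(size_report(file.size("english-ewt-ud-2.3-181115.udpipe")), 1)
```

Comparing the MiB figure against the size R reported during download is a quick way to spot a truncated model file before calling udpipe_load_model.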