espenjutte closed 7 years ago
Did you download the model? It looks like the file "dutch-ud-2.0-170801.udpipe" is not on your computer. Is it in your current working directory? What does list.files(getwd()) show you?
FYI. That code works perfectly on my machine and all CRAN machines:
> library(udpipe)
Warning message:
package ‘udpipe’ was built under R version 3.4.2
> dl <- udpipe_download_model(language = "dutch")
trying URL 'https://github.com/jwijffels/udpipe.models.ud.2.0/raw/master/inst/udpipe-ud-2.0-170801/dutch-ud-2.0-170801.udpipe'
Content type 'application/octet-stream' length 19992491 bytes (19.1 MB)
downloaded 19.1 MB
> dl
language
1 dutch
file_model
1 \\\\stud-home.icts.kuleuven.be/k0014536/Desktop/R_Statistical_Machine_Learning/dutch-ud-2.0-170801.udpipe
> udmodel_dutch <- udpipe_load_model(file = "dutch-ud-2.0-170801.udpipe")
> x <- udpipe_annotate(udmodel_dutch,
+ x = "Ik ging op reis en ik nam mee: mijn laptop, mijn zonnebril en goed humeur.")
> x <- as.data.frame(x)
> x
doc_id paragraph_id sentence_id sentence
1 doc1 1 1 Ik ging op reis en ik nam mee: mijn laptop, mijn zonnebril en goed humeur.
2 doc1 1 1 Ik ging op reis en ik nam mee: mijn laptop, mijn zonnebril en goed humeur.
3 doc1 1 1 Ik ging op reis en ik nam mee: mijn laptop, mijn zonnebril en goed humeur.
4 doc1 1 1 Ik ging op reis en ik nam mee: mijn laptop, mijn zonnebril en goed humeur.
5 doc1 1 1 Ik ging op reis en ik nam mee: mijn laptop, mijn zonnebril en goed humeur.
6 doc1 1 1 Ik ging op reis en ik nam mee: mijn laptop, mijn zonnebril en goed humeur.
7 doc1 1 1 Ik ging op reis en ik nam mee: mijn laptop, mijn zonnebril en goed humeur.
8 doc1 1 1 Ik ging op reis en ik nam mee: mijn laptop, mijn zonnebril en goed humeur.
9 doc1 1 1 Ik ging op reis en ik nam mee: mijn laptop, mijn zonnebril en goed humeur.
10 doc1 1 1 Ik ging op reis en ik nam mee: mijn laptop, mijn zonnebril en goed humeur.
11 doc1 1 1 Ik ging op reis en ik nam mee: mijn laptop, mijn zonnebril en goed humeur.
12 doc1 1 1 Ik ging op reis en ik nam mee: mijn laptop, mijn zonnebril en goed humeur.
13 doc1 1 1 Ik ging op reis en ik nam mee: mijn laptop, mijn zonnebril en goed humeur.
14 doc1 1 1 Ik ging op reis en ik nam mee: mijn laptop, mijn zonnebril en goed humeur.
15 doc1 1 1 Ik ging op reis en ik nam mee: mijn laptop, mijn zonnebril en goed humeur.
16 doc1 1 1 Ik ging op reis en ik nam mee: mijn laptop, mijn zonnebril en goed humeur.
17 doc1 1 1 Ik ging op reis en ik nam mee: mijn laptop, mijn zonnebril en goed humeur.
18 doc1 1 1 Ik ging op reis en ik nam mee: mijn laptop, mijn zonnebril en goed humeur.
token_id token lemma upos xpos
1 1 Ik ik PRON Pron|per|1|ev|nom
2 2 ging ga VERB V|intrans|ovt|1of2of3|ev
3 3 op op ADP Prep|voor
4 4 reis reis NOUN N|soort|ev|neut
5 5 en en CCONJ Conj|neven
6 6 ik ik PRON Pron|per|1|ev|nom
7 7 nam neem VERB V|trans|ovt|1of2of3|ev
8 8 mee mee ADV Adv|deelv
9 9 : : PUNCT Punc|dubbpunt
10 10 mijn mijn PRON Pron|bez|1|ev|neut|attr
11 11 laptop laptop NOUN N|soort|ev|neut
12 12 , , PUNCT Punc|komma
13 13 mijn mijn PRON Pron|bez|1|ev|neut|attr
14 14 zonnebril zonnebril NOUN N|soort|ev|neut
15 15 en een CCONJ Conj|neven
16 16 goed goed ADJ Adj|attr|stell|onverv
17 17 humeur humeur NOUN N|soort|ev|neut
18 18 . . PUNCT Punc|punt
feats head_token_id dep_rel deps
1 Case=Nom|Number=Sing|Person=1|PronType=Prs 2 nsubj <NA>
2 Aspect=Imp|Mood=Ind|Number=Sing|Subcat=Intr|Tense=Past|VerbForm=Fin 0 root <NA>
3 AdpType=Prep 4 case <NA>
4 Number=Sing 2 obj <NA>
5 <NA> 7 cc <NA>
6 Case=Nom|Number=Sing|Person=1|PronType=Prs 7 nsubj <NA>
7 Aspect=Imp|Mood=Ind|Number=Sing|Subcat=Tran|Tense=Past|VerbForm=Fin 2 conj <NA>
8 PartType=Vbp 7 compound:prt <NA>
9 PunctType=Colo 2 punct <NA>
10 Number=Sing|Person=1|Poss=Yes|PronType=Prs 11 nmod <NA>
11 Number=Sing 2 nsubj <NA>
12 PunctType=Comm 11 punct <NA>
13 Number=Sing|Person=1|Poss=Yes|PronType=Prs 14 nmod <NA>
14 Number=Sing 11 appos <NA>
15 <NA> 17 cc <NA>
16 Degree=Pos 17 amod <NA>
17 Number=Sing 14 conj <NA>
18 PunctType=Peri 2 punct <NA>
misc
1 <NA>
2 <NA>
3 <NA>
4 <NA>
5 <NA>
6 <NA>
7 <NA>
8 SpaceAfter=No
9 <NA>
10 <NA>
11 SpaceAfter=No
12 <NA>
13 <NA>
14 <NA>
15 <NA>
16 <NA>
17 SpaceAfter=No
18 SpacesAfter=\\n
As far as I can see, the model downloads correctly and is present in the directory.
list.files(getwd()) lists "dutch-ud-2.0-170801.udpipe" as one of the files in the directory.
Listing the file from the OS, however, shows that the .udpipe file is only 4.0 KB in size. That seems rather small for an entire model.
Downloading the model manually does the trick (the code runs with the expected output). So something in my setup is causing the udpipe_download_model command to download the wrong content.
It looks like the model was not actually downloaded. The model is several megabytes in size. Maybe you are behind a proxy/firewall?
Can you show me all the output of what this does on your computer, including possible warnings/errors that you get?
dl <- udpipe_download_model(language = "dutch")
Running the command just gives me a normal download progress:
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
100   190  100   190    0     0    328      0 --:--:-- --:--:-- --:--:--   328
No warnings or errors. Other downloads are working correctly (for example package downloads).
If I look at the file that was downloaded, I get:
<html><body>You are being <a href="https://raw.githubusercontent.com/jwijffels/udpipe.models.ud.2.0/master/inst/udpipe-ud-2.0-170801/dutch-ud-2.0-170801.udpipe">redirected</a>.</body></html>
So I'm guessing redirects are not being followed correctly by curl for some reason.
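For anyone who wants to check whether their downloaded file is really a model or just a redirect page like the one above, here is a rough sketch. The helper name looks_like_html_stub and the 1 MB threshold are my own assumptions (the models are several megabytes, so anything far below that is suspicious); it is not a function from udpipe.

```r
# Hypothetical helper (not part of udpipe): flag a downloaded file that is
# suspiciously small or that starts with HTML instead of binary model data.
looks_like_html_stub <- function(path, min_bytes = 1e6) {
  size <- file.size(path)
  if (is.na(size)) stop("File not found: ", path)

  # Read the first bytes in binary mode so real model files are handled safely
  con <- file(path, open = "rb")
  on.exit(close(con))
  head_ints <- as.integer(readBin(con, what = "raw", n = 64))

  # Keep printable ASCII only, so the text checks never see invalid strings
  ascii <- head_ints[head_ints >= 32 & head_ints < 127]
  head_text <- tolower(rawToChar(as.raw(ascii)))
  starts_with_html <- grepl("^ *<(html|!doctype)", head_text)

  # min_bytes is an assumed threshold, not a documented model size
  size < min_bytes || starts_with_html
}
```

Calling looks_like_html_stub("dutch-ud-2.0-170801.udpipe") in the working directory should return TRUE for a 4 KB redirect page and FALSE for a properly downloaded model.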
Bizarre. Can you show what each of these commands does on your computer? This is basically what udpipe_download_model does if you want the Dutch language model:
utils::download.file("https://raw.githubusercontent.com/jwijffels/udpipe.models.ud.2.0/master/inst/udpipe-ud-2.0-170801/dutch-ud-2.0-170801.udpipe", "dutch-ud-2.0-170801.udpipe", mode = "wb")
utils::download.file("https://raw.githubusercontent.com/jwijffels/udpipe.models.ud.2.0/master/inst/udpipe-ud-2.0-170801/dutch-ud-2.0-170801.udpipe", "dutch-ud-2.0-170801.udpipe", mode = "wb", method = "internal")
utils::download.file("https://raw.githubusercontent.com/jwijffels/udpipe.models.ud.2.0/master/inst/udpipe-ud-2.0-170801/dutch-ud-2.0-170801.udpipe", "dutch-ud-2.0-170801.udpipe", mode = "wb", method = "wininet")
utils::download.file("https://raw.githubusercontent.com/jwijffels/udpipe.models.ud.2.0/master/inst/udpipe-ud-2.0-170801/dutch-ud-2.0-170801.udpipe", "dutch-ud-2.0-170801.udpipe", mode = "wb", method = "libcurl")
utils::download.file("https://raw.githubusercontent.com/jwijffels/udpipe.models.ud.2.0/master/inst/udpipe-ud-2.0-170801/dutch-ud-2.0-170801.udpipe", "dutch-ud-2.0-170801.udpipe", mode = "wb", method = "curl")
Note to myself. I think the fix to this might be to replace:
https://github.com/jwijffels/udpipe.models.ud.2.0/raw/master with https://raw.githubusercontent.com/jwijffels/udpipe.models.ud.2.0/master in the code of udpipe_download_model
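As an illustration of that replacement only (this is a sketch, not the actual change made in the package), the old-style link can be turned into the direct one with a single substitution. The function name to_raw_url is made up for this example.

```r
# Hypothetical helper: rewrite a github.com ".../raw/<branch>/..." link into
# the raw.githubusercontent.com form that serves the file without a redirect.
to_raw_url <- function(url) {
  sub("^https://github\\.com/([^/]+)/([^/]+)/raw/",
      "https://raw.githubusercontent.com/\\1/\\2/",
      url)
}

old_url <- paste0(
  "https://github.com/jwijffels/udpipe.models.ud.2.0/raw/master/",
  "inst/udpipe-ud-2.0-170801/dutch-ud-2.0-170801.udpipe"
)
direct_url <- to_raw_url(old_url)
```

Here direct_url is the same address that the redirect page shown earlier points to, so a downloader that does not follow redirects would still fetch the model itself.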
I've updated the package to download models from https://raw.githubusercontent.com/jwijffels/udpipe.models.ud.2.0/master instead of https://github.com/jwijffels/udpipe.models.ud.2.0/raw/master, which was apparently being redirected to https://raw.githubusercontent.com/jwijffels/udpipe.models.ud.2.0/master
Can you check on your machine whether the latest version now works and downloads the model, which should be several MB in size?
devtools::install_github("bnosac/udpipe", build_vignettes = TRUE)
library(udpipe)
dl <- udpipe_download_model(language = "dutch")
Leaving a note here for others to find: I had the exact same problem, except the error was thrown after I had been using the model for hundreds of thousands of calls. It had clearly downloaded correctly, but simply stopped working. Redownloading the model fixed the problem:
init_model <- function(lang = "french") {
  # Download the model file for the requested language
  dl <- udpipe_download_model(language = lang)
  # Load it and store the model in the global environment as `udmodel`
  udmodel <<- udpipe_load_model(file = dl$file_model)
}
and I was able to get on with the other hundred-thousand set of sentences in my project.
You probably restarted your R session without knowing. The udpipe models are pointers to a file on your hard disk. If you restart your R session, that pointer is lost, which is why you need to reload the model using udmodel <- udpipe_load_model(file = "/path/to/the/model")
That has to be it. Strangely though, the model was still in global memory, as the R session, if it crashed, reloaded to a similar state.
To explain: I had executed a long-running loop overnight. In the morning, my rstudio-server webpage had crashed, but reloading it, everything was fine: the code had successfully finished, the console's content was there, and everything in the global env was in memory. Usually, if the R session had to restart, there's a message mentioning it in the console. It wasn't the case. I tried to launch my loop again for the couple of thousand remaining cases, and this is when I saw the error. The working directory hadn't changed, the model file was still at the same place.
This might be an R problem, and not a udpipe problem. Something might have happened to the in-memory data server-side, so that it needed a manual reload.
Like I said, udpipe models are Rcpp pointers to files on disk. If you restart your R session these pointers are lost, no matter how you restarted (from a crash, just a regular restart, automatically as RStudio does or by reloading an .RData file at startup). You always need to reload a model from disk with udmodel <- udpipe_load_model(file = "/path/to/the/model") if you restart R.
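If you want a long-running loop to survive this, one option is to test the pointer before each batch and reload from disk when it has gone stale. The sketch below is only an idea: ensure_model and is_null_pointer are made-up names, and it assumes the loaded model keeps its pointer in a $model element (as the error message suggests) and that a restored pointer compares equal to an empty external pointer.

```r
# Hypothetical helpers, not part of udpipe.

# TRUE when `ptr` is an external pointer that no longer points anywhere,
# which is what a pointer restored from a saved workspace looks like.
is_null_pointer <- function(ptr) {
  identical(ptr, methods::new("externalptr"))
}

# Return a usable model: reload from `path` if the stored pointer is stale.
# `loader` defaults to udpipe_load_model; it is only looked up when needed.
ensure_model <- function(model, path, loader = udpipe::udpipe_load_model) {
  if (is_null_pointer(model$model)) {
    message("Model pointer is invalid, reloading from ", path)
    model <- loader(file = path)
  }
  model
}
```

In a loop you would then write something like udmodel <- ensure_model(udmodel, "/path/to/the/model") before calling udpipe_annotate, so a silent session restart costs one reload instead of an error.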
My bad, I did not understand that by pointer you meant actual Rcpp pointers. I forgot that udpipe is C++ under the hood! Thanks a lot for the help!
When running the example code for udpipe, I get the following error:
Error in udp_tokenise_tag_parse(object$model, x, doc_id, tokenizer, tagger, : external pointer is not valid
Steps to reproduce:
library(udpipe)
dl <- udpipe_download_model(language = "dutch")
dl
udmodel_dutch <- udpipe_load_model(file = "dutch-ud-2.0-170801.udpipe")
x <- udpipe_annotate(udmodel_dutch, x = "Ik ging op reis en ik nam mee: mijn laptop, mijn zonnebril en goed humeur.")
x <- as.data.frame(x)
x
I'm using Microsoft R Open - R version 3.4.0 (2017-04-21).