bnosac / udpipe

R package for Tokenization, Parts of Speech Tagging, Lemmatization and Dependency Parsing Based on the UDPipe Natural Language Processing Toolkit
https://bnosac.github.io/udpipe/en
Mozilla Public License 2.0

udpipe does not quit gracefully from within rstudio #22

Closed RMHogervorst closed 6 years ago

RMHogervorst commented 6 years ago

I think the udpipe package is great! Unfortunately, when I cancel the process from within an R session, the whole session crashes. I think it has to do with the way R and the C++ program communicate?

jwijffels commented 6 years ago

That is because the implementation sends the vector of text to C++ and loops over the text without handing anything back to R until it has finished processing. If you install the development version of this package from this GitHub repository (version 0.6), you will find that I've added an argument trace to udpipe_annotate. With trace = 10, for example, a message is printed before every 10th text element is annotated, allowing you to see how much of the text has already been annotated. Because this prints something to the R console, you can also press the RStudio stop button.

library(udpipe)
udmodel <- udpipe_download_model(language = "dutch", udpipe_model_repo = "bnosac/udpipe.models.ud")
udmodel <- udpipe_load_model(udmodel$file_model)

data(brussels_reviews, package = "udpipe")
x <- subset(brussels_reviews, language == "nl")
x <- udpipe_annotate(udmodel, x = x$feedback, trace = 10)
## This prints a progress message to the R console every 10 text elements; you can then press the RStudio stop button if you want to exit the C++ code gracefully

2018-04-09 09:46:46 Annotating text fragment 1/500
2018-04-09 09:46:47 Annotating text fragment 11/500
2018-04-09 09:46:49 Annotating text fragment 21/500
2018-04-09 09:46:50 Annotating text fragment 31/500
2018-04-09 09:46:52 Annotating text fragment 41/500
2018-04-09 09:46:53 Annotating text fragment 51/500
2018-04-09 09:46:54 Annotating text fragment 61/500
2018-04-09 09:46:56 Annotating text fragment 71/500
2018-04-09 09:46:58 Annotating text fragment 81/500
2018-04-09 09:46:59 Annotating text fragment 91/500
2018-04-09 09:47:01 Annotating text fragment 101/500
2018-04-09 09:47:02 Annotating text fragment 111/500
2018-04-09 09:47:04 Annotating text fragment 121/500
2018-04-09 09:47:06 Annotating text fragment 131/500
2018-04-09 09:47:07 Annotating text fragment 141/500
2018-04-09 09:47:09 Annotating text fragment 151/500
2018-04-09 09:47:10 Annotating text fragment 161/500
2018-04-09 09:47:12 Annotating text fragment 171/500
2018-04-09 09:47:14 Annotating text fragment 181/500
2018-04-09 09:47:16
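
For anyone who cannot install the development version, the same idea (hand control back to R at regular intervals) can be approximated from the R side by annotating the text in chunks. The following is only a minimal sketch under assumptions: annotate_in_chunks and toy_annotate are hypothetical names that are not part of udpipe, and the toy annotator stands in for a real model so the example runs without a download.

```r
# Hypothetical helper (NOT part of udpipe): run `annotate` on consecutive
# chunks of a character vector so that control returns to R between chunks.
annotate_in_chunks <- function(x, annotate, chunk_size = 10) {
  stopifnot(is.character(x), is.function(annotate), chunk_size >= 1)
  # Group the indices 1..length(x) into consecutive blocks of chunk_size
  groups <- split(seq_along(x), ceiling(seq_along(x) / chunk_size))
  results <- vector("list", length(groups))
  for (i in seq_along(groups)) {
    idx <- groups[[i]]
    # Progress message in the same spirit as the trace output shown above
    message(format(Sys.time(), "%Y-%m-%d %H:%M:%S"),
            " Annotating text fragment ", idx[1], "/", length(x))
    results[[i]] <- annotate(x[idx])
  }
  results
}

# Stand-in annotator (an assumption for illustration only): returns one row
# per text with its character count instead of a real annotation.
toy_annotate <- function(txt) {
  data.frame(text = txt, n_char = nchar(txt), stringsAsFactors = FALSE)
}

texts <- c("een", "twee", "drie", "vier", "vijf")
out <- do.call(rbind, annotate_in_chunks(texts, toy_annotate, chunk_size = 2))
```

With a loaded model, the annotate argument could be something like function(txt) as.data.frame(udpipe_annotate(udmodel, x = txt)). Note that document identifiers would then restart in every chunk unless you pass your own doc_id. An interrupt requested while a chunk is running would, under this approach, only take effect once that chunk returns to R, so smaller chunks give a quicker stop at the cost of more calls.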

RMHogervorst commented 6 years ago

Thanks! This is very insightful!