bnosac / udpipe

R package for Tokenization, Parts of Speech Tagging, Lemmatization and Dependency Parsing Based on the UDPipe Natural Language Processing Toolkit
https://bnosac.github.io/udpipe/en
Mozilla Public License 2.0

VIGNETTE: thread -> process #63

Closed: HenrikBengtsson closed this issue 4 years ago

HenrikBengtsson commented 4 years ago

Hi, in https://github.com/bnosac/udpipe/blob/761cfaec45601bafa081e5621cb2fb90024d1b44/vignettes/udpipe-parallel.Rmd#L66, it says:

It only makes sense to run annotation in parallel if you have many CPU cores and have enough data to annotate. As udpipe models are Rcpp pointers to the loaded models on disk which can not be passed on to the parallel threads, each thread will load the model again which takes some time next to the internal setup of the parallel backend.

I think you mean "process" here, not "thread". See, e.g., https://stackoverflow.com/a/200473/1072091:

"The typical difference is that threads (of the same process) run in a shared memory space, while processes run in separate memory spaces."

R itself does not really do multi-threading. It is possible at the native-code level outside of R, but that is a different story.
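
To make the shared-memory versus separate-memory distinction concrete, here is a minimal sketch in Python (not R, and not udpipe). The `state` dictionary is a hypothetical stand-in for a model loaded in the parent, and `mark`, `run_in_thread` and `run_in_process` are made-up helper names. It assumes a Unix-like system, because it asks for the `fork` start method explicitly.

```python
import multiprocessing as mp
import threading

# Hypothetical stand-in for a model held in the memory of the parent process.
state = {"loaded_by": "parent"}


def mark(label, queue=None):
    """Overwrite the module-level state; optionally report what this worker sees."""
    state["loaded_by"] = label
    if queue is not None:
        queue.put(state["loaded_by"])


def run_in_thread():
    """A thread shares the parent's memory, so its write is visible afterwards."""
    state["loaded_by"] = "parent"
    worker = threading.Thread(target=mark, args=("thread",))
    worker.start()
    worker.join()
    return state["loaded_by"]


def run_in_process():
    """A child process works on its own copy, so the parent's state is untouched."""
    state["loaded_by"] = "parent"
    ctx = mp.get_context("fork")  # assumption: Unix-like OS where fork is available
    queue = ctx.Queue()
    worker = ctx.Process(target=mark, args=("process", queue))
    worker.start()
    seen_by_child = queue.get()  # value as seen inside the child process
    worker.join()
    return seen_by_child, state["loaded_by"]


if __name__ == "__main__":
    print(run_in_thread())   # thread
    print(run_in_process())  # ('process', 'parent')
```

The thread's write lands in the parent's `state`, while the child process only changes its own copy. That is the same reason an in-memory pointer to a loaded model cannot simply be handed to another process, so each worker process has to load the model for itself.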

jwijffels commented 4 years ago

You're right about this. I should change that in the vignette.