ropensci / tokenizers

Fast, Consistent Tokenization of Natural Language Text
https://docs.ropensci.org/tokenizers

Low-level parallelism with RcppParallel #51

Closed: lmullen closed this issue 1 year ago

lmullen commented 7 years ago

As in quanteda, the functions that call Rcpp versions should be parallelized with RcppParallel. Each of them should take an argument that sets the number of cores: cores = getOption("mc.cores"), probably defaulting to 2 if that option is not set.
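To make the proposed cores argument concrete, here is a minimal sketch in plain C++ of how a requested core count could be resolved and turned into work ranges. Everything in it is an assumption for illustration: resolve_cores and split_ranges are hypothetical helpers, not part of tokenizers, Rcpp, or RcppParallel, and a non-positive request stands in for "the option is not set".

```cpp
#include <cstddef>
#include <utility>
#include <vector>

// Hypothetical helper: choose the worker count. A non-positive request
// stands in for "option not set" and falls back to 2, as the issue suggests.
inline std::size_t resolve_cores(int requested, std::size_t n_items) {
    std::size_t cores = requested > 0 ? static_cast<std::size_t>(requested) : 2;
    // Never use more workers than there are items, and always at least one.
    if (cores > n_items) cores = n_items;
    return cores == 0 ? 1 : cores;
}

// Hypothetical helper: split [0, n_items) into at most `cores` contiguous
// half-open ranges whose lengths differ by no more than one.
inline std::vector<std::pair<std::size_t, std::size_t>>
split_ranges(std::size_t n_items, std::size_t cores) {
    std::vector<std::pair<std::size_t, std::size_t>> ranges;
    if (n_items == 0 || cores == 0) return ranges;
    const std::size_t base = n_items / cores;
    const std::size_t extra = n_items % cores;
    std::size_t begin = 0;
    for (std::size_t i = 0; i < cores; ++i) {
        const std::size_t len = base + (i < extra ? 1 : 0);
        if (len == 0) continue;  // more workers than items: skip empty ranges
        ranges.emplace_back(begin, begin + len);
        begin += len;
    }
    return ranges;
}
```

With 10 documents and 3 cores this gives the ranges [0, 4), [4, 7), and [7, 10). Clamping to the number of items keeps short inputs from spawning idle workers.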

Ironholds commented 7 years ago

Want me to take this on?

lmullen commented 7 years ago

If you're willing, that would be great.


-- Lincoln Mullen, Assistant Professor, Department of History & Art History, George Mason University

Ironholds commented 6 years ago

Update: I have (finally) found the time to work on this, but I have run into an issue convincing RVector<String> objects that as<std::string> is an operation that can be performed on their members. So, this is fun?
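One possible way around this kind of problem, offered as an assumption rather than anything the thread settles on, is to copy the strings out of R into a std::vector<std::string> before the parallel region (in Rcpp, something like Rcpp::as<std::vector<std::string>>(x)), so that the workers only ever touch plain C++ containers. The sketch below shows that shape using only the standard library: std::thread stands in for RcppParallel's parallel loop, and WhitespaceTokenizer and tokenize_parallel are hypothetical names.

```cpp
#include <algorithm>
#include <cstddef>
#include <sstream>
#include <string>
#include <thread>
#include <vector>

using Tokens = std::vector<std::string>;

// Hypothetical worker, shaped like a range-based parallel functor:
// it is called with a half-open index range [begin, end).
struct WhitespaceTokenizer {
    const std::vector<std::string>& input;  // plain C++ copies, read-only
    std::vector<Tokens>& output;            // one preallocated slot per document

    void operator()(std::size_t begin, std::size_t end) {
        for (std::size_t i = begin; i < end; ++i) {
            std::istringstream stream(input[i]);
            std::string token;
            // Each index is written by exactly one thread, so no locking is needed.
            while (stream >> token) output[i].push_back(token);
        }
    }
};

// Stand-in for a parallel-for: run the worker over contiguous chunks,
// one thread per chunk, using at most `cores` threads.
inline std::vector<Tokens> tokenize_parallel(const std::vector<std::string>& docs,
                                             std::size_t cores = 2) {
    std::vector<Tokens> result(docs.size());
    if (docs.empty()) return result;
    if (cores == 0) cores = 1;
    cores = std::min(cores, docs.size());

    WhitespaceTokenizer worker{docs, result};
    std::vector<std::thread> threads;
    const std::size_t chunk = (docs.size() + cores - 1) / cores;  // ceiling division
    for (std::size_t begin = 0; begin < docs.size(); begin += chunk) {
        const std::size_t end = std::min(begin + chunk, docs.size());
        threads.emplace_back([&worker, begin, end] { worker(begin, end); });
    }
    for (auto& t : threads) t.join();
    return result;
}
```

The trade-off in this approach is an up-front copy of every string in exchange for never dereferencing R objects from a worker thread. The results would likewise need converting back to an R list on the main thread once the threads have joined.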