crew102 / rapidraker

A fast version of the Rapid Automatic Keyword Extraction (RAKE) algorithm
https://crew102.github.io/slowraker/articles/rapidraker.html

Parallel calculation #3

Open snvv opened 1 year ago

snvv commented 1 year ago

Hello, and thank you for your excellent package. May I ask how we can use [rbind_rakelist] with parallel calculation? I tried the example in https://crew102.github.io/slowraker/articles/rapidraker.html, but it returns a plain list and not a "list" "rakelist" object. Thank you in advance.

crew102 commented 1 year ago

Hey @snvv, there are a few ways to do this, but the easiest is probably to call rbind_rakelist inside the parallel looping construct, so that each iteration returns a data frame, and then call do.call(rbind, x) on the result, x. Here's an example that uses sequential compute (lapply) instead of parallel:

library(slowraker)
library(rapidraker)

data("dog_pubs")

x <- lapply(1:2, function(i) 
  rbind_rakelist(rapidrake(txt = dog_pubs$abstract[1:5]))
)

head(do.call(rbind, x))
#>   doc_id                            keyword freq     score
#> 1      1 assistance dog identification tags    1 10.833334
#> 2      1          animal control facilities    1  9.000000
#> 3      1          emotional support animals    1  9.000000
#> 4      1                   small body sizes    1  9.000000
#> 5      1       seemingly inappropriate dogs    1  7.916667
#> 6      1            assistance dogs sharply    1  7.333333
#>                     stem
#> 1 assist dog identif tag
#> 2     anim control facil
#> 3      emot support anim
#> 4        small bodi size
#> 5    seem inappropri dog
#> 6     assist dog sharpli

Created on 2022-12-10 by the reprex package (v2.0.1)
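
The same pattern can be moved onto a cluster. Below is a minimal sketch using the base `parallel` package, not necessarily the approach taken in the linked vignette. The names `txt`, `n_workers`, `chunks`, `res`, and `out` are made up for illustration, and the sketch assumes that `rbind_rakelist()` accepts a `doc_id` argument with one id per document, so that ids stay unique across chunks instead of restarting at 1 in every worker. A socket cluster (the `makeCluster()` default) is used so that each worker loads the packages in its own R session.

```r
library(parallel)
library(slowraker)

data("dog_pubs")
txt <- dog_pubs$abstract[1:20]

# One chunk of document indices per worker (hypothetical worker count)
n_workers <- 2
chunks <- splitIndices(length(txt), n_workers)

# Start a socket cluster and load the packages on every worker
cl <- makeCluster(n_workers)
clusterEvalQ(cl, {
  library(slowraker)
  library(rapidraker)
})

# Each worker extracts keywords for its chunk and returns a data frame.
# Assumption: doc_id takes one id per document in the rakelist.
res <- parLapply(cl, chunks, function(idx, txt) {
  rakelist <- rapidrake(txt = txt[idx])
  rbind_rakelist(rakelist, doc_id = idx)
}, txt = txt)

stopCluster(cl)

# Combine the per-chunk data frames into a single data frame
out <- do.call(rbind, res)
head(out)
```

Because each worker already returns a data frame, the final `do.call(rbind, res)` step is the same as in the sequential example above.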