PolMine / polmineR

R-package for text mining with the Corpus Workbench (CWB) as backend

Examples with CPU time > 2.5 times elapsed time #274

Closed (ablaette closed 6 months ago)

ablaette commented 10 months ago

The Debian check server for incoming packages reports:

Examples with CPU time > 2.5 times elapsed time

                    user system elapsed  ratio
polmineR-package   1.675  0.044   0.108 15.917
ngrams             1.841  0.037   0.118 15.915
features           2.657  0.087   0.184 14.913
ll                 1.722  0.037   0.119 14.782
corpus-class       2.466  0.077   0.173 14.699
size-method        1.982  0.031   0.138 14.587
hits               3.391  0.085   0.257 13.525
context-method     1.961  0.048   0.153 13.131
kwic               4.461  0.092   0.361 12.612
cooccurrences      2.696  0.065   0.222 12.437
textstat-class     2.431  0.048   0.212 11.693
bundle             3.099  0.132   0.278 11.622
kwic-class         2.206  0.040   0.200 11.230
context-class      2.197  0.107   0.268  8.597
count-method       1.844  0.054   0.222  8.550
html-method        2.224  0.068   0.332  6.904
count_class        1.085  0.015   0.164  6.707
dispersion-method  1.227  0.020   0.209  5.967
subcorpus_bundle   1.333  0.019   0.388  3.485
subset             2.664  0.100   1.046  2.642

CPU time this far above elapsed time indicates heavy parallelization, but CRAN policy does not allow a package to use more than 2 cores by default.
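The ratio column in the check output above appears to be (user + system) / elapsed, i.e. CPU time divided by wall-clock time. A minimal sketch in base R that recomputes it for the first three rows (the data frame and its column names are only for illustration):

```r
# Timings copied from the first three rows of the check output
timings <- data.frame(
  example = c("polmineR-package", "ngrams", "features"),
  user    = c(1.675, 1.841, 2.657),
  system  = c(0.044, 0.037, 0.087),
  elapsed = c(0.108, 0.118, 0.184)
)

# CPU time is user + system time; the ratio relates it to wall-clock time
timings$ratio <- (timings$user + timings$system) / timings$elapsed

# Flag examples above the 2.5 threshold named in the check message
timings$flagged <- timings$ratio > 2.5

print(timings)
```

A single-threaded example should stay close to a ratio of 1, so values around 15 suggest that many threads were busy at the same time.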

There is a debate on the R-package-devel mailing list:
https://stat.ethz.ch/pipermail/r-package-devel/2022q4/008610.html
https://stat.ethz.ch/pipermail/r-package-devel/2023q3/009357.html
https://stat.ethz.ch/pipermail/r-package-devel/2023q3/009471.html

ablaette commented 6 months ago

To conform to CRAN policies, polmineR is now throttled to 2 cores by default.
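For illustration, a minimal sketch of how thread use could be capped at two in an R session. It assumes the extra CPU time comes from data.table's multithreading, which this thread does not state, and it is not necessarily how polmineR implements the limit:

```r
# Sketch only: cap data.table's thread pool at two.
# Assumption: data.table is the source of the multithreading.
if (requireNamespace("data.table", quietly = TRUE)) {
  # setDTthreads() returns the previous setting invisibly, so it can be restored
  old_threads <- data.table::setDTthreads(2L)
  message("data.table threads in use: ", data.table::getDTthreads())

  # ... run examples or tests here ...

  # To restore the earlier setting afterwards:
  # data.table::setDTthreads(old_threads)
}
```

Users who want more cores on their own machine can raise the limit again with the same call.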