Closed chainsawriot closed 1 year ago
deb
)output
(For an extremely deep reproducibility exercise) output as a dependency-free R installation script
Use the following as a test case:
trackpoint_date: 2020-01-16 packages: openNLP, LDAvis, topicmodels, quanteda
graph <- resolve(pkgs = c("openNLP", "LDAvis", "topicmodels", "quanteda"), snapshot_date = "2020-01-16")
openNLP needs rJava
topicmodels needs gsl
LDAvis has not been updated since 2015
quanteda in 2020 was pre-v3 and still monolithic (it still included, e.g., textplot_wordcloud).
A slightly easier one is:
graph <- resolve("quanteda", snapshot_date = "2018-10-15")
The following is the code in the JOSS paper (which is not reproducible now).
library("quanteda")
# construct the feature co-occurrence matrix
examplefcm <-
tokens(data_corpus_irishbudget2010, remove_punct = TRUE) %>%
tokens_tolower() %>%
tokens_remove(stopwords("english"), padding = FALSE) %>%
fcm(context = "window", window = 5, tri = FALSE)
# choose the 30 most frequent features
topfeats <- names(topfeatures(examplefcm, 30))
# select the top 30 features only, plot the network
set.seed(100)
textplot_network(fcm_select(examplefcm, topfeats), min_freq = 0.8)
rJava has hidden system requirements: liblzma-dev, libpcre3-dev, libbz2-dev
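A minimal sketch of how these hidden system requirements could be folded into a generated install script. It only builds the shell command as a string and does not run it. The variable names `sysreqs` and `cmd` are made up for illustration, and a Debian-based image with `apt-get` is assumed.

```r
# System libraries reported above as hidden requirements of rJava.
sysreqs <- c("liblzma-dev", "libpcre3-dev", "libbz2-dev")

# Build (but do not run) the install command for a Debian-based image.
# Assumption: the script runs as root, so no sudo prefix is added.
cmd <- paste("apt-get install -y", paste(sysreqs, collapse = " "))

cat(cmd, "\n")
```

The resulting line could be written into the script before any R package is installed from source.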
With rJava hidden sysreqs added:
R version 3.6.2 (2019-12-12)
Platform: x86_64-pc-linux-gnu (64-bit)
Running under: Debian GNU/Linux 10 (buster)
Matrix products: default
BLAS/LAPACK: /usr/lib/x86_64-linux-gnu/libopenblasp-r0.3.5.so
locale:
[1] LC_CTYPE=en_US.UTF-8 LC_NUMERIC=C
[3] LC_TIME=en_US.UTF-8 LC_COLLATE=en_US.UTF-8
[5] LC_MONETARY=en_US.UTF-8 LC_MESSAGES=C
[7] LC_PAPER=en_US.UTF-8 LC_NAME=en_US.UTF-8
[9] LC_ADDRESS=en_US.UTF-8 LC_TELEPHONE=en_US.UTF-8
[11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=en_US.UTF-8
attached base packages:
[1] stats graphics grDevices utils datasets methods base
other attached packages:
[1] LDAvis_0.3.2 topicmodels_0.2-9 quanteda_1.5.2 openNLP_0.2-7
loaded via a namespace (and not attached):
[1] NLP_0.2-0 Rcpp_1.0.3 pillar_1.4.3
[4] compiler_3.6.2 tools_3.6.2 stopwords_1.0
[7] lubridate_1.7.4 lifecycle_0.1.0 tibble_2.1.3
[10] gtable_0.3.0 lattice_0.20-38 pkgconfig_2.0.3
[13] rlang_0.4.2 Matrix_1.2-18 fastmatch_1.1-0
[16] parallel_3.6.2 openNLPdata_1.5.3-4 rJava_0.9-11
[19] stringr_1.4.0 xml2_1.2.2 stats4_3.6.2
[22] grid_3.6.2 data.table_1.12.8 R6_2.4.1
[25] ggplot2_3.2.1 spacyr_1.2 magrittr_1.5
[28] scales_1.1.0 modeltools_0.2-22 colorspace_1.4-1
[31] stringi_1.4.5 RcppParallel_4.4.4 lazyeval_0.2.2
[34] munsell_0.5.0 slam_0.1-47 tm_0.7-7
[37] crayon_1.3.4
Download all source packages to a temp dir, including the original package.
Install all terminal nodes (packages with no dependencies) from source.
1: For each package in output$dep, check (via installed.packages()) whether all of its dependencies have been installed; if yes, install it from source.
If all deps are installed, goto 2; otherwise goto 1.
2: Install the original package.