gesistsa / rang

🐶 (Re)constructing R computational environments
https://gesistsa.github.io/rang/
GNU General Public License v3.0

Installation proposal #2

Closed by chainsawriot 1 year ago

chainsawriot commented 1 year ago

Download all the source packages to a temp dir, including the original package

paste0("https://cran.r-project.org/src/contrib/Archive/", x, "/", x, "_", x_version, ".tar.gz")
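The download step could be sketched as below. This is only a sketch, and it assumes the requested version sits in the CRAN Archive directory; the current release of a package lives directly under src/contrib/ instead, so this URL would fail for it. The helper names `archive_url` and `download_source` are hypothetical and not part of rang.

```r
# Hypothetical helper: build the CRAN Archive URL for one package version.
archive_url <- function(x, x_version) {
    paste0("https://cran.r-project.org/src/contrib/Archive/",
           x, "/", x, "_", x_version, ".tar.gz")
}

# Hypothetical helper: download one source tarball into `dest_dir` and
# return the local path, ready for install.packages(..., repos = NULL).
download_source <- function(x, x_version, dest_dir = tempdir()) {
    path_to_file <- file.path(dest_dir, paste0(x, "_", x_version, ".tar.gz"))
    utils::download.file(archive_url(x, x_version),
                         destfile = path_to_file, mode = "wb")
    path_to_file
}
```

Keeping the URL construction separate from the download makes it possible to check the URLs without network access.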

Install all terminal nodes (packages with no further dependencies) from source

install.packages(path_to_file, repos = NULL, type="source")

1: for each package in output$dep, check (with installed.packages()) whether all of its dependencies have been installed; if yes, install it from source. If all deps are now installed, goto 2; otherwise goto 1

2: install the original package
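The loop in steps 1 and 2 can be sketched as a pure function that only computes the installation order, without installing anything. This is a rough illustration, not rang's implementation: `install_order`, `deps` and `toy_deps` are made-up names, and the input is assumed to be a named list mapping each package to the packages it needs (base packages would have to be passed in `installed`).

```r
# Compute an installation order from a named list of dependencies.
# `deps` maps each package name to a character vector of packages it needs.
# `installed` plays the role of rownames(installed.packages()).
install_order <- function(deps, installed = character(0)) {
    pending <- setdiff(names(deps), installed)
    order <- character(0)
    while (length(pending) > 0) {
        # Step 1: find packages whose dependencies are all installed already.
        # Terminal nodes (no dependencies) are always ready in the first pass.
        is_ready <- vapply(pending,
                           function(p) all(deps[[p]] %in% installed),
                           logical(1))
        ready <- pending[is_ready]
        if (length(ready) == 0) {
            stop("Unresolvable dependencies: ", paste(pending, collapse = ", "))
        }
        order <- c(order, ready)
        installed <- c(installed, ready)
        pending <- setdiff(pending, ready)
    }
    order
}

# Toy dependency list, made up for illustration
toy_deps <- list(a = character(0), b = "a", c = c("a", "b"), d = "c")
install_order(toy_deps)
```

The `stop()` guards against an endless "goto 1" when a dependency is missing from the list or the graph contains a cycle.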

chainsawriot commented 1 year ago
  1. List the latest R version as of the snapshot date #6 (for finding the appropriate ROCKER base image)
  2. List all system requirements #5 (for installing them. For ROCKER, the base must be either Debian or Ubuntu, so list all deb packages)
  3. Optional: Download all the source packages as per output (for extremely deep reproducibility exercises)
  4. Export output as a dependency-free R installation script
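Item 4 could look roughly like this: a generated script that calls only base R, so it runs on a fresh R installation. It is a sketch under assumptions, not rang's actual export; `make_install_script` is a hypothetical helper, the `pkgs/` paths are placeholders, and the tarballs are assumed to be listed in a valid installation order already.

```r
# Hypothetical helper: turn an ordered vector of source tarball paths into
# the lines of a standalone R script that needs nothing beyond base R.
make_install_script <- function(paths) {
    c("# Generated installation script; requires base R only",
      sprintf("install.packages(%s, repos = NULL, type = \"source\")",
              encodeString(paths, quote = "\"")))
}

# Placeholder paths; the order must already respect the dependencies
script <- make_install_script(c("pkgs/Rcpp_1.0.3.tar.gz",
                                "pkgs/quanteda_1.5.2.tar.gz"))
writeLines(script, file.path(tempdir(), "install.R"))
```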
chainsawriot commented 1 year ago

Use the following as a test case

trackpoint_date: 2020-01-16
packages: openNLP, LDAvis, topicmodels, quanteda


graph <- resolve(pkgs = c("openNLP", "LDAvis", "topicmodels", "quanteda"), snapshot_date = "2020-01-16")

openNLP needs rJava. topicmodels needs gsl. LDAvis has not been updated since 2015. quanteda in 2020 was pre-v3 and still monolithic (it still included e.g. textplot_wordcloud).

chainsawriot commented 1 year ago

A slightly easier one is:

graph <- resolve("quanteda", snapshot_date = "2018-10-15")

The following is the code in the JOSS paper (which is not reproducible now).

library("quanteda")

# construct the feature co-occurrence matrix
examplefcm <-
    tokens(data_corpus_irishbudget2010, remove_punct = TRUE) %>%
    tokens_tolower() %>%
    tokens_remove(stopwords("english"), padding = FALSE) %>%
    fcm(context = "window", window = 5, tri = FALSE)

# choose the 30 most frequent features
topfeats <- names(topfeatures(examplefcm, 30))

# select the top 30 features only, plot the network
set.seed(100)
textplot_network(fcm_select(examplefcm, topfeats), min_freq = 0.8)
chainsawriot commented 1 year ago

rJava has hidden system requirements: liblzma-dev libpcre3-dev libbz2-dev
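For a Debian or Ubuntu based image, these could be turned into a single install command, for example for a Dockerfile `RUN` line. This is a sketch only; `apt_install_cmd` is a hypothetical helper and the exact apt-get flags are an assumption.

```r
# The hidden system requirements of rJava listed above
sysreqs <- c("liblzma-dev", "libpcre3-dev", "libbz2-dev")

# Hypothetical helper: compose one apt-get command from a vector of deb names
apt_install_cmd <- function(debs) {
    paste("apt-get update -qq && apt-get install -y",
          paste(debs, collapse = " "))
}

cat(apt_install_cmd(sysreqs), "\n")
```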

chainsawriot commented 1 year ago

With rJava hidden sysreqs added:

R version 3.6.2 (2019-12-12)
Platform: x86_64-pc-linux-gnu (64-bit)
Running under: Debian GNU/Linux 10 (buster)

Matrix products: default
BLAS/LAPACK: /usr/lib/x86_64-linux-gnu/libopenblasp-r0.3.5.so

locale:
 [1] LC_CTYPE=en_US.UTF-8          LC_NUMERIC=C                 
 [3] LC_TIME=en_US.UTF-8           LC_COLLATE=en_US.UTF-8       
 [5] LC_MONETARY=en_US.UTF-8       LC_MESSAGES=C                
 [7] LC_PAPER=en_US.UTF-8          LC_NAME=en_US.UTF-8          
 [9] LC_ADDRESS=en_US.UTF-8        LC_TELEPHONE=en_US.UTF-8     
[11] LC_MEASUREMENT=en_US.UTF-8    LC_IDENTIFICATION=en_US.UTF-8

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base     

other attached packages:
[1] LDAvis_0.3.2      topicmodels_0.2-9 quanteda_1.5.2    openNLP_0.2-7    

loaded via a namespace (and not attached):
 [1] NLP_0.2-0           Rcpp_1.0.3          pillar_1.4.3       
 [4] compiler_3.6.2      tools_3.6.2         stopwords_1.0      
 [7] lubridate_1.7.4     lifecycle_0.1.0     tibble_2.1.3       
[10] gtable_0.3.0        lattice_0.20-38     pkgconfig_2.0.3    
[13] rlang_0.4.2         Matrix_1.2-18       fastmatch_1.1-0    
[16] parallel_3.6.2      openNLPdata_1.5.3-4 rJava_0.9-11       
[19] stringr_1.4.0       xml2_1.2.2          stats4_3.6.2       
[22] grid_3.6.2          data.table_1.12.8   R6_2.4.1           
[25] ggplot2_3.2.1       spacyr_1.2          magrittr_1.5       
[28] scales_1.1.0        modeltools_0.2-22   colorspace_1.4-1   
[31] stringi_1.4.5       RcppParallel_4.4.4  lazyeval_0.2.2     
[34] munsell_0.5.0       slam_0.1-47         tm_0.7-7           
[37] crayon_1.3.4       