Can these three R packages be installed please?

kurttaylor commented 2 years ago

I am producing a data-driven flow chart using the DiagrammeR package. Can this please be installed? I also require DiagrammeRsvg and rsvg to allow me to successfully export and save the flowchart as a PNG. Thank you!

bloodearnest commented 2 years ago

This is proving problematic to install. It tries to upgrade htmltools, which fails to use fastmap, possibly because it's older.

@wjchulme @milanwiedemann any thoughts or workarounds?


```
** package 'htmltools' successfully unpacked and MD5 sums checked
** using staged installation
** libs
make[1]: Entering directory '/tmp/Rtmp9vorb0/R.INSTALL6ef2ed6f4f0/htmltools/src'
gcc -std=gnu99 -I"/usr/share/R/include" -DNDEBUG      -fpic  -g -O2 -fdebug-prefix-map=/build/r-base-ttHamR/r-base-4.0.2=. -fstack-protector-strong -Wformat -Werror=format-security -Wdate-time -D_FORTIFY_SOURCE=2 -g  -c init.c -o init.o
gcc -std=gnu99 -I"/usr/share/R/include" -DNDEBUG      -fpic  -g -O2 -fdebug-prefix-map=/build/r-base-ttHamR/r-base-4.0.2=. -fstack-protector-strong -Wformat -Werror=format-security -Wdate-time -D_FORTIFY_SOURCE=2 -g  -c template.c -o template.o
gcc -std=gnu99 -shared -L/usr/lib/R/lib -Wl,-Bsymbolic-functions -Wl,-z,relro -o htmltools.so init.o template.o -L/usr/lib/R/lib -lR
make[1]: Leaving directory '/tmp/Rtmp9vorb0/R.INSTALL6ef2ed6f4f0/htmltools/src'
installing to /usr/local/lib/R/site-library/00LOCK-htmltools/00new/htmltools/libs
** R
** byte-compile and prepare package for lazy loading
Error: object 'faststack' is not exported by 'namespace:fastmap'
Backtrace:
1: stop(sprintf(ngettext(length(miss), "object %s is not exported by 'namespace:%s'", 
2: importIntoEnv(impenv, impnames, ns, impvars)
3: namespaceImportFrom(ns, loadNamespace(j <- i[[1L]], c(lib.loc, 
4: loadNamespace(package = package, lib.loc = lib.loc, keep.source = keep.source, 
5: withCallingHandlers(expr, packageStartupMessage = function(c) tryInvokeRestart("muffleMessage"))
6: suppressPackageStartupMessages(loadNamespace(package = package, 
7: code2LazyLoadDB(package, lib.loc = lib.loc, keep.source = keep.source, 
8: tools:::makeLazyLoading("htmltools", "/usr/local/lib/R/site-library/00LOCK-htmltools/00new", 
ERROR: lazy loading failed for package 'htmltools'
* removing '/usr/local/lib/R/site-library/htmltools'
* restoring previous '/usr/local/lib/R/site-library/htmltools'
cat: visNetwork.out: No such file or directory
cat: DiagrammeR.out: No such file or directory

The downloaded source packages are in
    '/tmp/Rtmponn1Ju/downloaded_packages'
Warning message:
In install.packages("DiagrammeR", Ncpus = 8) :
  installation of one or more packages failed,
  probably 'htmltools', 'visNetwork', 'DiagrammeR'
```
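
The error in the log says the new htmltools imports `faststack` from fastmap, and the installed fastmap does not export it. A minimal sketch for confirming that inside the image is below; `exports_symbol()` is a hypothetical helper written for this note, not part of any package, and it relies only on base R.

```r
# Hypothetical helper: does an installed package export a given symbol?
# Returns NA when the package is not installed at all.
exports_symbol <- function(pkg, symbol) {
  if (!requireNamespace(pkg, quietly = TRUE)) {
    return(NA)
  }
  symbol %in% getNamespaceExports(pkg)
}

# Sanity checks against a package that ships with R.
exports_symbol("stats", "lm")
exports_symbol("stats", "no_such_function")

# The check relevant to the log above: FALSE would be consistent with
# the "not exported" error, NA would mean fastmap is not installed.
exports_symbol("fastmap", "faststack")

# Report the installed fastmap version, if there is one.
if (requireNamespace("fastmap", quietly = TRUE)) {
  print(packageVersion("fastmap"))
}
```

If the last check returns FALSE, upgrading fastmap before htmltools (or pinning htmltools to the version already in the image) would be the two obvious directions to try.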

remlapmot commented 2 years ago

What about installing those packages from a CRAN snapshot dated around the release of your version of htmltools (as discussed a bit at the end of #75)?

Your htmltools is at version 0.5.0. Its CRAN archive

https://cran.r-project.org/src/contrib/Archive/htmltools/

shows that version 0.5.0 was released on 2020-06-16 and 0.5.1 on 2021-01-12.

The closest dated RSPM snapshot after 2020-06-16 is 2020-06-18, so the code would be the following (unfortunately these packages are in source form, so I think you'll need the system dependencies):

install.packages(c("DiagrammeR", "DiagrammeRsvg", "rsvg"), repos = "https://packagemanager.rstudio.com/all/2020-06-18+Y3JhbiwyOjI4Nzs3QkQ1REYwNw")

Or, from MRAN, it would be:

install.packages(c("DiagrammeR", "DiagrammeRsvg", "rsvg"), repos = "https://mran.microsoft.com/snapshot/2020-06-17")

You might need to play around with the date a bit.
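
One way to "play around with the date" is to generate a few candidate snapshot URLs from the release date and try them in turn. This is only a sketch: the URL pattern is copied from the MRAN example above, the four-day window is an arbitrary assumption, and the install call is left commented out because it needs network access and a snapshot that still exists.

```r
# Release date of htmltools 0.5.0, taken from the CRAN archive listing.
release_date <- as.Date("2020-06-16")

# Candidate snapshot dates: the release day and the three days after it.
# The window size is an arbitrary choice for illustration.
candidate_dates <- release_date + 0:3

# Build one repository URL per candidate date, following the MRAN pattern.
snapshot_repos <- sprintf(
  "https://mran.microsoft.com/snapshot/%s",
  format(candidate_dates)
)
print(snapshot_repos)

# Try a candidate (not run here):
# install.packages(
#   c("DiagrammeR", "DiagrammeRsvg", "rsvg"),
#   repos = snapshot_repos[2]
# )
```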

kurttaylor commented 2 years ago

Thanks for checking. As a workaround for now, I have saved my list as a CSV, which I can request as an output, and I can then create the actual figure outside of OS from that CSV.
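
For anyone following the same route, here is a rough sketch of the split. The step names, counts, and file names are made up for illustration. The server-side part needs only base R, and the rendering part runs only where DiagrammeR, DiagrammeRsvg and rsvg are installed.

```r
# Hypothetical step counts; in practice these come from the study data.
flow <- data.frame(
  step = c("Registered patients", "Aged 18 or over", "Complete covariates"),
  n = c(1000, 850, 790)
)

# On the server: write only the table, which is then requested as an output.
csv_path <- file.path(tempdir(), "flowchart_counts.csv")
write.csv(flow, csv_path, row.names = FALSE)

# Locally: read the released CSV and build a Graphviz DOT description.
counts <- read.csv(csv_path, stringsAsFactors = FALSE)
labels <- sprintf("%s\\n(n = %d)", counts$step, counts$n)
ids <- seq_along(labels)
nodes <- sprintf('  n%d [label = "%s"]', ids, labels)
edges <- sprintf("  n%d -> n%d", head(ids, -1), tail(ids, -1))
dot <- paste(
  c("digraph flowchart {", "  node [shape = box]", nodes, edges, "}"),
  collapse = "\n"
)

# Render to PNG only if the three packages from this issue are available.
have_pkgs <- all(vapply(
  c("DiagrammeR", "DiagrammeRsvg", "rsvg"),
  requireNamespace, logical(1), quietly = TRUE
))
if (have_pkgs) {
  svg <- DiagrammeRsvg::export_svg(DiagrammeR::grViz(dot))
  rsvg::rsvg_png(charToRaw(svg), file.path(tempdir(), "flowchart.png"))
}
```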

wjchulme commented 2 years ago

Thanks, @kurttaylor. This is a workflow which we're starting to encourage more because it reduces server use and makes it easier to make incremental changes to aesthetic and other "post-data" elements of the analysis. It does, though, prevent these steps from being part of the project.yaml, which might be a disadvantage for some. If you have any thoughts about this we'd be keen to hear them :)

remlapmot commented 2 years ago

Some possible suggestions:

(You've probably already thought of this.) Could you add a flag to an action indicating that it should run only locally and not on your backend servers? Then all actions could still be in the project.yaml, e.g., run_on_backend_server: false (true by default):
```
actions:
  action_name:
    run_on_backend_server: false
```
Maybe in the future this flag could accept a list of backends, say ['tpp', 'emis'], when running on both TPP and EMIS.

And/or maybe a flag that is a list of actions to run locally after the output has been obtained from the server, e.g.,
```
actions:
  ... 
post_data_actions: [make-flowchart, make-some-other-plot]
```
To reduce load on the backend servers, one possible suggestion is to fit the models that take the most runtime and/or memory in another, faster language. For example, I would guess that Julia (https://julialang.org/), because of its JIT compiler, fits Cox models and GLMs substantially faster than R. So you'd need to add Julia to this container or make a separate Julia container.
- I'd suggest including Julia in this container, as there are at least 2 good R packages on CRAN which allow R to interface with Julia: JuliaCall and JuliaConnectoR.
- Note the timings on the Julia website are a bit misleadingly fast for Julia because they usually report only run time. In reality you'd experience both compile time and run time, so you wouldn't see the full speed-ups they report.
- Admittedly this would be a lot of hassle to set up, and Julia errors are harder to debug than R errors ...

I think some users are including implausibly large numbers of potential confounders in their models.
- These covariates won't be independent of each other, so under Pearl's directional separation rules for directed acyclic graphs any confounding backdoor pathways are blocked by adjustment for a minimal sufficient adjustment set, i.e., probably a small subset of the covariates.
- Of course, it's difficult to know what those minimal sufficient adjustment sets are. But if you can draw the DAG for a model, then dagitty can tell you what they are: see http://dagitty.net/ and http://dagitty.net/dags.html (top right-hand pane), or the adjustmentSets() function in its R package.
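
As a small illustration of the dagitty point, the sketch below asks for minimal sufficient adjustment sets in a toy DAG. The variable names and arrows are invented for the example and are not taken from any real study. The code is guarded so that it does nothing when the dagitty package is not installed.

```r
sets <- NULL

if (requireNamespace("dagitty", quietly = TRUE)) {
  # Toy DAG with one backdoor path: exposure <- age -> bmi -> outcome.
  dag <- dagitty::dagitty("dag {
    exposure -> outcome
    age -> exposure
    age -> bmi
    bmi -> outcome
    smoking -> bmi
  }")

  # Minimal sufficient adjustment sets for the total effect of
  # exposure on outcome (type = 'minimal' is the default).
  sets <- dagitty::adjustmentSets(
    dag,
    exposure = "exposure",
    outcome = "outcome"
  )
  print(sets)
}
```

Each returned set is on its own enough to block the backdoor path, which is the sense in which adjusting for every available covariate is often unnecessary.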

wjchulme commented 2 years ago

> create a flag to an action that indicates that an action should not be run on your backend servers and only locally

This is a good suggestion, but:

- It only works if the opensafely R container includes the required packages, which isn't the case for the OP's problem.
- The release process for outputs now sends outputs to job server, not directly to the remote repo. So currently these would need to be manually added to the repo before being re-used in subsequent locally-run actions. But this might change -- @bloodearnest knows more about the longer-term plan for outputs.

wjchulme commented 2 years ago

> fit some of the models which are taking the most runtime and/or memory in another faster language

I would love for there to be some work on improving efficiency of model fitting and other processes, whether that's in another language or getting the most out of R and Stata. Certainly some features are underused -- parallelisation, shrinking in-memory objects, data.table, etc.
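
To make two of those features concrete, here is a rough sketch using only packages that ship with R. The data are simulated and the outcome names are hypothetical. It fits one GLM per outcome in parallel, drops the stored model frame and response from each fit, and returns only the coefficient table so the large fit objects are not kept in memory.

```r
library(parallel)

# Simulated data standing in for a real extract.
set.seed(1)
n <- 2000
dat <- data.frame(
  exposure = rbinom(n, 1, 0.5),
  age = rnorm(n, 50, 10)
)
dat$outcome_a <- rbinom(n, 1, plogis(-1 + 0.5 * dat$exposure))
dat$outcome_b <- rbinom(n, 1, plogis(-2 + 0.02 * dat$age))

# Fit one logistic model and keep only its coefficient table.
fit_one <- function(outcome) {
  fit <- glm(
    reformulate(c("exposure", "age"), response = outcome),
    family = binomial(),
    data = dat,
    model = FALSE,  # do not store the model frame
    y = FALSE       # do not store the response vector
  )
  coef(summary(fit))
}

# mclapply forks on Unix-alikes; on Windows it only supports one core.
cores <- if (.Platform$OS.type == "windows") 1L else 2L
outcomes <- c("outcome_a", "outcome_b")
results <- mclapply(outcomes, fit_one, mc.cores = cores)
names(results) <- outcomes
print(results$outcome_a)
```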

opensafely-core / r-docker

Can these three R packages be installed please? #76