Open hturner opened 1 year ago
We had a very good discussion with Uwe. Uwe and I are going to collaborate on implementing this feature. I will keep the issue updated. @bnaras can you put the notes you took during the meeting in a comment here?
installed.packages() takes a long time to execute the first time in a session when a large number of packages are installed in a library. (Subsequent invocations are fast because of caching.)
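A rough way to see the difference between a full scan and a cached call is sketched below. It assumes the base library (`.Library`) is the one of interest. `noCache = TRUE` is used only to force a fresh scan, so the timings are illustrative rather than a benchmark.

```r
# Assumption: we time the base library only; substitute a shared library path as needed.
lib <- .Library

# Force a full scan by bypassing the session cache.
t_scan <- system.time(ip <- installed.packages(lib.loc = lib, noCache = TRUE))

# Populate (or reuse) the session cache, then time a cached call.
invisible(installed.packages(lib.loc = lib))
t_cached <- system.time(ip_cached <- installed.packages(lib.loc = lib))

print(rbind(scan = t_scan[["elapsed"]], cached = t_cached[["elapsed"]]))
```

On a local disk with few packages the two timings may be close; the gap is expected to grow with the number of packages and with network latency.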
The issue is acute in settings where the library is shared via network-mounted drives, as is not uncommon for educational labs etc. On Windows installations, even with fewer than 100 packages, the function takes 2 seconds or more on a reasonably powerful machine, as Uwe demonstrated. This is also a problem for RStudio users because, upon startup, RStudio seeks to ascertain all installed packages, making it unusable in a networked shared-library setting.
The time it takes for installed.packages() is dominated by the time to read every DESCRIPTION file in all the installed packages.
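To get a feel for that cost, the following sketch reads a couple of fields from every DESCRIPTION file in one library and times the loop. It is an illustration of the per-package file reads described above, not a reproduction of what installed.packages() does internally. The choice of `.Library` and of the two fields is an assumption.

```r
# Assumption: inspect the base library; any library path would do.
lib <- .Library

# One DESCRIPTION file per installed package directory.
pkg_dirs   <- list.dirs(lib, full.names = TRUE, recursive = FALSE)
desc_files <- file.path(pkg_dirs, "DESCRIPTION")
desc_files <- desc_files[file.exists(desc_files)]

# Read a small set of fields from each file and stack the rows.
fields <- c("Package", "Version")
t_read <- system.time(
  info <- do.call(rbind, lapply(desc_files, read.dcf, fields = fields))
)

print(t_read[["elapsed"]])
print(head(info))
```

Each iteration opens and parses one file, so the elapsed time scales with the number of packages and with the latency of the underlying file system.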
Maintain an up-to-date database (we use the term loosely, for now) of installed packages so that the information is readily available for installed.packages() to exploit.
Desiderata are:
Ncpus > 1). The parallel installation process already calculates dependencies and puts the most important dependencies first.
PACKAGES.RDS that reflects what's actually installed.
As described in Uwe's talk in the kick-off session on Day 1: create a database for each library of installed R packages.
This can help to speed up functions that check which packages are installed.
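A minimal sketch of the per-library database idea follows. The helper names (`build_pkg_db`, `write_pkg_db`, `read_pkg_db`, `db_is_current`), the chosen fields, the file location, and the modification-time staleness check are all hypothetical and only illustrate the concept; none of them are part of R or of the design agreed in the discussion.

```r
# Hypothetical helper: collect selected DESCRIPTION fields for one library.
build_pkg_db <- function(lib, fields = c("Package", "Version", "Depends", "Imports")) {
  desc_files <- Sys.glob(file.path(lib, "*", "DESCRIPTION"))
  if (!length(desc_files)) {
    # Empty library: return a zero-row matrix with the expected columns.
    return(matrix(character(), nrow = 0, ncol = length(fields) + 1,
                  dimnames = list(NULL, c(fields, "LibPath"))))
  }
  db <- do.call(rbind, lapply(desc_files, read.dcf, fields = fields))
  cbind(db, LibPath = lib)
}

# Hypothetical helpers: persist the database and load it back with a single file read.
write_pkg_db <- function(lib, file) saveRDS(build_pkg_db(lib), file)
read_pkg_db  <- function(file) readRDS(file)

# Hypothetical staleness check: the database is treated as current if it is
# at least as new as the library directory itself.
db_is_current <- function(lib, file) {
  file.exists(file) && file.mtime(file) >= file.mtime(lib)
}

# Usage: the database is written to a temporary file here so that no write
# access to the library is needed.
lib     <- .Library
db_file <- tempfile(fileext = ".rds")
write_pkg_db(lib, db_file)
db <- read_pkg_db(db_file)
print(db_is_current(lib, db_file))
print(head(db[, c("Package", "Version")]))
```

Reading one serialized file replaces many small reads, which is where the expected saving on network-mounted libraries would come from. Keeping the file in sync with installs and removals is the hard part and is not addressed by this sketch.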