Currently, loading a module that isn’t already cached parses and evaluates all of its source files. This is potentially time-consuming, especially compared to loading a package: installed R packages aren’t loaded from source but from a lazy-load database.
‘box’ could maintain a secondary storage cache (unless disabled) that is queried before the source version of a module is loaded, unless the latter has a more recent timestamp. In that case, the cache would be invalidated, the source version loaded, and subsequently cached.
R doesn’t seem to provide a public API for generating lazy-load databases, but I don’t understand the purpose of lazy loading for exported names anyway. Using RDS with a custom serialisation hook for package/module dependencies seems easier.
Lastly, keeping modules cached also means we can finally implement byte-compilation of modules without a prohibitive overhead on loading.
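To make the proposed lookup concrete, here is a minimal sketch of a timestamp-checked RDS cache. It is not how ‘box’ loads modules: `load_cached` and its arguments are hypothetical names, the module is simply evaluated with `sys.source` into an environment whose parent is the base environment, and the open question of serialising package/module dependencies is ignored entirely.

```r
# Hypothetical helper, not part of 'box': return a module environment from an
# RDS cache when the cache is at least as recent as the source file;
# otherwise evaluate the source and refresh the cache.
load_cached = function (source_path, cache_dir) {
    # Assumption: one cache file per source file name (no collision handling).
    cache_file = file.path(cache_dir, paste0(basename(source_path), '.rds'))

    if (file.exists(cache_file) &&
        file.mtime(cache_file) >= file.mtime(source_path)) {
        return(readRDS(cache_file))
    }

    # Cache is missing or stale: load from source, then (re)write the cache.
    env = new.env(parent = baseenv())
    sys.source(source_path, envir = env)
    dir.create(cache_dir, recursive = TRUE, showWarnings = FALSE)
    saveRDS(env, cache_file)
    env
}
```

A second call with an unchanged source file is served by `readRDS` alone, so no source is parsed or evaluated. A real implementation would also have to restore references to imported packages and modules, which plain `saveRDS` does not handle in any special way.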
Some notes:
Cache path: box.cache (overridden by R_BOX_CACHE)
defaults to XDG_CACHE_HOME/R/%v/%p/box (placeholders as for R_LIBS_*) or equivalent
explicitly set to NULL to disable
Is a modification timestamp sufficient to establish cache validity or is a hash required?
Figure out how to customise serialisation of dependencies.
Terminology in API: term “cache” is now overloaded because we unfortunately already have the function purge_cache.
Cache module help as well?
What about integration of compiled native code?
Hook to run on “installation” of a module into the cache? (see #163)
Add exported functions to explicitly add/remove modules to/from the cache (e.g. install/uninstall)?
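On the question of timestamp versus hash raised in the notes, a content hash could be checked roughly as follows. This is only a sketch under assumptions: `record_hash` and `is_cache_valid` are hypothetical names, and storing the hash in a separate text file next to the cached module is just one possible layout. `tools::md5sum` ships with R, so no extra dependency is needed.

```r
# Hypothetical helpers, not part of 'box'.

# Store the MD5 hash of the source file alongside the cached module.
record_hash = function (source_path, hash_file) {
    writeLines(unname(tools::md5sum(source_path)), hash_file)
}

# The cache counts as valid only if the stored hash matches the current
# content of the source file, regardless of modification times.
is_cache_valid = function (source_path, hash_file) {
    if (! file.exists(hash_file)) return(FALSE)
    identical(readLines(hash_file), unname(tools::md5sum(source_path)))
}
```

Unlike a timestamp comparison, this stays valid when a file is touched or checked out again without changes, at the cost of reading the whole source file on every load.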