r-lib / pkgdepends

R Package Dependency Resolution
https://r-lib.github.io/pkgdepends/

WIP: Proof of concept implementation for local package caching #268

Open tzakharko opened 2 years ago

tzakharko commented 2 years ago

This is a proof of concept implementation for #261

The basic idea is to leverage the fact that the metadata of installed packages is already cached. So we don't need to do much: just inject some metadata describing the contents of the local repository (in this initial draft, the local path plus the file checksums) and do nothing if it matches the already installed package. It seems to work well enough in local tests, but I cannot rule out that I missed something crucial. The package logic is fairly complex and I have barely scratched the surface in trying to understand it.
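To make the comparison step concrete, here is a minimal sketch of the "do nothing if it matches" decision. It is written in Python purely for illustration and is not the code in this PR; the function name `needs_install` and the metadata keys `local_path` and `local_checksum` are hypothetical, not fields that pkgdepends defines.

```python
from typing import Optional


def needs_install(local_path: str,
                  local_checksum: str,
                  installed_meta: Optional[dict]) -> bool:
    """Return True when the local package must be (re)built and installed.

    `installed_meta` stands in for the cached metadata of the installed
    package; the keys used here are assumptions made for this sketch.
    """
    # Nothing installed yet: we have to install.
    if installed_meta is None:
        return True
    # Installed from somewhere else (or without our injected metadata).
    if installed_meta.get("local_path") != local_path:
        return True
    # Same source directory: reinstall only if the contents changed.
    return installed_meta.get("local_checksum") != local_checksum
```

An installed package that lacks the injected fields is treated as stale, so the sketch errs on the side of reinstalling rather than skipping.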

The checksumming currently relies on tools::md5sum() and simply concatenates the hashes of all files (just to show that it works), but we probably want to use digest here for proper sha256 (this would, however, mean a hard dependency on digest). Otherwise we need to find a way to combine the md5 hashes without introducing new package dependencies. Regarding performance: checksumming does introduce some overhead, but building a package is still much slower, so if your local packages do not change too often, the amortized savings over time can be substantial.
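One possible way to combine the per-file hashes without concatenating them is to hash a sorted manifest of (relative path, file hash) pairs, which gives a fixed-length digest that also changes when a file is renamed or moved. The sketch below shows that idea in Python for illustration only; it is not what the PR does, and `file_md5` and `tree_md5` are hypothetical names. The same scheme might be doable in R with tools::md5sum() alone, for example by writing the manifest to a temporary file and hashing that, though that is an assumption rather than something tested here.

```python
import hashlib
from pathlib import Path


def file_md5(path: Path) -> str:
    """MD5 of one file, read in chunks so large files are not loaded whole."""
    h = hashlib.md5()
    with path.open("rb") as fh:
        for chunk in iter(lambda: fh.read(1 << 20), b""):
            h.update(chunk)
    return h.hexdigest()


def tree_md5(root) -> str:
    """Single digest for a directory tree, built from per-file MD5 hashes."""
    root = Path(root)
    # Sort by relative path so the result does not depend on listing order.
    files = sorted(
        (p for p in root.rglob("*") if p.is_file()),
        key=lambda p: p.relative_to(root).as_posix(),
    )
    outer = hashlib.md5()
    for p in files:
        rel = p.relative_to(root).as_posix()
        # One manifest line per file: "<relative path>\0<md5>\n".
        outer.update(f"{rel}\0{file_md5(p)}\n".encode("utf-8"))
    return outer.hexdigest()
```

Calling `tree_md5("path/to/local/pkg")` returns a 32-character hex string that can be stored next to the local path and compared on the next run. Swapping `hashlib.md5` for `hashlib.sha256` changes only the digest length.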

Please tell me what you think and I'll do my best to transition it to something more production ready.

tzakharko commented 2 years ago

Not sure why the pipeline is failing; the logs contain some references to invalid GitHub credentials, etc. All tests pass on my local machine (macOS 12.1 with R 4.1).

tzakharko commented 2 years ago

@gaborcsardi Have you by any chance had time to look at the patch and see whether this approach is sufficient in principle?

gaborcsardi commented 2 years ago

I am sorry for the long wait. I am working on pkgdepends again, and will look at this soon.