ipfs-inactive / dev-team-enablement

[ARCHIVED] Dev Team Enablement Working Group

Cache node_modules for js projects #66

Open victorb opened 6 years ago

victorb commented 6 years ago

A lot of build time is spent installing node_modules. The packages should already be cached on the build machines themselves, so download time is not the problem; however, the modules still need to be built and moved into place, which is probably what takes the time.

We can make this faster by caching them on the master. The idea is to have a lightweight service that keeps a map from the sha1 of package.json/yarn.lock to an IPFS hash of a full node_modules directory. Clients/workers can then ask the service whether an entry exists and, if so, download the directory. If not, the build adds the directory to IPFS and pins it on the master for subsequent builds.
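A minimal sketch of the lookup side of this idea, not an actual implementation of the service. The names `manifest_digest` and `CacheIndex` are hypothetical, the map is an in-memory dict standing in for whatever the master would really run, and hashing package.json together with yarn.lock is one possible reading of "sha1 of package.json/yarn.lock".

```python
import hashlib
from pathlib import Path
from typing import Dict, Optional


def manifest_digest(project_dir: Path) -> str:
    """Return a SHA-1 hex digest over package.json and, if present, yarn.lock.

    Assumption: both files feed one digest, so a change in either
    file invalidates the cached node_modules directory.
    """
    sha1 = hashlib.sha1()
    for name in ("package.json", "yarn.lock"):
        manifest = project_dir / name
        if manifest.is_file():
            # Include the file name so the two files cannot be confused.
            sha1.update(name.encode("utf-8"))
            sha1.update(b"\0")
            sha1.update(manifest.read_bytes())
    return sha1.hexdigest()


class CacheIndex:
    """Hypothetical in-memory stand-in for the lookup service on the master."""

    def __init__(self) -> None:
        self._entries: Dict[str, str] = {}

    def lookup(self, digest: str) -> Optional[str]:
        """Return the IPFS hash recorded for this digest, or None on a miss."""
        return self._entries.get(digest)

    def register(self, digest: str, ipfs_hash: str) -> None:
        """Record the IPFS hash of a freshly built node_modules directory."""
        self._entries[digest] = ipfs_hash
```

A worker would call `lookup(manifest_digest(project_dir))` before installing, and call `register` only after a miss, once the built directory has been added to IPFS.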

From @travisperson

I think this is a neat idea. I took a look at some of the latest install times for js-ipfs (which probably has one of the larger dependency trees).

Numbers pulled from (https://ci.ipfs.team/blue/organizations/jenkins/IPFS%2Fjs-ipfs/detail/master/54/pipeline/18)

| Worker (OS - Node.js version) | Install time |
|---|---|
| linux - 8.9.1 | 34s |
| linux - 9.2.0 | 56s |
| macos - 8.9.1 | 63s |
| macos - 9.2.0 | 100s |
| windows - 8.9.1 | 64s |
| windows - 9.2.0 | 57s |

On my workstation, installing the modules with yarn takes ~30s. I added the folder to IPFS, removed the node_modules/ directory, and ran an ipfs get. The operation takes ~4s, so quite a bit faster.
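The comparison above could be repeated with something like the following sketch. It assumes the `ipfs` CLI is on the PATH with an initialized repository, and uses the `--recursive` and `--quieter` options of `ipfs add` and the `--output` option of `ipfs get`. The helper names are hypothetical, and the exact commands used in the original measurement are not stated in this thread.

```python
import subprocess
import time
from typing import Callable, List, Tuple, TypeVar

T = TypeVar("T")


def ipfs_add_command(directory: str) -> List[str]:
    """Command that adds a directory recursively and prints only the final hash."""
    return ["ipfs", "add", "--recursive", "--quieter", directory]


def ipfs_get_command(ipfs_hash: str, output: str) -> List[str]:
    """Command that downloads the given hash into the output path."""
    return ["ipfs", "get", "--output", output, ipfs_hash]


def run(command: List[str]) -> str:
    """Run a command, fail loudly on a non-zero exit, and return trimmed stdout."""
    completed = subprocess.run(
        command, check=True, capture_output=True, text=True
    )
    return completed.stdout.strip()


def timed(action: Callable[[], T]) -> Tuple[T, float]:
    """Run the action and return its result with the elapsed wall-clock seconds."""
    start = time.perf_counter()
    result = action()
    return result, time.perf_counter() - start
```

For example, `timed(lambda: run(ipfs_add_command("node_modules")))` would return the directory hash and the time taken to add it, and the same wrapper around `ipfs_get_command` would time the restore.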

Moved from https://github.com/ipfs/jenkins-libs/issues/7

victorb commented 5 years ago

Better idea: the master will only have a key-value store where the key is "os + nodejs version + sha1 of package.json" and the value is the hash of a tarball containing the full node_modules directory. Then we need three Groovy functions that can check whether a cache exists, store a cache, and fetch a cache. Each worker will run a go-ipfs node, so the cache will be local to each worker, with the workers connected to each other.
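As a rough illustration of the key layout and the three operations, here is a sketch in Python rather than Groovy. Everything named here (`cache_key`, `NodeModulesCache`, `restore_or_build`, the `-` separator in the key) is hypothetical, and the dict stands in for the key-value store on the master. Building the tarball and adding it to IPFS is left to a caller-supplied function.

```python
import hashlib
from typing import Callable, Dict, Optional, Tuple


def cache_key(os_name: str, node_version: str, package_json: bytes) -> str:
    """Compose the key from OS, Node.js version, and the SHA-1 of package.json."""
    digest = hashlib.sha1(package_json).hexdigest()
    return f"{os_name}-{node_version}-{digest}"


class NodeModulesCache:
    """Hypothetical stand-in for the key-value store kept on the master."""

    def __init__(self) -> None:
        self._store: Dict[str, str] = {}

    def has_cache(self, key: str) -> bool:
        """Check whether a tarball hash is recorded for this key."""
        return key in self._store

    def store_cache(self, key: str, tarball_hash: str) -> None:
        """Record the hash of a tarball containing node_modules."""
        self._store[key] = tarball_hash

    def fetch_cache(self, key: str) -> Optional[str]:
        """Return the recorded tarball hash, or None on a miss."""
        return self._store.get(key)


def restore_or_build(
    cache: NodeModulesCache,
    key: str,
    build_and_add: Callable[[], str],
) -> Tuple[str, bool]:
    """Return (tarball_hash, cache_hit).

    On a miss, build_and_add is expected to install the modules, create the
    tarball, add it to IPFS, and return the resulting hash.
    """
    cached = cache.fetch_cache(key)
    if cached is not None:
        return cached, True
    tarball_hash = build_and_add()
    cache.store_cache(key, tarball_hash)
    return tarball_hash, False
```

Including the OS and Node.js version in the key keeps natively compiled modules from being shared between incompatible workers, which a key based on package.json alone would not guarantee.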

This would be much better, especially for the macOS workers, as their connection to the master is not as fast as that of the Linux and Windows hosts. It would also put less load on the master.