A cloud-native file cache for Jenkins pipelines. The files are stored in an S3 bucket. The functionality is very similar to the one provided by GitHub Actions.
The primary goal is to provide a file cache for so-called hot agent nodes. Those nodes are started on demand when an execution is scheduled by Jenkins and killed after the execution is finished (e.g. by using the kubernetes-plugin or nomad-plugin). This is fine but also has some drawbacks, and some of them can be solved by having a file cache in place (e.g. to cache build dependencies, statistic data for code analysis, or whatever data you want to be present for the next build execution).
To install the plugin manually, go to Manage Jenkins -> Manage Plugins -> Advanced -> Upload Plugin.
For automated installations via `plugin.txt`, you can use an entry like the one below:

```
jenkins-pipeline-cache::https://github.com/j3t/jenkins-pipeline-cache-plugin/releases/download/0.2.0/jenkins-pipeline-cache-0.2.0.hpi
```
To configure the plugin, go to Manage Jenkins -> Configure System -> Cache Plugin and set the following parameters:

- Username (aka S3-Access-Key)
- Password (aka S3-Secret-Key)
- Bucket
- Region

Then click Test connection to verify that the settings work.
In S3, the plugin needs permission to read, write, list and (for the cleanup task) delete objects in the configured bucket.
Below you can find an example where the local maven repository of the spring-petclinic project is cached.
```groovy
node {
    git(url: 'https://github.com/spring-projects/spring-petclinic', branch: 'main')
    cache(path: "$HOME/.m2/repository", key: "petclinic-${hashFiles('**/pom.xml')}") {
        sh './mvnw package'
    }
}
```
The `path` parameter points to the local Maven repository, and the `key` parameter is the hash sum of all Maven POMs, prefixed by the project name and a dash.
The `hashFiles` method is optional but can be helpful to generate more precise keys. The idea is to collect all files which have an impact on the cache content and create a hash sum from them (e.g. `hashFiles('**/pom.xml')` creates one hash sum over all Maven POMs in the workspace).
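For instance, a hypothetical Gradle project could derive its key from the build scripts instead. This is only a sketch; the `gradle-` prefix, the pattern and the cached path are illustrative assumptions, not part of the original example:

```groovy
node {
    // hash all Gradle build scripts in the workspace and use the result in the key
    cache(path: "$HOME/.gradle/caches", key: "gradle-${hashFiles('**/*.gradle')}") {
        sh './gradlew build'
    }
}
```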
When the example job above is executed, the plugin first tries to restore the Maven repository from the cache by using the given `key`. Then the inner step is executed, and if it was successful and no cache exists for the `key` yet, the content of `path` is uploaded to the cache.
Below you can find a complete list of the `cache` step parameters:
Name | Required | Description | Default | Example |
---|---|---|---|---|
path | x | Path to the directory which should be cached (absolute or relative to the workspace). | | `$HOME/.m2/repository` - cache the local Maven repository |
key | x | Identifier which is assigned to the cache. | | `maven-4f98f59e877ecb84ff75ef0fab45bac5` |
restoreKeys | | Additional keys which are used when the cache gets restored. The plugin tries to resolve them in the defined order (`key` first, then the `restoreKeys`), and in case this was not successful, the latest cache with the same prefix gets restored. | | `['maven-', 'petclinic-']` - restore the latest cache whose key starts with `maven-` or `petclinic-` if the `key` does not exist |
includes | | Ant-style pattern applied to the `path` to filter the files which are included. | `**/*` - includes all files | `**/*.xml` or `**/*.xml,**/*.html` (see the Ant pattern documentation for more details) |
excludes | | Ant-style pattern applied to the `path` to filter the files which are excluded. | Excludes no files | see `includes` |
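Putting the optional parameters together, a job could look roughly like the sketch below. It is based on the parameters listed above; the restore-key prefix and the exclude pattern are only illustrative examples:

```groovy
node {
    git(url: 'https://github.com/spring-projects/spring-petclinic', branch: 'main')
    cache(path: "$HOME/.m2/repository",
          key: "petclinic-${hashFiles('**/pom.xml')}",
          // if no cache exists for the exact key, restore the latest cache
          // whose key starts with 'petclinic-'
          restoreKeys: ['petclinic-'],
          // cache everything below the repository ...
          includes: '**/*',
          // ... except locally installed snapshot artifacts (illustrative pattern)
          excludes: '**/*-SNAPSHOT/**') {
        sh './mvnw package'
    }
}
```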
Any S3-compatible storage provider should work. MinIO is supported first class, because all the integration tests are executed against MinIO. In order to use an alternative provider, you probably have to adjust the Endpoint parameter.
To do so, go to Manage Jenkins -> Configure System -> Cache Plugin, adjust the Endpoint parameter and click Test connection to verify the settings.
You can define a threshold in megabytes if you want to limit the total cache size. If the value is greater than 0, the plugin checks the threshold every hour and removes the least recently used items from the cache until the total cache size is smaller than the threshold again (LRU).
To configure it, go to Manage Jenkins -> Configure System -> Cache Plugin and adjust the Threshold parameter.

Anyone who can create or execute build jobs basically also has access to all caches. An 'attacker' just needs a way to execute the plugin and the key which is assigned to a particular cache. There is no list of all existing keys available, but the build logs contain them. The plugin guarantees that the same key is not created twice and that an existing key is not replaced, but it does not guarantee that a restored cache has not been manipulated by someone else who has access to the S3 bucket, for example.
As general advice, sensitive data, or data which cannot be restored from somewhere else or regenerated, should not be stored in caches. It should also not be a big deal, apart from a longer build, if a cache gets deleted (e.g. by accident, by the cleanup task, by a data crash etc.).
A few things which are good to know:

- The `hashFiles` step expects an Ant-style pattern relative to the workspace as parameter.
- The `includes`/`excludes` parameters must be Ant-style patterns relative to the `path` (see the sketch below).
- The `path` does not get cached if the `key` already exists or the inner-step has failed (e.g. unit-test failures).
- The plugin provides two pipeline steps: `cache` and `hashFiles`.
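To make the pattern rules above concrete, here is a minimal sketch; the package directory used in `excludes` is hypothetical:

```groovy
node {
    cache(path: "$HOME/.m2/repository",
          // hashFiles patterns are resolved relative to the workspace
          key: "deps-${hashFiles('**/pom.xml')}",
          // includes/excludes patterns are resolved relative to 'path',
          // i.e. relative to $HOME/.m2/repository in this case
          excludes: 'com/example/**') {
        sh './mvnw package'
    }
}
```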