Open darxriggs opened 2 years ago
I think the features in jenkins-pipeline-cache-plugin are more flexible than the ones in jobcacher-plugin:
key
and restoreKeys
) and hashFiles()
instead of a single file (cacheValidityDecidingFile
) and a branch (defaultBranch
) as cache fallbackBut jobcacher-plugin already has extension points that could be used to implement different caches. The question here is, if a user should be presented with so many options to choose from.
@darxriggs thanks for initiating this discussion. I am definitely interested into this. @j3t is there a reason why you created your own plugin instead of using the jobcacher plugin?
If I remember correctly, the jobcacher plugin was not maintained actively and it was not solving our problems (no MinIO support, no fallback cache, no support for maven multi Module projects). It is not intuitive, complicated and less flexible.
I see your points. We had some similar issues, that's the reason why I adopted the plugin.
Are you interested in merging the plugins and join the Jenkins community? If yes, I can try to make a proposal how the merged plugin could look like.
In general I think it's a good idea, but I'm also a little bit sceptical how this could look like. Lets give it try. Maybe we should sharpen the expectations/requirements first.
How would the following example look like with the jobcacher plugin after the migration?
node {
git (url: 'https://github.com/spring-projects/spring-petclinic', branch: 'main')
cache (path: "$HOME/.m2/repository", key: "petclinic-${hashFiles('**/pom.xml')}", restoreKeys: ['petclinic-', 'maven-']) {
sh './mvnw package'
}
}
I would expect only one of the following outcomes:
The test coverage should be at least on the same level as it is now.
I could imagine that we introduce your way of handling caching first as an alternative to the plugin users without changing the actual behavior of your cache step implementation. I think we should refactor your step so it uses the same item storage abstraction, as not all users have access to MinIO or AWS.
On the long run, I think having an behavior and API similar to the Github Action is better as a lot of users are already familiar with this. As Jenkins plugins are normally backward-compatible, we need to think about that. Maybe we can keep the old step API but redirect internally to the new implementation.
These are just some thoughts, so there is probably more. If you want, we can have chat about this and discuss this in a little more detail. What do you think (I am located in Switzerland btw.)?
I think we should refactor your step so it uses the same item storage abstraction, as not all users have access to MinIO or AWS.
Agree but the item storage is very generic and strict. As far as I know the archive must be created first and can be uploaded to S3 afterwards. The jenkins-pipeline-cache-plugin creates and uploads the archive in one single step (see BackupCallable.java). Also, when a cache gets stored or restored then the last access timestamp gets updated which is important when it comes to the eviction (see RestoreCallable.java).
As Jenkins plugins are normally backward-compatible, we need to think about that. Maybe we can keep the old step API but redirect internally to the new implementation.
I guess we could just extend the existing CacheStep.java so that it supports both Step APIs.
From the usage perspective the new cache implementation could look like this:
node {
git (url: 'https://github.com/spring-projects/spring-petclinic', branch: 'main')
cache (maxCacheSize: 250, caches: [[$class: 'FooCache', path: "$HOME/.m2/repository", key: "petclinic-${hashFiles('**/pom.xml')}", restoreKeys: ['petclinic-', 'maven-']]]) {
sh './mvnw package'
}
}
So the already existing ArbitraryFileCache
could be kept for backward compatibility.
It has to be decided, how the defaultBranch
option should be treated. It is optional, so not a very big problem for backward compatibility. But it might better be converted to an ArbitraryFileCache
option and not an overall option. If kept as is, in case only FooCache
is used as cache, a validation could be implemented to tell the user that the defaultBranch
option is not supported for this cache. This is just a rough sketch. The final API should be further improved.
With the possibility to have different caches, #9 (specific maven cache) could also be implemented.
When the assumption is that an API similar to the Github Action is the way to go anyway, then I would argue that maxCacheSize
and caches
should be optional as well. The plugin could then decide, depending on the provided configuration, which implementaion is used and it could also throw an error if the configuration is not valid. This way, we are still backward compatible and we also have a clean and user friendly API.
When the assumption is that an API similar to the Github Action is the way to go anyway, then I would argue that
maxCacheSize
andcaches
should be optional as well. The plugin could then decide, depending on the provided configuration, which implementaion is used and it could also throw an error if the configuration is not valid. This way, we are still backward compatible and we also have a clean and user friendly API.
Yes, I prefer this way too
I'd like to discuss the maximum cache size. We should decide if it should be configurable on the global Jenkins scope or per job.
It should take into consideration that Jenkins is used in small setups as well as big setups where individual persons might not have the permissions to configure all options.
I think there are two roles involved. An administrator that provides the cache and a developer that uses the cache. There is only a single cache storage available per Jenkins instance which is used for every job. The administrator has to provide the storage space (which might be "infinite" for certain cloud providers).
Therefore the administrator should be in control of the maximum available cache storage which means it has to be configured on the Jenkins scope. If it would be configurable per job, the administrator had no control of how much space would be used for the complete Jenkins instance and it could run out of space.
What's you opinion on this?
Another question: From the discussion so far my impression is that we want to go for the jobcacher
plugin and not the jenkins-pipeline-cache
plugin and migrate some functionality / API over there. For me this looks like the way to go because this plugin is already released, has some users and supports more job types. The downside is that we have to deal with the current API.
Is this what you have in mind too?
Regarding the cache size: I agree that an administrator has some interest to control the space caching requires, but I think there is a reason to configure it on a job level too, because if you do not restrict the size and you are increasing the cache size every build, the time needed to cache/restore will increase too.
And regarding the repo to use in the future, I would go with the jobcacher too.
I'd like to discuss the maximum cache size. We should decide if it should be configurable on the global Jenkins scope or per job.
We have to introduce a global configuration parameter for the jenkins-pipeline-cache
anyway, so I guess this parameter could also be used by the existing job types to implement a global limit (which is defined by the admin). The limit on the job level, must also be respected (backward-compatibility) but it could be optional.
From the discussion so far my impression is that we want to go for the jobcacher plugin and not the jenkins-pipeline-cache plugin and migrate some functionality / API over there. For me this looks like the way to go because this plugin is already released, has some users and supports more job types. The downside is that we have to deal with the current API.
In my opinion we have to be backward-compatible so that both user groups are happy, which is the case when we make maxCacheSize
and caches
optional and let the plugin decide. What is you opinion about my my proposal (see https://github.com/j3t/jenkins-pipeline-cache-plugin/issues/10#issuecomment-1171609106)?
And regarding the repo to use in the future, I would go with the jobcacher too.
Fully agree
@j3t Yes, we can try go this way, but we have to see whether this can be done in a clean way. If not, we could introduce a completely new step/wrapper (e.g. arbitraryFileCache(...)
) and reuse some parts of this in the old implementation. This would allow us to have a clear cut between the old and the new world and I think if we would like to introduce the Maven cache, we would like to use something like mavenCache(...)
anyway.
There is another caching plugin at https://github.com/jenkinsci/jobcacher-plugin which was created some years ago and got a new maintainer. It currently performs caching differently than this plugin here but for example also supports Freestyle jobs and storage on the Jenkins controller.
Maybe both plugins and functionality could be merged.
What do you think @j3t and @repolevedavaj?