jenkinsci / plugin-modernizer-tool

MIT License

Caching Metadata Retrieval for Repeated Runs #208

Open gounthar opened 2 months ago

gounthar commented 2 months ago

What feature do you want to see added?

Current Behavior

When launching the same command to fetch metadata for a set of plugins, the tool reports "Metadata is not yet computed for plugin git." even on the second execution.

Desired Behavior

I would like the tool to retrieve the metadata from a local file when running the same command repeatedly on the same machine for the same list of plugins.

Proposed Solution

Implement a caching mechanism for the metadata retrieval process to avoid redundant computation.

Benefits of Caching Metadata

  1. Improved Efficiency: Subsequent runs of the same command will be faster, as the metadata will be loaded from a local cache instead of being recomputed.
  2. Reduced Load on Remote Resources: Caching the metadata locally will minimize the need to fetch it from remote sources, reducing the load on external systems.
  3. Better User Experience: Users will not have to wait for the metadata to be computed again, making the tool more responsive and user-friendly.

Implementation Considerations

  1. Cache Storage: Determine the most appropriate location and format for storing the cached metadata (e.g., local file, in-memory cache).
  2. Cache Invalidation: Implement a strategy to invalidate the cache when necessary, such as when the plugin list changes or after a certain time period.
  3. Cache Retrieval: Modify the metadata retrieval logic to first check the local cache and only fetch from remote sources if the data is not available locally.
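The three considerations above can be sketched together as a small file-backed cache. This is only an illustration, not the tool's actual implementation: the class name `MetadataFileCache`, the one-JSON-file-per-plugin layout, and the use of the file's last-modified time for expiry are all assumptions made for the example.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.time.Duration;
import java.time.Instant;
import java.util.Optional;
import java.util.function.Function;

/** Hypothetical file-backed metadata cache (illustrative names and layout). */
class MetadataFileCache {
    private final Path cacheDir;
    private final Duration ttl;

    MetadataFileCache(Path cacheDir, Duration ttl) {
        this.cacheDir = cacheDir;
        this.ttl = ttl;
    }

    /** Cache storage: one file per plugin, e.g. <cacheDir>/git.json (assumed layout). */
    private Path fileFor(String pluginName) {
        return cacheDir.resolve(pluginName + ".json");
    }

    /** Cache invalidation: an entry older than the TTL is deleted and treated as missing. */
    Optional<String> get(String pluginName) throws IOException {
        Path file = fileFor(pluginName);
        if (!Files.exists(file)) {
            return Optional.empty();
        }
        Instant modified = Files.getLastModifiedTime(file).toInstant();
        if (modified.plus(ttl).isBefore(Instant.now())) {
            Files.delete(file);
            return Optional.empty();
        }
        return Optional.of(Files.readString(file, StandardCharsets.UTF_8));
    }

    /** Stores the serialized metadata, creating the cache directory if needed. */
    void put(String pluginName, String json) throws IOException {
        Files.createDirectories(cacheDir);
        Files.writeString(fileFor(pluginName), json, StandardCharsets.UTF_8);
    }

    /** Cache retrieval: check the local file first and compute only on a miss. */
    String getOrCompute(String pluginName, Function<String, String> compute) throws IOException {
        Optional<String> cached = get(pluginName);
        if (cached.isPresent()) {
            return cached.get();
        }
        String fresh = compute.apply(pluginName);
        put(pluginName, fresh);
        return fresh;
    }
}
```

With something like this, a second run for plugin `git` would read `git.json` instead of recomputing, as long as the file is younger than the configured TTL. The sketch assumes plugin names are safe to use as file names; a real implementation would need to validate or escape them.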

Next Steps

  1. Design and implement the caching mechanism for the metadata retrieval process.
  2. Test the new caching functionality thoroughly to ensure it works as expected.
  3. Document the caching behavior and any user-facing implications.

Your feedback and suggestions on this proposed improvement would be greatly appreciated.

Upstream changes

No response

Are you interested in contributing this feature?

No response

jonesbusy commented 2 months ago

Yes, that's the expected behavior for now. There are some TODOs in the code:

// TODO: For now it's always null because we don't persist nor cache metadata
if (metadata == null) {
    LOG.info("Metadata is not yet computed for plugin {}. Using minimum JDK available", plugin.getName());
    jdk = JDK.min();
} else {
    jdk = plugin.getJDK();
}

The cache also has a hardcoded expiration of 1h, which is way too short for most objects.

This will most likely also need to be fixed when we store those files in a remote location.
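One way to move away from a hardcoded expiration is to make it configurable with a sensible default. The sketch below is an assumption for illustration, not code from the tool: the `CacheTtl` helper and the idea of passing the value as an ISO-8601 duration string (such as `PT1H` or `P7D`) are both hypothetical.

```java
import java.time.Duration;
import java.time.format.DateTimeParseException;

/** Hypothetical helper that resolves a cache expiration from user configuration. */
final class CacheTtl {
    private CacheTtl() {}

    /**
     * Parses an ISO-8601 duration such as "PT1H" or "P7D".
     * Falls back to the given default when the value is missing,
     * malformed, zero, or negative.
     */
    static Duration resolve(String raw, Duration fallback) {
        if (raw == null || raw.isBlank()) {
            return fallback;
        }
        try {
            Duration parsed = Duration.parse(raw.trim());
            return (parsed.isZero() || parsed.isNegative()) ? fallback : parsed;
        } catch (DateTimeParseException e) {
            return fallback;
        }
    }
}
```

A caller could then write, for example, `CacheTtl.resolve(System.getenv("METADATA_CACHE_TTL"), Duration.ofHours(1))`, where the environment variable name is made up for this example.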

jonesbusy commented 2 months ago

Implemented for local cache.

Discussion opened for a repository on jenkins-infra: https://github.com/jenkins-infra/helpdesk/issues/4262

jonesbusy commented 4 weeks ago

Another idea would be to use OCI storage.

I started the development of https://github.com/jonesbusy/oras-java which should be moved to https://github.com/oras-project during October

Nowadays many tools use OCI storage to store images, manifests, and other artifacts (Helm, Flux CD, etc.).

Even updatecli is doing it: https://www.updatecli.io/docs/commands/updatecli_manifest_push/

We could easily store such metadata in the GitHub Packages registry of this repository.

gounthar commented 3 weeks ago

Great idea!