PowerShell / DSC

This repo is for the DSC v3 project
MIT License

Define contractual API for caching #544

Open michaeltlombardi opened 3 weeks ago

michaeltlombardi commented 3 weeks ago

Summary of the new feature / enhancement

As a resource author, I want to be able to rely on a contract between my resource and DSC for any caching my resource should do for performance, so my resources can participate in the lifecycle of DSC invocations reliably and transparently.

As an infrastructure engineer, I want to be able to schedule DSC to perform caching operations on a regular cadence before applying my configurations to reduce their processing time.

Right now, the PowerShell adapter resource performs caching to speed up DSC invocations, but it's a behavior of the adapter, not DSC itself (as far as I can tell). Users can't run a dsc cache or dsc resource cache command to explicitly invoke caching behaviors for resources that support it. Nor do users have any way to find out which resources on their system might be performing caching, except to investigate every resource manually.

Caching is also important for higher order tools, which may want to incorporate DSC caching into the rest of their workflow for reporting and validation and for performance reasons.

Proposed technical implementation details (optional)

I propose that we initially:

  1. Define a cache command/capability for resources, reusing the structure of other resource commands. Initially, this could be parameterless.
  2. Define a cache step in the DSC command lifecycle, possibly after discovering all resources but before enumerating adapted resources.
  3. Add a cache command to DSC, allowing users to explicitly invoke the caching resources. In the first iteration, I think it's okay if the cache command calls every resource with that capability in discovery order.
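As a rough illustration of that first iteration, here is a minimal Python sketch of a cache step that calls every resource advertising the capability, in discovery order. Everything in it is an assumption for discussion: the Resource type, the "cache" capability string, and the cache_command callable are hypothetical stand-ins, not DSC's actual types or API.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional, Tuple


@dataclass
class Resource:
    """Hypothetical stand-in for a resource found during discovery."""
    type_name: str
    capabilities: Tuple[str, ...]
    # Hypothetical: a parameterless callable standing in for the
    # resource's cache command.
    cache_command: Optional[Callable[[], None]] = None


def run_cache(discovered: List[Resource]) -> List[str]:
    """Invoke the cache command on each capable resource, in discovery order.

    Returns the type names of the resources that were asked to cache.
    """
    cached = []
    for resource in discovered:
        if "cache" in resource.capabilities and resource.cache_command is not None:
            resource.cache_command()
            cached.append(resource.type_name)
    return cached
```

Because the command is parameterless in this sketch, the caller needs no per-resource knowledge; passing options would be a later extension.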

After this first iteration, it's probably worth considering whether and how to pass options to the caching resources. I considered a new kind of resource and a group resource for caching, but that makes the behavior configuration-dependent, whereas I think it makes sense to invoke caching for other operations too, regardless of whether a document specifies caching resources.

I think it might still be doable to define caching-specific resources, but it would require any resource using caching to define two resources - one for the actual resource and another for the caching options (which would be global, rather than per-instance).

Once we have caching as part of the API, this could enable users to pass an option to always check the cache each run or skip cache checking, which could be useful when repeatedly invoking DSC interactively during investigations or testing.

SteveL-MSFT commented 2 weeks ago

For now, we can define a way in the manifest to advertise caching.

SteveL-MSFT commented 2 weeks ago

Thinking about this, if the goal is to pre-populate the cache, I think just running dsc resource list '*' --adapter '*' forces discovery and should pre-populate any caches.

michaeltlombardi commented 1 week ago

That would work for adapter resources, but not for other resources that may benefit from a cache, like pre-enumerating IIS sites. It also wouldn't give users or higher order tools any insights into whether/how the resource is creating or using caches, or the ability to force a clear of the cache.

Right now, the only resources that use caching are adapter resources, but there are plenty of cases where parsing every possible instance of a resource or retrieving information to use across instances in a run would give performance benefits for long, complicated configuration documents or process-intensive resources. This is especially true for APIs that require you to enumerate their full list, which I encountered several times at Puppet.
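To make the enumerate-once case concrete, here is a small Python sketch of a resource-side cache that lists everything in a single call and then answers per-instance lookups from memory. The InstanceCache class, the enumerate_all callable, and the "name" key are all hypothetical; a real resource would persist the cache somewhere DSC and users can find it.

```python
from typing import Callable, Dict, List, Optional


class InstanceCache:
    """Hypothetical cache for an API that can only list every instance at once."""

    def __init__(self, enumerate_all: Callable[[], List[dict]]):
        # enumerate_all stands in for the expensive "list everything" call.
        self._enumerate_all = enumerate_all
        self._by_name: Optional[Dict[str, dict]] = None

    def populate(self) -> None:
        """Run the expensive enumeration once and index the results by name."""
        self._by_name = {item["name"]: item for item in self._enumerate_all()}

    def clear(self) -> None:
        """Drop cached data so the next lookup enumerates again."""
        self._by_name = None

    def get(self, name: str) -> Optional[dict]:
        """Return one instance, populating the cache on first use."""
        if self._by_name is None:
            self.populate()
        return self._by_name.get(name)
```

With this shape, a configuration containing many instances of the same resource pays for the enumeration once per run, and an explicit clear gives users the forced refresh mentioned above.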

I definitely think we want, at a minimum, a way for a resource to indicate that it is using a cache (and where the cache gets stored) for the sake of users and integrating developers.
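As a sketch of what that minimum could look like to a user or integrating tool, the Python below reads an assumed "cache" section from already-parsed resource manifests and reports where each resource stores its cache. The "cache" section and its "path" key are hypothetical and not part of the current manifest schema; only the "type" field is taken from existing manifests.

```python
from typing import Dict, List


def cache_locations(manifests: List[dict]) -> Dict[str, str]:
    """Map each caching resource's type name to its advertised cache location.

    Resources whose manifest has no (hypothetical) "cache" section are omitted.
    """
    report = {}
    for manifest in manifests:
        cache = manifest.get("cache")  # hypothetical manifest section
        if cache:
            # "path" is also hypothetical; fall back when it is not declared.
            report[manifest["type"]] = cache.get("path", "<unspecified>")
    return report
```

A report like this would answer which resources cache and where, without anyone having to inspect each resource manually.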