heroku / libcnb.rs

A framework for writing Cloud Native Buildpacks in Rust
BSD 3-Clause "New" or "Revised" License
35 stars 6 forks source link

Struct Layer API #814

Closed Malax closed 3 months ago

Malax commented 6 months ago

This PR adds a new API for working with CNB layers, attempting to solve the issues with the current trait based API. The current issues are mainly around flexibility and explicitness.

This PR does not remove the existing trait API. This allows buildpack maintainers to slowly migrate to the new API, even on a layer-by-layer basis. Eventually, the old trait API will be removed. For now, the trait API is just marked as deprecated.

Trait API problems

The current API encourages, and in some cases forces, buildpack implementors to put all code related to a layer into the Layer trait implementation. As the code will be called from libcnb itself, the buildpack has no control and/or insight when the functions will be called. This is especially a problem for logging.

Passing data into and from a Layer trait implementation is hard to do correctly and ergonomically. Before #669, it was impossible to use mutable values originating from outside the layer implementation because the trait was too restrictive. Coupled with the fact that most buildpack implementations chose to put business logic in Layer implementations made this restriction even worse.

The trait default implementation of existing_layer_strategy is a source of great confusion because the default is not an universally good default. More context can be found in #371.

New API philosophy

The new API only abstracts the minimum functionality and leaves code structure for dealing with the business logic up to the implementor. All a buildpack now does is asking BuildContext for a layer, providing two callbacks for cache invalidation and invalid metadata handling:

let layer_ref = context.cached_layer(
    layer_name!("test"),
    CachedLayerDefinition {
        build: true,
        launch: true,
        invalid_metadata: &|_| InvalidMetadataAction::DeleteLayer,
        inspect_restored: &|_: &GenericMetadata, _| InspectRestoredAction::KeepLayer,
    },
)?;

Afterwards, the LayerRef returned by cached_layer (there is also an even simpler uncached_layer) contains the path to the layer and detailed information about the layer state. The layer state can signal that the layer was restored from cache, that the layer is empty because the cache was invalidated by the inspect_restored callback, that the layer is empty because it didn't exist, etc. Both invalid_metadata and inspect_restored can also return a cause value that is available in the layer state. This is useful if the cache invalidation logic has multiple cases for invalidation and the buildpack needs to know why the cache was invalidated. This is specially useful for user output.

Fallible callback implementations are also possible by wrapping the return value of the callback in a Result with the buildpack error type as the Result's E. In fact, any type is possible to be used as a callback return value as long as it implements IntoAction. Most buildpacks will never have to do this though.

Since the callbacks are implemented as function references with a lifetime, it is possible to access values that are valid for the lifetime of the definition and to use regular functions that can be unit tested without the baggage of the actual layer handling:

fn inspect_restored_complex(metadata: &GenericMetadata, path: &Path) -> InspectRestoredAction {
    unimplemented!()
}

context.cached_layer(
    layer_name!("test"),
    CachedLayerDefinition {
        build: true,
        launch: true,
        invalid_metadata: &|_| InvalidMetadataAction::DeleteLayer,
        inspect_restored: &inspect_restored_complex,
    },
)?;

Since everything that comes after the cached_layer call is up to the buildpack author, the appropriate abstraction can be chosen on a case-by-case basis. A quick and dirty inline layer in main.rs:

let layer_ref = context.cached_layer(
    layer_name!("test"),
    CachedLayerDefinition {
        build: true,
        launch: true,
        invalid_metadata: &|_| InvalidMetadataAction::DeleteLayer,
        inspect_restored: &inspect_restored_complex,
    },
)?;

match layer_ref.state {
    LayerState::Restored { .. } => {
        // Work with existing data
    }
    LayerState::Empty { .. } => {
        // Populate the layer
    }
}

or a more complex layer with lots of associated business logic in its own module:

let jvm_layer_ref = context.cached_layer(
    layer_name!("jvm"),
    CachedLayerDefinition {
        build: true,
        launch: true,
        invalid_metadata: &|_| InvalidMetadataAction::DeleteLayer,
        inspect_restored: &crate::layers::jvm::inspect_restored_jvm::layer,
    },
)?;

crate::layers::jvm::handle_jvm_layer(&jvm_layer_ref, &openjdk_inventory, &mut logger)?;

Since LayerState is an easily pattern matched value, buildpack authors can freely choose which layer states to combine or even care about. None of that flexibility is achieved by libcnb.rs specific mechanisms - use your entire Rust toolbox and go wild!

The PR has more examples how this new API can be used in RustDocs, example and test buildpacks.

Refactor of the trait API implementation

This PR also refactors the implementation of the trait API. No changes to the actual API has been made though. The refactor was necessary to allow for code sharing between both APIs as the low-level operations on CNB layers are the same. As the code for those low-level operations has been thoroughly tested, it made sense to reuse it for the new API implementation as well.

Breaking Changes

This PR is marked as a breaking change. Even though the old trait API wasn't removed, the layer errors needed to be restructured to allow code-sharing between the old and new API. For the majority of the buildpack implementations, this will not be a breaking change in practice. The JVM buildpacks built cleanly with the code in this PR without any modifications.

Yet, as this is technically a breaking change, the PR is marked as such and the changelog contains a section of the breakage.

Follow ups

Review notes

This is a really dense (packed) PR. The code for the new API is highly generic and small details can make or break the usability of this API. It might not be obvious by just reading the implementation in a web browser how the pieces fit together and how the API is used. However, I am absolutely certain that it is easy to use.

Please take the time and look at the usage examples in BuildContext::cached_layer to get a feel for the API. I'd also suggest using this branch for an existing buildpack and port a layer over. This can be achieved by "patching" in Cargo.toml:

[patch.crates-io]
libcnb = { git = "https://github.com/heroku/libcnb.rs.git", branch = "malax/layer-api" }
libcnb-data = { git = "https://github.com/heroku/libcnb.rs.git", branch = "malax/layer-api" }

Should you have trouble getting started with the new API, please let me know - this is the most important feedback as I can myself no longer see this new API with fresh eyes!