buildpacks / libcnb

A non-opinionated language binding for the Cloud Native Buildpack Buildpack and Extension specifications
Apache License 2.0
31 stars 13 forks source link

Adds layer.Reset function #141

Closed ForestEckhardt closed 2 years ago

ForestEckhardt commented 2 years ago

The layer.Reset() function is some functionality that exists in (packit/v2 library)[https://github.com/paketo-buildpacks/packit/blob/7b9a5b045761a8d224580cdb904e5464fa0e8f8b/layer.go#L95-L117]. It allows a buildpack author to ensure that there is no data left over from a cached layer. I felt that it would be appropriate to add this functionality over here as well for authors who want to guarantee that their layers are empty.

Signed-off-by: Forest Eckhardt feckhardt@pivotal.io

dmikusa commented 2 years ago

Generally 👍 on this.

Two questions.

  1. Do you think it would be worthwhile to have either an argument or a similar function that only clears out the contents of the layer? i.e. just remove/recreate the layer directory.

    I'd have to dig deeper to confirm, but I think there are cases in libpak where we are doing something similar.

  2. I'm wondering about what the consumption of layer.Reset() looks like in packit. What situations do you find yourself using this? Do you generally have a layer you're actively working with and then want to reset it? or are you creating a new layer & resetting it?

    Right now libcnb has layers.Layer(..) which creates a new layer that creates a layer and if it exists loads metadata. I could perhaps see expanding on that to allow creating an empty layer, or creating a layer and wiping any existing metadata/files.

    My initial thought was what is the difference between creating a new layer and resetting a layer, and I don't think much except the context in which that's being used by library consumers. If buildpack authors are having to create a new layer and reset it immediately, that's not great. If on the other hand the pattern is to create a layer, use it a bit, then perhaps reset it, I could see the reset method being handy. Or perhaps we could have both? Thoughts?

ForestEckhardt commented 2 years ago

So the general workflow that we have set up in most of our buildpacks that supply dependencies is that we will load a layer with layers.Get() we will then look at the metadata and see if we should reuse the layer. If the layer needs to be rebuilt because the cached layer does not line up we will then call layer.Reset() to clear the layer and make sure it is clean for the new dependency. A good example of this is go-dist build.go.

There are cases where we just want to load a layer and use it without doing any cache evaluation and then just add information to it. Typically we would see that type of layer for something like a build cache or a module cache. A good example is in go-build build.go. Here we create a goCacheLayer that we use to store all of the go build cache which we use to make rebuilds faster.

The real purpose for layer.Reset() is to ensure that your rebuilds are not dirtied by previous builds when they shouldn't be. It makes the author have to be more intentional about keeping their image clean and gives them the ability to ensure cleanliness.

dmikusa commented 2 years ago

Gotcha. Makes sense. I think this is probably a new pattern in libcnb. Previously, with the layer contributors wrapped handled some of these decisions. Thanks

samj1912 commented 2 years ago

@ForestEckhardt I have a couple of qs. I think the current API in libcnb is a bit of a mistake where we allow the execd field to be public writeable. It should ideally be read only. Given the above, we should not be changing that during reset and ideally also fixing this in v2 in a subsequent PR.

Could you make it so that reset doesn't change the execd path?

samj1912 commented 2 years ago

I am curious what y'all think about an API like this -> https://samj1912.github.io/python-libcnb/api/#libcnb._layers.Layer.load

Do we also need an equivalent load function in libcnb as well? (Maybe in a subsequent PR?)

ForestEckhardt commented 2 years ago

I am curious what y'all think about an API like this -> https://samj1912.github.io/python-libcnb/api/#libcnb._layers.Layer.load

Do we also need an equivalent load function in libcnb as well? (Maybe in a subsequent PR?)

I think that this is what layers.Layer does. It does however fail to load the previous environment variables which is something we should address.

samj1912 commented 2 years ago

I am curious what y'all think about an API like this -> samj1912.github.io/python-libcnb/api/#libcnb._layers.Layer.load Do we also need an equivalent load function in libcnb as well? (Maybe in a subsequent PR?)

I think that this is what layers.Layer does. It does however fail to load the previous environment variables which is something we should address.

Do we also want to load profile apart from env vars?