Create `docker_environment` from `docker_image`

ShantanuKumar commented 1 year ago

Is your feature request related to a problem? Please describe. I would like to use a custom docker image generated using docker_image for creating a docker_environment.

Describe the solution you'd like In the docker_environment set name =

Describe alternatives you've considered I didn't know how it would work. But as mentioned by @stuhood on slack

you’d currently have to actually build the image once, and then publish it somewhere

Additional context Doc slack thread

chris-stetter commented 6 months ago

This would be a really helpful feature. Ideally, with an option for specifying target_stage.

sureshjoshi commented 1 month ago

I believe I've got a proof-of-concept on this after a ton of cyclic dependencies, and then cyclic rule graphs errors 😆

As per Stu's recommendation in Slack:

not as part of a build step, no: you’d currently have to actually build the image once, and then publish it somewhere

I kinda took the approach of just doing this. Allow docker_environment to take a dependency on a docker_image target as the data source, and before docker_image can be used as an environment, the associated docker_image is built and used.

Before I can PR:

Are there any huge piles I've just stepped into? I'm hoping not as this isn't a transitive dep from the environment.

The quirk I've run into is needing to "escape" the docker_environment while building the underlying docker_image to break out of a cyclic dependency (I use __local__ in my demo, but maybe there can be more possible build locations).

What would be a decent API as a consumer?

The "easy" answer is using the Dependencies field we have everywhere else, but that takes in a list - so not sure if that makes semantic sense (other than allowing multiple images to be pre-built).

In this case, the image field would either need to explicitly be specified by name, or we would update it at runtime (but then renders the "list of dependencies" confusing).

docker_image(
  name="my-docker-image".
  instructions="..."
)

docker_environment(
  name="my-docker-env",
  dependencies=[":my-docker-image", ...], # <-- There can technically be multiple, and we pre-build all of them - I think we'd want to enforce these are explicitly `docker_image`s
  image="my-docker-image", # <-- this gets replaced with the SHA256 at runtime? That might still do a pull from the registry, need to look into it and `bollard` - needs some thought on implementation still
)

Alternative, overload the image field to accept Pants targets - so if we identify a Pants target, we would build that first, and then replace the image with the generated image name.

docker_image(
  name="my-docker-image".
  instructions="..."
)

docker_environment(
  name="my-docker-env",
  image=":my-docker-image", # <-- This is built and replaced at runtime - this feels the most Pantsish. Could be a problem if Docker images allow a colon or slash prefix in their labels, need to check tonight 
)

Lastly, we could add a new field like dependsOn or builtFrom or something - but adding fields is never my preferred option (like, it hurts my soul).

sureshjoshi commented 1 month ago

@tdyas @stuhood @kaos ^^

CC'ing you all, as you've all got more context on both the Docker backend and Environments

tdyas commented 1 month ago

Before I can PR: Are there any huge piles I've just stepped into? I'm hoping not as this isn't a transitive dep from the environment.

I would need to see the draft PR before I could opine on what piles have actually been stepped in. (E.g., I don't have enough context to guess at what the cyclical dependencies were.)

kaos commented 1 month ago

I've not much to say about the cyclic dependencies, other than GLHF ;) I think, build any images found in dependencies (but don't restrict dependencies to only be allowed to be docker_image targets..) and use the first one built that way as the environment image unless image is provided as something else. (i.e. leave image as None to use the just built docker_image from dependencies, and treat any provided image value as today without fiddling with it.) my $.02

sureshjoshi commented 1 month ago

build any images found in dependencies (but don't restrict dependencies to only be allowed to be docker_image targets

What would be the purpose of allowing non-docker deps if we wouldn't use them in the environment? I think if we ever did want to use them, we could remove the limitation 🤷🏽 My API usage assumption is that any dependency in an environment MUST end up available in that environment - which wouldn't be the case here

i.e. leave image as None to use the just built docker_image from dependencies, and treat any provided image value as today without fiddling with it.

Making one dependency special by order, I think, is a bit risky to me - might end up with some really weird behaviour. I also don't think it's a known convention everywhere in Pants.

Building on your ideas though, what about:

Build all docker_images found in dependencies
If there is exactly one dep, use it in image (e.g. image can be None)
If there is more than one dep, image must be specified (e.g. image cannot be None)
If there is more than one dep, and image is not specified, fail with a warning

The place where I could see this get weird is like, what if a user does this:

docker_image(
  name="my-docker-image"
)

docker_environment(
  name="my-docker-environment",
  dependencies=[":my-docker-image", ...],
  image="nginx:latest"
)

Nothing wrong with this technically, but to me, this feels like a mistake, no?

The more I think about it, this makes the most sense from the API and keeps everything obvious/explicit, but it's not as extensible technically (but, like, should it be? Do we have a use case for hijacking a lazily instantiated docker_environment to build a list of dependencies? Shouldn't this be what docker_image dependencies is for?)

docker_environment(
  name="my-docker-environment",
  image="//:my-docker-image"
)

One benefit of this way is that, it seems like something we'd want to have eventually anyways (as per my above case, of specifying a single item in a list of dependencies field).

In fact, the ONLY problem I can see is if //: or : are valid docker image prefixes AND that they're used as a convention in the docker community. I've definitely never seen an image with that name 😄

kaos commented 1 month ago

Requiring a target address be rooted with // makes sense to me (I agree, I don't think this would ever be a valid docker image name). Regarding the dependencies, it could be for invalidation or other things we've not considered here.. but keeping it simple, and put the target address into the image field feels more obvious/explicit 👍🏽

pantsbuild / pants

Create `docker_environment` from `docker_image` #17714