Chuxel opened this issue 2 years ago
Another possibility here would be to allow labels to set devcontainer.json properties as a general mechanism, and then specific embedded models for scenarios that warrant it or support it. This would also allow encoding of this information in images to improve distribution of pre-built images. When the dev container CLI is used to build the image, these labels would be added automatically, but we can support them as straight labels as well (whether in a Dockerfile or an orchestrator format).
The json-based nature of devcontainer.json would make this fairly straightforward. Common and less complex properties could be referenced directly. A modified array syntax could be supported that does not require quoting (if there's no comma in a value) to make the use of some of them less complicated.
```dockerfile
LABEL com.microsoft.devcontainer.userEnvProbe="loginInteractiveShell"
LABEL com.microsoft.devcontainer.vscode.extensions="[ms-python.python,ms-toolsai.jupyter]"
```
More complex `any`-type properties could then be encoded as JSON. You see this commonly in a number of places, like images generated via the `pack` CLI for Buildpacks.
```dockerfile
LABEL com.microsoft.devcontainer.vscode.settings="{\"some.setting\": \"some-value\"}"
```
In the common case, these labels would be automatically added by the dev container CLI to the image when it is pre-built (`devcontainer build ...`), but manual entry would enable these additions to be embedded in orchestrator formats as well.
```yaml
services:
  app:
    build:
      context: .
      dockerfile: Dockerfile
    labels:
      - "com.microsoft.devcontainer.userEnvProbe=loginInteractiveShell"
      - "com.microsoft.devcontainer.vscode.extensions=[ms-python.python,ms-toolsai.jupyter]"
```
Any tool that supports the dev container spec would then look for these image labels - regardless of whether they are on a pre-built image or one built by an orchestrator. The reference implementation would then illustrate how to make this happen.
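For illustration, here is a minimal sketch of how any tool could read these labels back off an image with plain `docker inspect` (the image name is a placeholder; the label keys follow the examples above):

```bash
# Read a single dev container label from a pre-built (or orchestrator-built) image.
docker inspect --format '{{ index .Config.Labels "com.microsoft.devcontainer.userEnvProbe" }}' my-prebuilt-image

# Or dump all labels and filter on the proposed prefix with jq.
docker inspect --format '{{ json .Config.Labels }}' my-prebuilt-image \
  | jq 'with_entries(select(.key | startswith("com.microsoft.devcontainer.")))'
```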
We should be able to make this work with any general, lifecycle, or tool-specific property.
For dev container features, however, I'd also propose an additional property, added to images only, that indicates whether the build step has already been completed for the image.
```dockerfile
LABEL com.microsoft.devcontainer.features.built="[docker-in-docker,github-cli]"
```
The features metadata label can then continue to be present to provide visibility into what was already applied.
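As a rough sketch (assuming the proposed label name above; the image name and feature id are placeholders), a build step could consult this label to decide whether a feature still needs to be applied:

```bash
# Sketch only: decide whether a feature's build step still needs to run, based on the
# proposed com.microsoft.devcontainer.features.built label.
IMAGE="my-prebuilt-image"
FEATURE="docker-in-docker"

BUILT=$(docker inspect --format '{{ index .Config.Labels "com.microsoft.devcontainer.features.built" }}' "$IMAGE")

if [[ "$BUILT" == *"$FEATURE"* ]]; then
  echo "Feature '$FEATURE' already built into $IMAGE; skipping its build step."
else
  echo "Feature '$FEATURE' not built yet; it would be applied during the container build."
fi
```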
Furthermore, I think we should render out the devcontainer.json properties that tie to the feature in the resulting image. For example, `extensions` would include those that were indicated by the feature. Properties like `capAdd` we've talked about as higher-level properties, so these could be handled the same way.
Processing then is always: build (optional), read labels with metadata, run. You can use devcontainer.json as you use it today, but the labels could also come from somewhere else.
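A hedged sketch of that sequence with the dev container CLI (the image name is a placeholder, and the exact flags may differ from the shipping CLI):

```bash
# 1. (Optional) Pre-build the image; the dev container CLI would add the labels described above.
devcontainer build --workspace-folder . --image-name registry.example.com/team/dev:latest

# 2. Any spec-supporting tool then reads the label metadata back off the image
#    (e.g. with docker inspect, as in the earlier example).

# 3. Create and connect to the container; label metadata is merged with whatever local config exists.
devcontainer up --workspace-folder .
```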
Net-net, the resulting image should have labels on it that explain how it should be set up. Orchestrator formats can influence processing where they support adding properties. In all cases, the dev container CLI will inspect the image and make the needed adjustments.
Finally - this should help with the one-to-many problem. For pre-building, each image can have a separate devcontainer.json. When you're not pre-building an image, you can consolidate in an orchestrator format instead.
Thoughts @chrmarti @jkeech @joshspicer @edgonmsft @bamurtaugh ?
> This would also allow encoding of this information in images to improve distribution of pre-built images. When the dev container CLI is used to build the image, these labels would be added automatically, but we can support them as straight labels as well (whether in a Dockerfile or an orchestrator format).
This makes sense to me. I was actually thinking through this same scenario a bit yesterday and was leaning towards the same solution. We will need some mechanism to embed feature devcontainer.json contributions in container images (ideally through metadata such as labels) so that you can prebuild an image that was using features during its construction and have the rest of the feature contributions kick in at runtime without the end user having to specify those transitive features in their repo's devcontainer.json.
As an example, suppose we replace the `codespaces-linux` "kitchen sink" image definition with the Ubuntu base plus a collection of many features (different versions of Python, Node, Go, etc.). When we prebuild the kitchen sink image, we want users to be able to directly reference that image tag in their devcontainer and get everything included which those built-in features provide. The Dockerfile contributions would obviously be in the image layers already, but those features might provide non-Dockerfile contributions, such as VS Code extensions/settings, lifecycle script hooks, `runArgs`, etc. The devcontainer CLI will need to discover the feature metadata from the prebuilt image and apply everything at runtime. The end user does not need to be aware of which features were used in the construction of the image they are referencing.
@jkeech Yep exactly! I agree that there's value in doing this even for the single container case, to get to a point where you can just reference an image there.
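To make the "just reference an image" case concrete, here's a rough sketch (the image name is a placeholder): the consuming repo only needs a minimal devcontainer.json, because everything else would come from the image's labels.

```bash
# Sketch: a repo that only references a pre-built "kitchen sink" style image.
mkdir -p .devcontainer
cat > .devcontainer/devcontainer.json <<'EOF'
{
  "image": "ghcr.io/example/kitchen-sink-prebuilt:latest"
}
EOF

# Extensions, lifecycle hooks, runArgs, etc. would be discovered from the image labels at runtime.
devcontainer up --workspace-folder .
```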
Is the primary goal of the labels to allow different orchestrators to properly build and connect to containers? i.e.:

> enable development in containers regardless of how they are orchestrated
>
> a feature needs to be able to be applied to multiple orchestrated containers when their images are built and they are spun up
Based on:
> We will need some mechanism to embed feature devcontainer.json contributions in container images (ideally through metadata such as labels) so that you can prebuild an image that was using features during its construction and have the rest of the feature contributions kick in at runtime
Is a secondary goal of labels to aid in general prebuilding? Or is it more specifically to allow prebuilding across any orchestrator?
I'm trying to understand the main goal(s) of the labels proposal, and if it's essentially an "option 3" (options 1 and 2 to achieve the orchestrator interop goal are listed in original issue), or if it's a pivot/tangent/sub-component of the main orchestrator goal.
> Is the primary goal of the labels to allow different orchestrators to properly build and connect to containers? i.e. ... Is a secondary goal of labels to aid in general prebuilding? Or is it more specifically to allow prebuilding across any orchestrator?
@bamurtaugh Yeah the genesis for the proposal was thinking through how we could better:
However, in considering this, there are other problems we could solve:
To some extent, we could move this to a more general proposal given the breadth of benefits as I think about it.
That makes a lot of sense, thanks for the great detail @chuxel!
> To some extent, we could move this to a more general proposal given the breadth of benefits as I think about it.
That'd make sense to me 👍. If others also think the labels approach makes sense / is worth exploring further, it feels like it could encompass this topic + the variety of others you've mentioned, and this issue could pivot to focus on it, or we could open another one.
To make this all a bit more concrete, I took the "save June" sample from the codespaces-contrib org and created a few branches that step into this:

3. One branch sets `runServices` to be only the service referenced in devcontainer.json. This is needed so that each container is built separately with the features in devcontainer.json (see the `runServices` issue). I added a script called `fake-it.sh` that runs on macOS and Linux and mocks this up as if it already existed. You need to install the dev container CLI via the VS Code command (not the npm package) for it to work. If you then run the script, it will spin up two VS Code windows.
4. Another branch adds `x-devcontainer` properties to `docker-compose.devcontainer.yml` that mirror the devcontainer.json metadata structure, but can omit things like `service` (a rough sketch follows after this list). Here again, there's a `fake-it.sh` script that can be used to try it out if you have the dev container CLI from the VS Code extension installed. There's no devcontainer.json file, but the metadata from the spec is still available.
5. A further branch thins out the `docker-compose.devcontainer.yml` file by moving a few properties to the `Dockerfile` for each service. It illustrates how even a hybrid model that mixes embedded properties with labels can simplify things further. As described there, part of the idea is that, if you used the dev container CLI to pre-build an image with a devcontainer.json file, all of these properties would automatically be in the image, which would even further reduce what needs to be in the `docker-compose.devcontainer.yml` file. I did not create a fake-it script here yet.

When I compare 3 and 4, you can see the advantages of embedding, and then how 5 has the potential to thin out what would even need to be in the docker-compose.devcontainer.yml file to those things that are truly specific to the orchestrator scenario.
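For anyone who hasn't opened the branches, here is a rough sketch of the `x-devcontainer` idea in (4). The property names simply mirror devcontainer.json and the exact placement and shape are illustrative, not the literal contents of the branch:

```bash
# Sketch only: dev container metadata embedded in the Compose file via an x-devcontainer
# extension field. Property names mirror devcontainer.json; "service" is implied.
cat > docker-compose.devcontainer.yml <<'EOF'
services:
  app:
    build:
      context: .
      dockerfile: Dockerfile
    x-devcontainer:
      userEnvProbe: loginInteractiveShell
      extensions:
        - ms-python.python
        - ms-toolsai.jupyter
EOF
```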
Having a variation of the devcontainer.json that can configure multiple dev containers in a Docker Compose setup makes sense (3).
Moving all of the dev container configuration to a docker-compose.yml removes our "go to" file for dev containers (4). Not sure this isn't also a disadvantage.
On using image labels (5): This seems to make sense in a broader scope. Should this be in a separate issue? (One thing of note: This probably works best with configuration that is easy to merge, i.e., when the devcontainer.json also touches on the same part of the configuration.)
> Moving all of the dev container configuration to a docker-compose.yml removes our "go to" file for dev containers (4). Not sure this isn't also a disadvantage.
@chrmarti Yeah, this would be one example and not to the exclusion of devcontainer.json per se. Ideally this is part of the orchestrator integration code as we get to the point where this is a bit more abstracted. The point here being that we can converge with any format, with first-class support in places that are natural for those using said format. It wasn't a lot of effort to support (as you can see in the fake-it code -- it's a straight conversion to JSON). The devcontainer.json file still has value both as a way to pre-build images and for the single-container scenario it already handles.
I always think about this in terms of where devs would be coming from in a given scenario. If you're already using an orchestrator for multi-container setups, you'll be more inclined to add a few values to what you have rather than learn an entirely new format. If you're coming in cold, right now you need to learn two things rather than focusing on the orchestrator with a few additions.
> On using image labels (5): This seems to make sense in a broader scope. Should this be in a separate issue? (One thing of note: This probably works best with configuration that is easy to merge, i.e., when the devcontainer.json also touches on the same part of the configuration.)
Happy to fork it off. It definitely has broader use than multi-container scenarios.
> The devcontainer.json file still has value both as a way to pre-build images and for the single-container scenario it already handles.
My impression from 4) was that it had disadvantages when prebuilding:
> As before, if you pre-build the devcontainer image and store it in an image registry for performance, its devcontainer.json configuration is still completely disconnected. This makes it very easy to forget something and effectively adds a fourth thing to track in addition to the .devcontainer.json files and docker-compose.devcontainer.yml file.
Does the above mean that a user may add a devcontainer.json to this style of repo to aid in pre-building, but it'd be disconnected from the rest of the config? What are the "the .devcontainer.json files" that need to be tracked (as I only see Compose files), and how do they differ from the other devcontainer.json that'd need to be added?
> My impression from 4) was that it had disadvantages when prebuilding ... Does the above mean that a user may add a devcontainer.json to this style of repo to aid in pre-building, but it'd be disconnected from the rest of the config? What are the "the .devcontainer.json files" that need to be tracked (as I only see Compose files), and how do they differ from the other devcontainer.json that'd need to be added?
Since you can pre-build using the dev container CLI, and you can pre-build using docker compose already, there's not really a disadvantage for pre-building per se. The same problems that exist here exist when using devcontainer.json.
We can, however, make things better with what is described in 5) since you can pre-build the image separately, and just reference it either in devcontainer.json or the docker compose (or other orchestrator file). Put another way, you can better decouple pre-building images from using them. Pre-building the image can be in a completely separate repository - even a common one maintained by an ops team. People using the images do not need to be aware of the devcontainer.json content used to create them.
At that point, fewer unique properties need to be added to devcontainer.json / docker-compose / another orchestrator format when you are just referencing the image directly. This also helps with sharing config since it's all in the image. Multiple repositories can reference the same image directly with little to no config being present.
So, in summary, currently you have to have a pre-built image and devcontainer.json file go together - and they can version completely independently of one another. If instead these properties are part of the image's label metadata, we can determine what to do purely using the image.
I broke the label part of this proposal out into #18.
Just to update this issue, label support (#18) is now in. Keeping this particular proposal open to cover broader integrations and overall improved support for multi-container scenarios. https://github.com/devcontainers/spec/issues/10#issuecomment-1067002392 includes some example options.
> Native Kubernetes integration is one long-requested example of another format (https://github.com/microsoft/vscode-remote-release/issues/12), but clouds have their own formats
I don't know what the state of the art is now, but I have not found a clear paradigm for code versioning and the developer environment in a remote setup.
The VS Code Kubernetes extensions let you use VS Code Remote to attach to an existing container/pod, but then you need to have the git repository in the container (built into the image? mounted as a volume? a git-sync init container with a deploy key?) if you want to edit and commit code from the remote setup (is that a plausible, secure paradigm with multiple developers on the same cluster?).
The Google Cloud Code VS Code extension relies on Skaffold to build, deploy, and sync files (but still only one way: https://github.com/GoogleContainerTools/skaffold/issues/2492). It is hard to use inside a dev container because, to build the image locally, you need a docker-in-docker setup. Also, since the sync is one way, you need to edit locally and use a remote terminal to run things and debug on the pod after the file-watch sync. You can start Skaffold without being in a dev container, but then, as you edit files locally, you are not in the same (mirrored) dev environment as the pod.
So today I still see many problems around how and where to handle versioning, and around eventually using a dev container with VS Code Remote, especially for Kubernetes orchestration of the container.
What is your point of view?
Unlike some developer-centric formats, a non-goal for devcontainer.json is to become yet another multi-container orchestrator format. Instead, its goal is to enable development in containers regardless of how they are orchestrated. There's a single-container / non-orchestrated `build` property, but the `dockerComposeFile` property is representative of this desire today (along with support for the "attach" scenario in Remote - Containers, though that's not directly related to the dev container spec). Native Kubernetes integration is one long-requested example of another format (https://github.com/microsoft/vscode-remote-release/issues/12), but clouds have their own formats, we can expect more to evolve over time, and there's no doubt we could improve interop with Docker Compose.

With that in mind, there are two related proposals to consider:

1. Introducing an `orchestrator` property much like `customizations` (for #1) that we keep in mind for the reference implementation (#9) where orchestrator-specific properties can live.
2. Introduce an extension to the spec that would describe how to "embed" devcontainer.json in an orchestrator format. For example, an automated JSON <=> YAML conversion that enables it to exist in Docker Compose `x-*` extension attributes (https://github.com/docker/compose/issues/7200).

While each new orchestrator would still necessitate updates to the reference implementation (#9) and/or the orchestrator's code, documenting how this could be achieved would help guide the implementation, and keep everyone in a place that will avoid unexpected issues with changes down the road. For example, if the Docker Compose spec ended up having first-class support for certain existing devcontainer.json properties, there would still be a known path for those that were not.