buildpacks / lifecycle

Reference implementation of the Cloud Native Buildpacks lifecycle
https://buildpacks.io
Apache License 2.0
186 stars 105 forks source link

Exporter: fix handling for run images with null Config.Env.PATH #384

Closed micahyoung closed 4 years ago

micahyoung commented 4 years ago

Summary

Platform 0.4 exposed a bug with Windows images where an exported image with ENTRYPOINT set to c:\cnb\lifecycle\launcher.exe (pack's default) are unrunnable because launcher tries to run subcommands but fails to find cmd in the PATH at c:\Windows\system32\cmd.exe.

docker run --rm sample-hello-world-windows-app:nanoserver-1809
ERROR: failed to launch: cmd execute: exec: "cmd": executable file not found in %PATH%

Example app image: micahyoung/sample-hello-world-windows-app:nanoserver-1809


Reproduction

Steps
  1. run make build-windows build-windows-app from this samples PR commit
Current behavior
  1. attempt to run app

    docker run --rm sample-hello-world-windows-app:nanoserver-1809
    ERROR: failed to launch: cmd execute: exec: "cmd": executable file not found in %PATH%
  2. inspect

    docker inspect sample-hello-world-windows-app:nanoserver-1809
    ...
    [
    {
        ...
        "Config": {
            "Env": [
                "PATH=c:\\cnb\\process;c:\\cnb\\lifecycle;",
                 ...
            ],
            ...
    }
    ]
    ...
Expected
  1. attempt to run app
    docker run --rm sample-hello-world-windows-app:nanoserver-1809
    # successful run message
  2. Unsure what the inspected image config should look like (see potential solutions at bottom)

Context

Pack issue: https://github.com/buildpacks/pack/issues/800 Related moby issues/unimplemented PRs:

This appears to happen because the exported image config's Config.Env.PATH only has CNB paths (c:\cnb\process;c:\cnb\lifecycle) and no default Windows paths (c:\Windows\system32;C:\Windows), making it impossible for lifecycle.exe to find cmd.exe.

When Docker Windows encounters and an empty Config.Env.PATH it loads the path from the image's Windows reg Hives at HKEY_LOCAL_MACHINE\System\CurrentControlSet\Control\Session Manager\Environment\PATH docs) resulting in c:\Windows\system32;C:\Windows + any other hives entries added in subsequent layers.

Whenever Config.Env.PATH is set, Docker Windows doesn't load the registry and only uses the value from the config, which is what causes this issue.

Example showing typically empty Config.Env (including Config.Env.PATH) on all windows base images.

docker inspect mcr.microsoft.com/windows/nanoserver:1809-amd64 | jq .[].Config
{
  # note: no "env"
  "User": "ContainerUser",
  "Cmd": [
    "c:\\windows\\system32\\cmd.exe"
  ]
} 

docker inspect mcr.microsoft.com/windows/servercore:1809-amd64 | jq .[].Config
{
  # note: no "Env"
  "User": null,
  "Cmd": [
    "c:\\windows\\system32\\cmd.exe"
  ]
}

docker inspect mcr.microsoft.com/windows:1809-amd64 | jq .[].Config
{
  # note: no "Env"
  "User": null,
  "Cmd": [
    "c:\\windows\\system32\\cmd.exe"
  ]
}

... whereas Linux images always set the full PATH

docker inspect alpine | jq .[].Config
{
...
  "Env": [
    "PATH=/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin"
  ],
...
}
lifecycle version

pack inspect-builder cnbs/sample-builder:nanoserver-1809

Inspecting builder: cnbs/sample-builder:nanoserver-1809

REMOTE:

Created By:
  Name: Pack CLI
  Version: 0.0.0+git-5624911

Trusted: No

Stack:
  ID: io.buildpacks.samples.stacks.nanoserver-1809

Lifecycle:
  Version: 0.7.5
  Buildpack APIs:
    Deprecated: (none)
    Supported: 0.2
  Platform APIs:
    Deprecated: (none)
    Supported: 0.3

Run Images:
  cnbs/sample-stack-run:nanoserver-1809

Buildpacks:
  (none)

Detection Order:
  (none)

Warning: cnbs/sample-builder:nanoserver-1809 has no buildpacks
Warning: Users must supply buildpacks from the host machine
Warning: cnbs/sample-builder:nanoserver-1809 does not specify detection order
Warning: Users must build with explicitly specified buildpacks

LOCAL:

Created By:
  Name: Pack CLI
  Version: 0.0.0+git-5e9a21f

Trusted: No

Stack:
  ID: io.buildpacks.samples.stacks.nanoserver-1809

Lifecycle:
  Version: 0.9.0
  Buildpack APIs:
    Deprecated: (none)
    Supported: 0.2, 0.3, 0.4
  Platform APIs:
    Deprecated: (none)
    Supported: 0.3, 0.4

Run Images:
  cnbs/sample-stack-run:nanoserver-1809

Buildpacks:
  ID                                 VERSION        HOMEPAGE
  samples/hello-world-windows        0.0.1          https://github.com/buildpacks/samples/tree/main/buildpacks/hello-world-windows

Detection Order:
 └ Group #1:
    └ samples/hello-world-windows@0.0.1  
platform version(s)

pack report

Pack:
  Version:  0.0.0+git-5e9a21f
  OS/Arch:  darwin/amd64

Default Lifecycle Version:  0.9.0

Config:
  experimental = true

docker info

Client:
 Debug Mode: false
 Plugins:
  mutagen: Synchronize files with Docker Desktop (Docker Inc., testing)
  app: Docker Application (Docker Inc., v0.8.0)
  buildx: Build with BuildKit (Docker Inc., v0.3.1-tp-docker)

Server:
 Containers: 49
  Running: 0
  Paused: 0
  Stopped: 49
 Images: 199
 Server Version: 19.03.12
 Storage Driver: windowsfilter
  Windows: 
 Logging Driver: json-file
 Plugins:
  Volume: local
  Network: ics internal l2bridge l2tunnel nat null overlay private transparent
  Log: awslogs etwlogs fluentd gcplogs gelf json-file local logentries splunk syslog
 Swarm: inactive
 Default Isolation: process
 Kernel Version: 10.0 17763 (17763.1.amd64fre.rs5_release.180914-1434)
 Operating System: Windows 10 Enterprise LTSC 2019 Version 1809 (OS Build 17763.1397)
 OSType: windows
 Architecture: x86_64
 CPUs: 2
 Total Memory: 7.999GiB
 Name: docker-windows
 ID: 3TBF:MDBO:EM3I:4WVQ:E5ZZ:X7HR:FOHL:KII3:XGGC:4QQH:3E72:ZB5O
 Docker Root Dir: C:\ProgramData\Docker
 Debug Mode: false
 Registry: https://index.docker.io/v1/
 Labels:
 Experimental: false
 Insecure Registries:
  0.0.0.0/0
  127.0.0.0/8
 Live Restore Enabled: false
 Product License: Community Engine
anything else?

One temporary workaround workaround we found is to set ENV PATH c:\\Windows\\system32;C:\\Windows in the run image Dockerfile (or otherwise modify the config on the run image), and this makes launcher able to run:

docker inspect sample-hello-world-windows-app:nanoserver-1809 | jq .[].Config.Env
[
  "CNB_USER_ID=1",
  "CNB_GROUP_ID=1",
  "PATH=c:\\Windows\\system32;C:\\Windows",
  "CNB_LAYERS_DIR=c:\\layers",
  "CNB_APP_DIR=c:\\workspace",
  "CNB_PLATFORM_API=0.4",
  "CNB_DEPRECATION_MODE=quiet",
  "PATH=c:\\cnb\\process;c:\\cnb\\lifecycle;c:\\Windows\\system32;C:\\Windows"
]
micahyoung commented 4 years ago
Potential solutions

I feel like the best thing to do for now is keep using Config.Env.PATH and existing CNB conventions for handling PATH environment variable modification, and continue ignoring any Windows reg PATH environment variable that might be present. Stack and buildpack authors already have plenty of tools from pack and lifecycle to modify PATH if they need to.

Additionally, I'd still like to add more formal Window reg handling and I feel like it's premature to start relying on the existing Windows reg behavior too much until it's spec'd out.

So if we follow that opinion, there's these potential approaches:

  1. Require all stacks to define their full, explicit Config.Env.PATH (Dockerfile: ENV PATH c:\\Windows\\system32;C:\\Windows just like Linux
    • no lifecycle change, just documentation (maybe light validation in exporter)
  2. Or, Modify exporter to always append ;c:\Windows\system32;C:\Windows to the exported images Config.Env.PATH
    • minimal change but may need to be spec'd
  3. Or, Modify launcher to always append ;c:\Windows\system32;C:\Windows to its own PATH environment variable (when not already present) before executing any sub commands.
    • minimal change but may need to be spec'd
jromero commented 4 years ago

I wonder if this issue is related to another issue reported from @nebhale on slack.

docker run -it -p 8080:8080 --entrypoint web --env ALPHA="bravo" applications/jar        
docker: Error response from daemon: OCI runtime create failed: container_linux.go:349: starting container process caused "exec: \"web\": executable file not found in $PATH": unknown.

Through the thread it was discovered that env vars are being duplicated:

           "Env": [
                "PATH=/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin",
                "CNB_LAYERS_DIR=/layers",
                "CNB_APP_DIR=/workspace",
                "CNB_PLATFORM_API=0.4",
                "CNB_DEPRECATION_MODE=quiet",
                "PATH=/cnb/process:/cnb/lifecycle:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin"
            ],
jromero commented 4 years ago

^ potentially related to #385???

nebhale commented 4 years ago

It's exactly #385. This will be solved by the imgutil changes.

yaelharel commented 4 years ago

I think this issue can be closed now since https://github.com/buildpacks/lifecycle/issues/385 was closed (https://github.com/buildpacks/imgutil/pull/56 and https://github.com/buildpacks/lifecycle/pull/387 were merged).

micahyoung commented 4 years ago

Hey @yaelharel. I'll check this morning that we can use stacks Windows stacks with a null Config.Env.PATH. If so, we can close

ekcasey commented 4 years ago

I suspect that this is a slightly different issue. When PATH is unset in a windows image config file I suspect it is either being set in /Hives or there is some sort of OS-level defaulting behavior. We want to prepend /cnb/process and /cnb/lifecycle to the path without blowing away the original value. So we either need folks to set the original value in the place we exoect (Config.Env.PATH) or find a way to grab the original value so that we can prepend to it.

micahyoung commented 4 years ago

Yeah, I just confirmed using https://github.com/buildpacks/samples/pull/85 but with the newest lifecycle appended to builders/nanoserver-1809/builder.toml and with the explicit PATH in the stack commented out, it fails with:

make build-windows build-windows-apps
...
docker run --rm sample-batch-script-app:nanoserver-1809
ERROR: failed to launch: cmd execute: exec: "cmd": executable file not found in %PATH%
micahyoung commented 4 years ago

@ekcasey I feel like changing lifecycle to read the Hives-PATH from the run-image and prepending them to Config.Env.PATH is the likely the right place to end up. But to start, would be ok requiring stack authors to manually duplicate whatever is in the Hives-PATH into an explicit Config.Env.PATH (aka ENV PATH ... Dockerfile directive) on their run-image to start with?

I feel like that works already today and I worry that parsing Hives/*_Delta files is going to be tricky as there's no OSS tooling nor documentation for working with those files. They are much easier to work with in-container at runtime (using Windows Reg APIs) and thats what a lot of this wip spec is based on.

For now we could rely on the fact that, even though most stacks are currently defined in Dockerfiles today, they are much less dynamic than any-old Dockerfile in practice and if we had stack authors set an ENV PATH once, it will be unlikely to get out of sync with the a Hives-PATH , especially given the ENV, FROM <base image>, and RUN directives that would alter the Hives-PATH are in the same Dockerfile.

To start, what if we add some simple validation in creator and launcher to use the Windows reg APIs to compare the two PATH values at runtime and if out of sync, print warnings for the exact values Config.Env.PATH to use?

ekcasey commented 4 years ago

@micahyoung That all sounds reasonable to me. To resolve this issue we should read the Hives path if Config.Env.Path is unset (seems like Config.Env.Path take precedence) and prepend the lifecycle and process dirs to that path.

I think the remaining question is: if Config.Env.Path is unset on the run-image, do we want to set the PATH using Config.Env.Path or do we want to contribute a Hive delta?

micahyoung commented 4 years ago

It's feeling like this will all need to be spec'd out but here's what I'm thinking:

We do everything in-container, in-process in creator/launcher executables at runtime: before executing any shell/sub-processes, if their own current env var PATH does not contain the value of Windows reg HKEY_LOCAL_MACHINE\System\CurrentControlSet\Control\Session Manager\Environment\PATH, then they append that value to their own current, runtime env var PATH (which would already have lifecycle/process or other dirs that Docker injected from Config.Env.PATH). Then creator/launcher can find cmd on any subcommands in the same process. This means Config.Env.PATH on build/run/builder images would never contain c:\windows\system32;c:\windows - those values would always come from the Windows reg at runtime.

Export continues to set Config.Env.PATH as it does today with process/lifecycle variables, whether it's empty or not. If it's empty, Docker already automatically set's launcher's env var PATH from the registry, so launcher reads the Windows reg value, compares it, and will not change it as describe above. If Config.Env.PATH is empty, launcher's env var PATH will not yet contain the Windows reg value, so launcher reads the Windows reg value, compares it, and appends it.

The alternative, having platforms or lifecycle read values from Hives/*_Delta format from outside and appending to and image's Config.Env.PATH or via new Hives/*_Delta entries, would be tricky:

micahyoung commented 4 years ago

I added a Draft PR of my previous thinking, basically "at runtime, the lifecycle/launcher merge their own in-memory environment variables with their Windows registry env vars, ignoring duplicates"

https://github.com/buildpacks/lifecycle/pull/402

Everything seems to work - ENV PATH... is no longer need in build/run images. I'm wondering if this needs to be spec'd first though.

ekcasey commented 4 years ago

Although the solution proposed by @micahyoung would solve the PATH problem for processes started by the launcher, images still would not have a sensible PATH by default which may result in a bad experience if users want to the start a container without the launcher for any reason. Require windows stack authors (a smaller pool of more deeply engaged users) to set PATH may result in a better experience all-around.

After discussion we have decided not to make this fix for now and instead propose specifying the requirement to set PATH on windows run images.