Provide mechanism to indicate language runtime type for an artifact

briandealwis commented 5 years ago

skaffold debug currently uses a set of heuristics to guess the language runtime used in a container image:

We guess the runtime technology for an image container by examining the referenced image container configuration, specifically looking at the environment variables and command-line. It turns out that most of the language runtime base images define a XXXX_VERSION environment variable or use a well-known command-name (java or node or nodemon).

These heuristics fail for Go and C++ programs. Although Go does have some runtime environment variables, they aren't always used. And these heuristics may result in false-positives for mixed-runtime images. We need a way to define the language runtime for a particular artifact.

Proposal

Allow setting hints via an environment variable: SKAFFOLD_RUNTIME?
Allow providing hints via a container image label/annotation: skaffold.dev/debug/runtime
Allow providing hints via a k8s manifest label/annotation: skaffold.dev/debug/runtime
Allow setting the language runtime type in skaffold.yaml:
```
artifacts:
- image: gcr.io/foo/bar
runtimeType: go
```

corneliusweig commented 5 years ago

Sounds like a good idea. Some thoughts:

An environment variable can only be set once, so for multi-artifact projects with different technologies per artifact, this would fail miserably. If the env variable would be extended to SKAFFOLD_RUNTIME_<IMAGENAME> this could work, but I think this is too complicated. In essence, I would not support that setting via env variable (and also via command line for the same reasons)
The annotation is a good idea and I think Skaffold should add these labels. However, it goes against the usual Skaffold way of expecting those labels to be added in k8s manifests. Usually configuration either comes from skaffold.yaml or the command line.
The last suggestion to put this in skaffold.yaml seems the cleanest approach to me. I'm not entirely sure about the config name though. As is, it does not make explicit that this only relevant for debugging and users may get confused. What about debugRuntime or debugType?

Maybe you should also involve the IDE people here. They may have preferences and ideas how this can best work together with CloudCode.

briandealwis commented 5 years ago

Environment variables are on a per-artifact basis (well, per-container really), so that's not really a concern. skaffold debug should allow any of these approaches so that the developer can do whatever is most convenient for them; the recommended practice though is the skaffold.yaml.

etanshaul commented 5 years ago

Environment variables are on a per-artifact basis (well, per-container really), so that's not really a concern

I think I also interpreted it like @corneliusweig, where the env var would be set on the Skaffold process. Whereas I think here you mean to set it in the k8s config.

etanshaul commented 5 years ago

My intuition is that adding this to skaffold.yaml makes most sense as well. This is a configuration option specifically intended for Skaffold, and as such, seems like it belongs in the skaffold configuration.

The easiest solution from an IDE perspective would be to have this as a command flag - but given that it needs to be on a per artifact basis, it's probably infeasible?

@iantalarico for comment as well

briandealwis commented 5 years ago

Ah sorry. I meant either Dockerfile ENV or k8s container env flags.

I think a command-line flag can be done, and debug needs to support something like that for #2184. I think specifying a path to the artifact should be sufficient:

skaffold debug --type /path/to/artifact=go --type /path/to/other/artifact=c++

etanshaul commented 5 years ago

ah right. that would work. Just to be explicit, even though it was probably clear - this would be the easiest option from the IDE perspective because it wouldn't involve modifying any configuration. Instead we could just pass this "runtime type" at command execution time.

Side question: if a runtime type isn't supplied, and skaffold debug fails to do its inference, presumably entire skaffold session won't fail, right? Instead it will just not apply any debug transforms. Similarly it would be nice if this failure to debug was output as part of the enhanced debug event api.

briandealwis commented 5 years ago

I haven't looked into how to hook into the events yet, but I figured I'd just return a map of artifact → debug configuration (runtime-type + debug ports). And indeterminate artifacts would be a null mapping.

corneliusweig commented 5 years ago

Ah sorry. I meant either Dockerfile ENV or k8s container env flags.

That makes perfectly sense. Do you think it's worth adding a note to your original issue post?

I think the standard way to identify an artifact in Skaffold is to use the image name and not the path. That has the advantage of being stable wrt. refactorings (unless the image name is changed :/), so what do you think about:

skaffold debug --type gcr.io/skaffold/getting-started=go

corneliusweig commented 5 years ago

While looking at #2184 I just had another idea how the runtime type could be specified elegantly on the cmdline: the switch to enable debugging could also be used to specify the runtime

skaffold debug --debug-artifact gcr.io/skaffold/getting-started=go

This would save typing/pasting of the image name and is equally expressive (the separator may be up to debate).

One drawback: it will no longer be possible to specify the runtime without restricting the debugged artifacts.

iantalarico commented 5 years ago

Jumping in a little late but I agree that passing the information to skaffold would be the easiest from the IDE side.

briandealwis commented 5 years ago

You're right @corneliusweig the image tag is better since several artifacts can be mapped to the same location, as happens with Jib multi-module projects.

etanshaul commented 5 years ago

using the image tag should work fine from the IDE perspective.

etanshaul commented 5 years ago

I haven't looked into how to hook into the events yet, but I figured I'd just return a map of artifact → debug configuration (runtime-type + debug ports). And indeterminate artifacts would be a null mapping.

That works. Having it return a null mapping, but still have the artifact information, will be helpful for the client.

balopat commented 5 years ago

Just rephrasing so that I understand: We need to enrich the heuristics in runtimes/artifacts so that skaffold can detect what language runtime an artifact is using, so that skaffold debug can do the right "magic" of rewriting the manifests.

Another way I would phrase this: With this direction we prescribe a set of standard mechanisms to make a runtime image Skaffold Debuggable

Allow setting hints via an environment variable: SKAFFOLD_RUNTIME?

+1 on env vars. I think this is a natural fit with the rest of the heuristics and eventually can become a standard thing across Dockerhub images as well, if we wanted to make an effort - we could raise a PR for all of them to include this ;)

Allow providing hints via a container image label/annotation: skaffold.dev/debug/runtime

I'm okay with this, but not convinced. One benefit I see, is that the Dockerfile doesn't have to change for the image. But that also means that if we'd ever inject a skaffold utility inside the container, it wouldn't know about the runtime. What is your thinking around the benefits for this mechanism?

Allow providing hints via a k8s manifest label/annotation: skaffold.dev/debug/runtime

I wouldn't expose this as part of the manifests. It is redundant, and also kind-of-in-the-middle: We are talking about images to be connected to skaffold. It is redundant because with env vars you can simulate this even in a pod spec. It is in the middle: on the image end we can use env vars to define runtimes, on the skaffold end in the artifact config directly.

Allow setting the language runtime type in skaffold.yaml:

Big plus on this one. As I mentioned, this feature connects images to skaffold. Having the ability to define it/override it in skaffold.yaml makes sense to me.

I'm luke warm on the ability to pass it in as a flag - it would be useful for IDEs as then you could override things without touching the yaml - however the IDE would need to store it somewhere...so why not just create a debug profile that has the override?

balopat commented 5 years ago

A crazy idea: we could have a set of known image names -> runtimes mapping themselves listed explicitly somewhere as well as another source of heuristics.

briandealwis commented 4 years ago

This could be a good way to allow specifying an Alpine-based Delve image (GoogleContainerTools/container-debug-support#30).

jkleckner commented 4 years ago

It also seems to me that skaffold.yaml would be the right place to annotate the runtime explicitly.

briandealwis commented 4 years ago

I'm going to add an runtimes attribute to artifacts. Multiple runtimes allows expressing needs like Go-on-alpine (runtimes: ["go","alpine"]). It turns out I need this for Buildpacks support.

eikemeier commented 4 years ago

I'm going to add an runtimes attribute to artifacts. Multiple runtimes allows expressing needs like Go-on-alpine (runtimes: ["go","alpine"]). It turns out I need this for Buildpacks support.

v2beta2 is published, but is identical to v2beta1 :grin:. Will this be in v2beta3?

nkubala commented 4 years ago

@eikemeier 🤦

yes hopefully so. the PR is still in draft, but we don't have any other config changes in the works to my knowledge, so it should go out with the next config bump. however, that isn't tied to our next release necessarily. in any case, stay tuned and we'll get this shipped ASAP.

tejal29 commented 4 years ago

@briandealwis going through quarterly scrub and just wanted to make sure if this is still a priority.

akostic-kostile commented 3 years ago

This would be a very nice addition to debug mode. Currently debug is very inflexible, I'd even say borderline unusable. Also I wouldn't stop at just specifying the runtime, it would be very useful to disable debug for certain artifacts as well. I like this solution from a different issue: https://github.com/GoogleContainerTools/skaffold/issues/2184#issuecomment-498429195

briandealwis commented 3 years ago

@akostic-kostile we'd appreciate some examples of where you're being blocked!

akostic-kostile commented 3 years ago

@briandealwis Sure thing, I have a perfect example for you. One of the modules I'm working on is really a pod that consists of 2 containers. I'll post Dockerfiles just so you get the idea what we're working with here:

reporting-app (file cut for brevity, I just pasted parts that are important for local dev):

FROM mcr.microsoft.com/dotnet/sdk:5.0.401-buster-slim as base

WORKDIR /src/

## Install Sonar Scanner and Java (required for Sonar Scanner), libgdiplus (required for Simulsoft to work in local dev)
ENV JAVA_VERSION=11.0.12+7-2~deb10u1 \
        LIBGDIPLUS_VERSION=4.2-2 \
        SONAR_SCANNER_VERSION=5.3.1
RUN apt-get update \
        && apt-get install -y --no-install-recommends openjdk-11-jre="$JAVA_VERSION" libgdiplus="$LIBGDIPLUS_VERSION" \
        && dotnet tool install --global dotnet-sonarscanner --version "$SONAR_SCANNER_VERSION" \
        && rm -rf /var/lib/apt/lists/*
ENV PATH="$PATH:/root/.dotnet/tools"

ENV DOTNET_CONFIGURATION=Release \
    DOTNET_FRAMEWORK=net5.0 \
    DOTNET_RUNTIME=linux-x64
# cache dotnet restore in a separate layer to speed up builds
COPY *.sln */*.csproj nuget.config ./
RUN for file in $(ls *.csproj); do mkdir -p ${file%.*} && mv $file ${file%.*}/; done \
        && dotnet restore --runtime "$DOTNET_RUNTIME" ThirdEyes.sln

COPY . .

FROM base as local

ARG DOTNET_WATCH_ARGS
ENV DOTNET_WATCH_ARGS=$DOTNET_WATCH_ARGS

EXPOSE 5002
CMD [ "/bin/sh", "-exc", "dotnet watch --project Reporting.App -- $DOTNET_WATCH_ARGS " ]

Highcharts-export-server (only thing that gets installed with npm install is "highcharts-export-server": "^2.1.0"):

FROM node:12-slim

RUN apt-get update && apt-get install -y --no-install-recommends bzip2 libfontconfig1 && rm -rf /var/lib/apt/lists/*

WORKDIR /home/node
COPY package.json package-lock.json ./

ENV ACCEPT_HIGHCHARTS_LICENSE=1
RUN npm install

USER 1000:1000
EXPOSE 7801
CMD ["/home/node/node_modules/.bin/highcharts-export-server", "--enableServer", "1"]

And finally skaffold.yml:

apiVersion: skaffold/v2beta21
kind: Config
metadata:
  name: _reporting
profiles:
  - name: local
    build:
      artifacts:
        - image: registry.localhost:8082/reporting-app
          context: "."
          docker:
            dockerfile: Reporting.App/Dockerfile
            buildArgs:
              DOTNET_WATCH_ARGS: run
            # noCache: true
            target: local
          sync:
            manual:
              - dest: "."
                src: "**/*.cs"
        - image: registry.localhost:8082/highcharts-export-server
          context: Reporting.App/HighchartsExportServer
      insecureRegistries:
        - registry.localhost:8082
      local:
        concurrency: 0
        push: true
        tryImportMissing: true
        useBuildkit: true
      tagPolicy:
        ## for local dev only inputDigest policy makes sense as it will rebuild image if there are changes in workdir
        ## gitCommit policy only takes into consideration committed changes.
        inputDigest: {}
    deploy:
      kustomize:
        defaultNamespace: default
        paths:
          - ../kubernetes-manifests/local-cluster/default/reporting/
      statusCheck: true
      statusCheckDeadlineSeconds: 60
    portForward:
      - resourceType: Service
        resourceName: reporting-svc-local
        namespace: default
        port: 5002
        localPort: 5002
      - resourceType: Service
        resourceName: reporting-svc-local
        namespace: default
        port: 7801
        localPort: 7801

Now, what happens when I do skaffold debug --profile local --module _reporting ? Couple of things, almost all of them wrong. :)

Lets take a look at pod annotations:

metadata:
  annotations:
    debug.cloud.google.com/config: '{"highcharts":{"artifact":"registry.localhost:8082/highcharts-export-server","runtime":"nodejs","workingD
ir":"/home/node","ports":{"devtools":9229}},"reporting":{"artifact":"registry.localhost:8082/reporting-app","runtime":"jvm","workingDir":"/sr
c/","ports":{"jdwp":5005}}}'

For highcharts container runtime was correctly identified, but the problem is I don't care about debugging Highcharts export server. Is there a way to tell Skaffold not to debug that container? The only way I know of is to manually set pod level annotation to debug.cloud.google.com/config: {}, however this is far from ideal. It feels very awkward to pass configuration options to Skaffold through pod annotations, second problem is this will also stop me from debugging reporting container.

Leaving that aside for now, lets take a look at what Skaffold detected for a second container. Hm, runtime jvm?! Must be because I have a variable inside Dockerfile called JAVA_VERSION. I'm installing Java so SonarQube scanner would work correctly, and JAVA_VERSION is one of the special variables Skaffold uses to detect runtime. However in this case it completely misses the mark. Can I change the variable name to something else? Sure I can, but I created this Dockerfile before I started working with skaffold debug, I'm just trying to make a point here that current methods of detecting runtime are far from ideal

To conclude this long post there should definitely be some way to set these things from skaffold.yml, at the very least to enable/disable debugging on container level and a way to specify runtime so correct debug tools would be installed. At version 1.32 debug is very rough around the edges.

gsquared94 commented 3 years ago

Created issue #6713 from above comment

anthonyalayo commented 2 years ago

I'm using Amazon Corretto along with a maven wrapper command, and unfortunately this also fails the heuristics for determining a java runtime. Being able to explicitly tell skaffold what we want would be fantastic.

As a workaround, would it be possible for skaffold to key off of JAVA_HOME instead of JAVA_VERSION? Original PR here

Since openjdk images are officially deprecated, it would be nice for this to work with others.

anthonyalayo commented 2 years ago

Another issue I hit that would be solved with the support here would be this: https://stackoverflow.com/questions/72348615/how-to-debug-a-jdk-docker-container-in-intellij-idea

Running with the java spring boot plugin is quite common in order to get hot class reloading. The current process for skaffold to automatically add debug port forwarding conflicts with this setup.

I'm not sure how much this crosses the boundary between skaffold and cloud code, but in order to use the extension, I had to do the following workaround:

ENV JAVA_VERSION=11.0.16
CMD JAVA_TOOL_OPTIONS="" ./mvnw spring-boot:run -Dspring-boot.run.jvmArguments="-Xdebug $JAVA_TOOL_OPTIONS"

Where I had to

Add the JAVA_VERSION environment variable to get ports automatically added to manifests
Blank out the automatically injected JAVA_TOOL_OPTIONS for the run command
Add JAVA_TOOL_OPTIONS into spring boot's plugin arguments

If there was a flag I could add to the skaffold.yaml file that says "add java debug ports", I could replace these three steps with just one.

GoogleContainerTools / skaffold

Provide mechanism to indicate language runtime type for an artifact #2203

Proposal