buildkite / agent-stack-k8s

Spin up an autoscaling stack of Buildkite Agents on Kubernetes
MIT License
77 stars 30 forks source link

Allow running multiple commands #259

Open bpoland opened 6 months ago

bpoland commented 6 months ago

Hi, I'm just getting started with agent-stack-k8s and wondering if it would be possible to allow running multiple commands instead of just one on the containers? I realize the existing setup is modeled after kubernetes pod spec which isn't designed for that but it appears that under the hood, the command we specify is actually getting executed after the agent starts.

I've tried playing around with yaml multiline strings and here-docs but that seems to cause some weirdness with tokenization and seems like every word gets run as its own command. Another option I thought of was to copy in a script file and just execute that, but it's a little convoulted.

It would be awesome if we could just replace command with commands similar to the native step (https://buildkite.com/docs/pipelines/command-step#command-step-attributes) and provide a list of commands to run. Would that make sense or am I missing some obvious way to do this?

triarius commented 6 months ago

@bpoland thanks for raising this, we think it is a gap between how agent-stack-k8s parses pipelines yaml compared with the builkite backend. We can take a look at handling multiple commands in a command step too.

bpoland commented 6 months ago

OK thanks! I have been able to get it working like this in the mean time. It's a little weird but it seems to work 🤷

                command: ["/bin/bash", "<<EOF", "\n"]
                args:
                  - |
                    echo "one command"
                    echo "another command"
                    echo "and a third"
                    EOF

I definitely think supporting commands: would be nicer though!

bpoland commented 6 months ago

I take it back, have been seeing some issues with this approach where some of the later commands get dropped :(

Not quite sure what's happening, still playing around with it.

triarius commented 6 months ago

@bpoland have you tried using a format similar to what's in one of the sample pipelines: https://github.com/buildkite/agent-stack-k8s?tab=readme-ov-file#sample-buildkite-pipelines

There are also a few samples in the tests: https://github.com/buildkite/agent-stack-k8s/blob/main/internal/integration/fixtures/artifact-upload-failed-job.yaml

In particular, I would try bash -c to pass the script as an argument rather than using a heredoc.

I think the key thing to remember that shellisms like input redirection don't work in kubernetes. All you can do is run a command with some arguments and environment variables. So the <<EOF in your command is probably not doing what you expect.

bpoland commented 6 months ago

@bpoland have you tried using a format similar to what's in one of the sample pipelines: https://github.com/buildkite/agent-stack-k8s?tab=readme-ov-file#sample-buildkite-pipelines

There are also a few samples in the tests: https://github.com/buildkite/agent-stack-k8s/blob/main/internal/integration/fixtures/artifact-upload-failed-job.yaml

In particular, I would try bash -c to pass the script as an argument rather than using a heredoc.

I think the key thing to remember that shellisms like input redirection don't work in kubernetes. All you can do is run a command with some arguments and environment variables. So the <<EOF in your command is probably not doing what you expect.

Thanks, yeah I tried that initially but it seemed like bash was trying to run each word as a separate command maybe. For example when I use something like this:

command: ["/bin/bash", "-c"]
  args:
    - |-
      buildkite-agent annotate "This is a test annotation"
      echo "another thing"

Then when I run it I get a buildkite-agent usage message as though the arguments weren't passed to it properly :(

image
bpoland commented 6 months ago

OK I'm trying this and it seems to be working so far but it's a little annoying with the semicolons haha

command: [bash, -xc]
  args:
    - |-
      '
      buildkite-agent annotate "This is a test annotation";
      echo "another thing";
      '

Also just wanted to circle back to what you said shellisms like input redirection don't work in kubernetes and try to understand a little better -- from what I can tell in the output, the command/args that I specify in the yaml aren't actually being run as the command/args of the pod that gets started right? The container-0 in the pod has a kubernetes command of /workspace/buildkite-agent and then it looks like the agent is running the command/args from my yaml after the pod starts. So I guess you're saying that the way the agent executes my command/args doesn't support input redirection?

It's still confusing to me that the input redirection sorta worked, sometimes 😕

triarius commented 6 months ago

@bpoland You're right that this is confusing. Looks like I confused myself too, so apologies for that.

I was operating on the assumption that agent-stack-k8s mimics the behaviour of command and args for k8s pods, so anything you can do with a pod, you can do in a Buildkite job. AFAIK, this was a design goal, I'm now leaning towards the position that we can't reasonably implement it.

Let me clarify with an example. Without using agent-stack-k8s, if you create a YAML file like this:

# my-pod.yaml
apiVersion: v1
kind: Pod
metadata:
  name: my-pod
spec:
  containers:
  - name: my-container
    image: alpine:latest
    command: [ash, -xc]
    args:
    - |-
      echo "Hello, Kubernetes!" > test.txt
      ls -lA test.txt
      false
      cat test.txt

and run it in a k8s cluster with

kubectl apply -f my-pod.yaml

You should be able to observe that each line is run in sequence, which is what I expect if the ash command receives 2 arguments: -xc and a multi-line argument:

echo "Hello, Kubernetes!" > test.txt
ls -lA test.txt
false
cat test.txt

Indeed, when I tested this and ran

kubectl logs -f pod/my-pod

the output was:

+ echo 'Hello, Kubernetes!'
+ ls -lA test.txt
+ false
+ cat test.txt
-rw-r--r--    1 root     root            19 Mar 12 12:17 test.txt
Hello, Kubernetes!

The test fixtures I linked previously had a similar syntax and as the tests behaved as expected, I thought I had verified that agents-stack-k8s behaved in the same way. But now it seems that the tests were accidentally passing, and the behaviour is a little different,

Looking deeper into the code, I now see that we do some mangling of the command and args into a single variable, BUILDKITE_COMMAND and we override the command and args of the container to run the buildkite-agent.

The agent itself will run the contents of BUILDKITE_COMMAND as an argument to a shell command that is based on the BUILDKITE_SHELL environment variable. On agent-stack-k8s, this is set to /bin/sh -ec.

I think we can spend some more time finding a precise explanation to the usual behaviour you've observed, or we can use this knowledge to conclude that the setting the command in the agent-stack-k8s podSpec to [bash, -c] or something similar is redundant. Instead, we can just try something like this:

agents:
  queue: kubernetes
steps:
- label: Test command without args
  plugins:
  - kubernetes:
      podSpec:
        containers:
        - image: alpine:latest
          command:
          - |-
            buildkite-agent annotate "This is a test annotation"
            echo "another thing" | tee artifact.txt
            buildkite-agent artifact upload artifact.txt

This seems to work for me: Screenshot 2024-03-13 at 00-04-25 K8s Private #55 · Nepa's Test Org

triarius commented 6 months ago

Now, back to the original question of allowing the usual Buildkite Pipeline YAML syntax where a command can sometimes be a YAML list where each element is run in sequence:

step:
- command:
  - buildkite-agent annotate "This is a test annotation"
  - echo "another thing" | tee artifact.txt
  - buildkite-agent artifact upload artifact.txt

I think there is going to be a some confusion between this and the expectation in Kubernetes that command is always a YAML list where the first element is an executable and the next are arguments to that executable.

I think, in agent-stack-k8s, you can achieve running multiple commands chained together, as demonstrated at the end of my previous comment. I'm hesitant to introduce two different ways of doing the same thing.

Let me know if your ask is to be consistent with the non-k8s behaviour, or to just get multiple commands working in a single step. If it's just the latter, I should just fix up the documentation.

bpoland commented 6 months ago

Oh wow, I didn't think to try just setting command haha. Yes my goal is just to be able to run multiple commands without too much boilerplate and this does that. Yes I think making it more clear in the documentation that this is an option (and likely calling out the difference from the "standard" Buildkite yaml) would be good.

I guess the only downside to this approach is that each command is not grouped separately in the output, making it a little more difficult to tell where the output from one ends and the next one starts (depending on the commands being run obviously). I tried using the bash -x like you did but I wonder if this is possible with the command way? Also it seems like combining that with the +++ or --- echos for command grouping behaves a little weird (where the + echo to create the group gets printed in the group right before). Do you think it would be possible for the agent to detect newlines and automatically group each one in the Buildkite UI?

DrJosh9000 commented 1 week ago

Coming back to the original request (multiple commands), #371 will change podSpec/command interpretation to make multiple commands the default in the next release.

bpoland commented 1 week ago

Coming back to the original request (multiple commands), #371 will change podSpec/command interpretation to make multiple commands the default in the next release.

Thanks, yes that does look like it will solve this problem -- excited to try it out!