non-container runner - Githubissues

lispyclouds commented 1 year ago

Now that we can have different types of runners, it would be nice to have a runner which implements the same pipeline spec without containers. It should be useful for non-container OSes like macOS and other special needs. It's probably better off not bundled, but offered as an opt-in thing maintained by the bob-cd org?

lispyclouds commented 1 year ago

Implements the step as a raw fork exec.
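
A minimal sketch of what "a raw fork exec" of a step could look like, assuming a step boils down to a command string. The names `run_step` and `StepResult` are hypothetical and not part of Bob; the real runner is free to do this differently.

```python
import shlex
import subprocess
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass
class StepResult:
    exit_code: int
    output: str


def run_step(cmd: str, workdir: Optional[str] = None,
             env: Optional[Dict[str, str]] = None) -> StepResult:
    """Run one pipeline step as a plain child process, no container involved."""
    proc = subprocess.run(
        shlex.split(cmd),          # split into argv; no shell is invoked
        cwd=workdir,               # None means the current directory
        env=env,                   # None means inherit the parent's environment
        stdout=subprocess.PIPE,
        stderr=subprocess.STDOUT,  # merge stderr into a single log stream
        text=True,
    )
    return StepResult(proc.returncode, proc.stdout)
```

A non-zero `exit_code` would be the signal to stop the pipeline, and `output` is what would get logged.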

lispyclouds commented 1 year ago

This would enable generalised executions and, especially, enable #115.

Vaelatern commented 1 year ago

It would also allow scheduling jobs on top of HashiCorp Nomad for scaling your build cluster easily.

Vaelatern commented 1 year ago

I've read the runner/README.md and don't see what API I should conform to as a runner.

Ideally I'd write a runner that sits on top of the hashistack for easy integration there.

lispyclouds commented 1 year ago

Hey @Vaelatern thanks a lot for the interest! 😄

I must admit that I pretty much work on Bob in my free time and haven't really done much marketing on it! Hence the lack of serious usage and better docs. Will take this as a chance to improve it!

Having said that, I'm really glad you're interested in getting the hashistack/nomad runner wired up and I think it would be an awesome addition!

Here are some links that could take you in the right direction:

Bob's architecture; see the section The Execution Model for an idea of the steps it takes.
The Pipeline

Essentially, for a runner the following needs to be fulfilled:

Needs to be able to work with RabbitMQ.
Should declare a queue bob.nomad.jobs. The current one declares bob.container.jobs. The idea is that when a pipeline runs, it can declare a dependency on the runner type, which is the second segment in the name.
Should declare a queue bob.broadcasts.<unique-id>. This is to receive broadcast messages like stop, pause, etc., which are runner specific.
Bind the jobs queue to the bob.direct exchange and the broadcasts queue to the bob.fanout exchange.
It should implement the Pipeline spec as described above and read the pipeline definitions from XTDB.
It should log its progress and pipeline status to XTDB.
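
The queue naming and bindings above can be sketched as plain data, without committing to any particular RabbitMQ client. This is only an illustration: `runner_topology`, `runner_type_of` and `RunnerTopology` are hypothetical names, and routing keys are left out because they are not described here.

```python
import uuid
from dataclasses import dataclass
from typing import Optional, Tuple

DIRECT_EXCHANGE = "bob.direct"
FANOUT_EXCHANGE = "bob.fanout"


@dataclass(frozen=True)
class RunnerTopology:
    jobs_queue: str
    broadcast_queue: str
    bindings: Tuple[Tuple[str, str], ...]  # (queue, exchange) pairs


def runner_topology(runner_type: str, unique_id: Optional[str] = None) -> RunnerTopology:
    """Queues and bindings a runner of the given type would declare."""
    unique_id = unique_id or uuid.uuid4().hex  # any per-runner unique id works
    jobs = f"bob.{runner_type}.jobs"
    broadcasts = f"bob.broadcasts.{unique_id}"
    return RunnerTopology(
        jobs_queue=jobs,
        broadcast_queue=broadcasts,
        bindings=((jobs, DIRECT_EXCHANGE), (broadcasts, FANOUT_EXCHANGE)),
    )


def runner_type_of(jobs_queue: str) -> str:
    """The runner type is the second segment of the jobs queue name."""
    parts = jobs_queue.split(".")
    if len(parts) != 3 or parts[0] != "bob" or parts[2] != "jobs":
        raise ValueError(f"not a jobs queue name: {jobs_queue!r}")
    return parts[1]
```

For example, `runner_topology("nomad")` yields the jobs queue `bob.nomad.jobs`, and `runner_type_of("bob.container.jobs")` gives back `container`.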

Here are some links which point to the implementation in the current runner for reference:

The queue setup
Common config for the queue and db connections.
Logging to XT
Clojure specs for all of the data and commands that pass between boundaries in the Bob cluster, including the pipeline spec
Implementation of the Pipeline spec

This, as you can see, is a bit non-trivial, but for me this seems necessary to achieve enough decoupling for scale. I'm happy to give more info and anything else that's needed and will take in things from this to improve docs! Thanks again!

lispyclouds commented 1 year ago

Also I'm totally up for revisiting the pipeline spec and the API expectations! If some things don't make sense in some contexts we should address it and I think such scenarios might just come up. Happy to explore recommendations!

Vaelatern commented 1 year ago

So my biggest issue with this right now is that nomad runs jobs... if we are to do a single step in a pipeline, we'd need to issue a job to nomad using some container like kaniko and then run the line and push to a registry....

Kaniko is best given just a Dockerfile and building that. That would map very well to nomad jobs: one build, one job.

But this too leads to a problem where Pipelines basically become one build step: run kaniko (with possible other steps for other resources and artifacts...)

Thoughts on this abstraction mismatch?

lispyclouds commented 1 year ago

Disclaimer: I'm not too familiar with Nomad internals and this is me trying to understand your comment more.

The way I am imagining this is that given Bob's pipeline spec, the runner can translate it to a Nomad job spec. I would say that not using docker but a fork/exec driver could be a better starting point, as each step in the pipeline could be mapped to a Nomad task in that job. So not very sure why Kaniko would be relevant here?

The reason I'm saying this is that Nomad provides an equally good and similar interface for containerized jobs vs others, and since the container ones have been taken care of by the existing runner, the other one could be of more relevance, opening up more avenues like discussed above. This runner could do things like attach to the submitted job and filter out Allocation and other events and stream things back to the DB.

As for implementing the pipeline spec, the image attribute could be completely ignored by the Nomad runner and it could implement the rest of the steps, resources and artifacts. They are all defined in agnostic terms via HTTP/REST interfaces.
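
A rough sketch of that translation, assuming a pipeline is a map with an `image` and a list of `steps` that each carry a `cmd` (field names as I understand the pipeline spec; treat them as assumptions). The output follows the shape of Nomad's JSON job format with the `raw_exec` driver; `pipeline_to_nomad_job` is a hypothetical helper, not an existing API.

```python
import json


def pipeline_to_nomad_job(job_id: str, pipeline: dict, driver: str = "raw_exec") -> dict:
    """Map each pipeline step to one Nomad task; the image is ignored on purpose."""
    tasks = []
    for index, step in enumerate(pipeline["steps"]):
        tasks.append({
            "Name": f"step-{index}",
            "Driver": driver,
            # Run the step's command through a shell, outside any container.
            "Config": {"command": "/bin/sh", "args": ["-c", step["cmd"]]},
        })
    return {
        "Job": {
            "ID": job_id,
            "Name": job_id,
            "Type": "batch",  # run to completion rather than as a service
            "TaskGroups": [{"Name": "pipeline", "Tasks": tasks}],
        }
    }


if __name__ == "__main__":
    demo = {"image": "ignored", "steps": [{"cmd": "echo hello"}, {"cmd": "ls"}]}
    print(json.dumps(pipeline_to_nomad_job("demo", demo), indent=2))
```

One caveat with this mapping: as far as I know, Nomad does not by itself run the tasks of a group one after another, so step ordering would still need to be handled somewhere.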

Hopefully I'm understanding your perspective to some degree and lemme know of your thoughts!

lispyclouds commented 1 year ago

Also this is giving me an idea that if this runner works out we might as well use Nomad for everything, ditch the other runner? 😅

Vaelatern commented 1 year ago

One of the best notions of modern CI is that the build doesn't happen "on the box" but in a controlled environment. You need git available? Get a base container with git. Python? Java? OCaml? Rust? Just grab a container that has what you need. Benefits include security (with isolation), consistency (with the same base used every time), and matching workflows on every platform.

Nomad tasks do not have a complicated pipeline story like Bob's pipelines. There are 3 buckets: Pre, Run, Post. A pipeline would put resource fetching in Pre, artifact uploading in Post, and the step in Run.

That means each step gets its own nomad job, and bob submits them in series depending on the result of that step.
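
To make that concrete, here is a small sketch of the two ideas: sorting a step into the Pre/Run/Post buckets, and submitting one job per step in series, stopping at the first failure. Everything here is hypothetical: `plan_step`, `run_in_series` and the `submit_and_wait` callback are made-up names, and the `needs_resource` / `produces_artifact` fields are my reading of the pipeline spec.

```python
from typing import Callable, Iterable, Tuple


def plan_step(step: dict) -> dict:
    """Sort one pipeline step into Pre (resources), Run (command), Post (artifacts)."""
    return {
        "pre": [step["needs_resource"]] if "needs_resource" in step else [],
        "run": step["cmd"],
        "post": [step["produces_artifact"]["name"]] if "produces_artifact" in step else [],
    }


def run_in_series(steps: Iterable[dict],
                  submit_and_wait: Callable[[dict], int]) -> Tuple[bool, int]:
    """Submit one job per step, in order; stop as soon as a step fails.

    submit_and_wait is supplied by the caller: it submits the job for a
    step and blocks until it finishes, returning its exit code.
    Returns (all_succeeded, number_of_steps_submitted).
    """
    submitted = 0
    for step in steps:
        submitted += 1
        if submit_and_wait(plan_step(step)) != 0:
            return False, submitted
    return True, submitted
```

Passing the submit function in keeps the ordering logic independent of how jobs actually reach nomad.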

By the way, Bob has an "implicit artifact" from each stage: the container after commit was called.

Ok, here is a question. The above is a lot to process, but here's a simpler question:

Let's say I want to produce an OCI image in my CI pipeline, so I can deploy it later in dev then stage then prod (let's say). What would the bob pipeline (as of today) look like to do that?

Vaelatern commented 1 year ago

Ok here's a crazy suggestion.

Buildah used as a library and also a runner built in Go as a nomad plugin.

Would require the runner-bob link be open enough to make this work. The actual runner would likely be a daemon that sits and submits jobs to nomad using that runner.

What I'm saying is, with the simpler question I asked above answered (how would you produce an OCI image as a bob artifact today), I think this could be the first CI/CD solution that actually works natively on nomad.

lispyclouds commented 1 year ago

Sorry, this took me a while to get to, been quite tied up in things!

> Let's say I want to produce an OCI image in my CI pipeline, so I can deploy it later in dev then stage then prod (let's say). What would the bob pipeline (as of today) look like to do that?

The way to do this in the current setup would be:

use the relevant container image with the tools you need like you said
produce the artifact
install a userland OCI image builder like Kaniko or Buildah or use another pipeline with that as a base image and use the artifact from this as an internal artifact
build and push the image to the registry

Using a userland/non-root builder is recommended, as you said as well; all builds are isolated.

> That means each step gets its own nomad job, and bob submits them in series depending on the result of that step.

That seems to map really well with the structure here! I think it should work. Would be excited to see an implementation!

> By the way, Bob has an "implicit artifact" from each stage: the container after commit was called.

Yes, but they are also garbage collected along with the containers that they are committed from.

This is done by design, to leave the build env as it was before.

bob-cd / bob

non-container runner #113