numtide / nar-serve

Unpack and serve NAR file content on the fly
Apache License 2.0

Volume plugin for container orchestrators? #15

Open mikepurvis opened 3 years ago

mikepurvis commented 3 years ago

Is your feature request related to a problem? Please describe.

My situation is that I build a system that totals around 5-10GB (currently with debs, but transitioning to Nix) and then run an extensive simulation-based testing framework on that system. Regardless of what I do about the build (e.g., moving to Hydra), I'd like to be able to continue using Jenkins and Kubernetes workers for the testing aspect of it.

Some "conventional" options could be:

Both of these options potentially incur a lot of unnecessary transfer and archive manipulation and don't really leverage the power of Nix.

Describe the solution you'd like

This may be well out of scope for this project, but it would be super cool if it was able to act as a Kubernetes volume plugin and transparently fill requests for /nix/store paths from a cache that was shared between all containers on the host:

https://github.com/kubernetes/community/blob/master/sig-storage/volume-plugin-faq.md

This could be set up as an overlay so that writes would always be container-local, but reads would fall through to the volume.
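To make the overlay idea concrete, here is a minimal sketch of the read-through/write-local semantics described above. It is only a toy model in user-space Python, not a volume plugin: a real setup would presumably rely on the kernel's overlayfs (or similar), and the `OverlayView` class, the `lower`/`upper` names, and the file names are all hypothetical.

```python
import tempfile
from pathlib import Path


class OverlayView:
    """Toy model of overlay semantics: reads fall through, writes stay local."""

    def __init__(self, lower: Path, upper: Path) -> None:
        self.lower = lower  # shared, read-only layer (assumed host-wide store cache)
        self.upper = upper  # container-local writable layer

    def read(self, rel: str) -> bytes:
        # Check the container-local layer first, then fall through to the shared one.
        for layer in (self.upper, self.lower):
            candidate = layer / rel
            if candidate.is_file():
                return candidate.read_bytes()
        raise FileNotFoundError(rel)

    def write(self, rel: str, data: bytes) -> None:
        # Writes never touch the shared layer.
        target = self.upper / rel
        target.parent.mkdir(parents=True, exist_ok=True)
        target.write_bytes(data)


def demo() -> dict:
    with tempfile.TemporaryDirectory() as tmp:
        lower = Path(tmp) / "lower"
        upper = Path(tmp) / "upper"
        lower.mkdir()
        upper.mkdir()
        (lower / "abc-hello").write_bytes(b"shared")

        view = OverlayView(lower, upper)
        before = view.read("abc-hello")    # served from the shared layer
        view.write("abc-hello", b"local")  # lands in the local layer only
        return {
            "before": before,
            "after": view.read("abc-hello"),
            "lower_unchanged": (lower / "abc-hello").read_bytes() == b"shared",
        }
```

Calling `demo()` shows the read initially served from the shared layer, then shadowed by the container-local write, with the shared copy left untouched.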

Describe alternatives you've considered

See above.

nixinator commented 2 years ago

If you want to be rid of that Java butler forever and you're switching to Nix, take a look at the Nix-native Hercules CI. You might be able to get all the tests you need with effects.

However, your mileage may vary...

in a world with no master, we require no slaves (butlers). ;-)

zimbatm commented 2 years ago

It's a cool idea, too bad it took me a year to see this issue. @mikepurvis it's probably too late. If you're interested it's something we could actually build for you.

mikepurvis commented 2 years ago

@zimbatm Not too late at all!

@nixinator Hercules CI (while cool!) is not a good fit for us because we want to be able to run on-prem and we need builds that aren't tied to GitHub repos. In fact, Hercules seems to double down on what is currently our biggest frustration with Hydra, which is that it imposes an unnecessary projects/jobsets/evals flowdown on everything. All we really want is a dirt-simple API to send pre-locked flakes to, and then get back a hash URL for whatever it built from them, what the steps were, and what all their logs looked like.

Anyway, currently we're making do with pre-cooked containers on Jenkins, but I don't love it. It would still be totally awesome to have a first-class story for running Nix environments inside Kubernetes, and a huge piece of that has got to be a better story for managing the store.

zimbatm commented 2 years ago

It reminds me a bit of https://nixery.dev/. The author is actually working with us, and it's something we could provide consulting for. Even if it's just to chat, happy to have a call: https://numtide.com/contact

mikepurvis commented 2 years ago

Yeah! I had seen Nixery on HN this morning, and I guess that kind of addresses the same underlying need from the opposite direction: instead of bringing a native Nix store capability to Kubernetes, adapt Nix store paths (or pools of them) to be stored and composed as OCI container layers. This definitely has the benefit of working out of the box with all standard container tooling, and in particular with container stores like Harbor.

Both approaches have merit, though I think one big drawback of the Nixery approach is that you have to know upfront what all you're going to want to have in your container environment. As soon as you start running that container (say, for a CI job), and do your first nix build operation, it'll be back to querying the remote binary cache, pulling NARs of dependencies over the wire to be unpacked locally, etc.
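For readers less familiar with that fallback path, here is a rough sketch of the first step a client takes against a binary cache: deriving the `.narinfo` URL from a store path and reading the `URL:` field that points at the NAR to download. The helper names, the `cache.example.org` host, and the hash and field values are made up for illustration; only the `<hash>.narinfo` layout and the `Key: value` format are taken from how Nix binary caches work.

```python
from pathlib import PurePosixPath


def narinfo_url(cache_url: str, store_path: str) -> str:
    """Build the .narinfo URL for a store path like /nix/store/<hash>-<name>."""
    base_name = PurePosixPath(store_path).name  # "<hash>-<name>"
    digest = base_name.split("-", 1)[0]         # the hash part before the first dash
    return f"{cache_url.rstrip('/')}/{digest}.narinfo"


def parse_narinfo(text: str) -> dict:
    """Parse the 'Key: value' lines of a narinfo document into a dict."""
    fields = {}
    for line in text.splitlines():
        key, sep, value = line.partition(": ")
        if sep:
            fields[key] = value
    return fields


# Hypothetical store path and narinfo body, for illustration only.
STORE_PATH = "/nix/store/0123456789abcdfghijklmnpqrsvwxyz-hello-2.12"
SAMPLE_NARINFO = (
    "StorePath: /nix/store/0123456789abcdfghijklmnpqrsvwxyz-hello-2.12\n"
    "URL: nar/examplefilehash.nar.xz\n"
    "Compression: xz\n"
)

info_url = narinfo_url("https://cache.example.org/", STORE_PATH)
nar_rel = parse_narinfo(SAMPLE_NARINFO)["URL"]  # path of the NAR relative to the cache root
```

Every dependency missing from the local store costs one such lookup plus a NAR download and unpack, which is the per-job overhead a shared store on the host would avoid.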

zimbatm commented 2 years ago

We're also exploring things like https://github.com/flokli/nix-casync that would minimize transfers by syncing only deltas.

It would be nice if Docker and Nix could share the same underlying storage mechanism so that there is only one garbage collection and one way of pulling things.

These are all ideas that we're discussing internally (the CSI volume is new though), but we need more funds and time to be able to tackle them.