NixOS / SC-election-2024

2024 Election for the Steering Committee

What are your thoughts about a NixOS-like thing for clusters? #117

Open lucasew opened 1 month ago

lucasew commented 1 month ago

Question

An issue I see with NixOS is that NixOS servers treat machines as pets, which is often normal and desired, but limits the potential to use Nix in clusters or grids of machines. Yeah, there are ways, like using NixOS as a platform to run some clustering software such as Nomad or Kubernetes, but then you lose most of the cool stuff NixOS has. One can still build the stuff that will be run with Nix to OCI containers but would need to manage and define services using some form of YAML, or HCL. Nix already has stuff for incremental copying of artifacts using nix-copy-closure and binary caches, but this is lost when you have to build an OCI container each time you iterate on a service to be run.
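To make the incremental-copy point concrete, here is a toy sketch in Python of the idea behind closure-based copying: compute the full dependency closure of the thing to deploy, then send only the store paths the target does not already have. This is a simplified model and not how nix-copy-closure is actually implemented; the store paths, the `REFERENCES` graph, and the function names are all made up for illustration.

```python
from typing import Dict, List, Set

# Hypothetical reference graph: store path -> store paths it depends on.
# Real store paths carry a hash prefix; these short names are placeholders.
REFERENCES: Dict[str, List[str]] = {
    "/nix/store/aaa-my-service-1.1": ["/nix/store/bbb-openssl", "/nix/store/ccc-glibc"],
    "/nix/store/ddd-my-service-1.0": ["/nix/store/bbb-openssl", "/nix/store/ccc-glibc"],
    "/nix/store/bbb-openssl": ["/nix/store/ccc-glibc"],
    "/nix/store/ccc-glibc": [],
}


def closure(root: str, references: Dict[str, List[str]]) -> Set[str]:
    """Return the root plus everything it transitively depends on."""
    seen: Set[str] = set()
    stack = [root]
    while stack:
        path = stack.pop()
        if path in seen:
            continue
        seen.add(path)
        stack.extend(references[path])
    return seen


def paths_to_copy(
    root: str, references: Dict[str, List[str]], already_on_target: Set[str]
) -> Set[str]:
    """Only paths missing on the target need to be transferred."""
    return closure(root, references) - already_on_target


if __name__ == "__main__":
    # The target already runs version 1.0, so it holds that whole closure.
    on_target = closure("/nix/store/ddd-my-service-1.0", REFERENCES)
    # Deploying 1.1 only needs the one path that changed.
    print(sorted(paths_to_copy("/nix/store/aaa-my-service-1.1", REFERENCES, on_target)))
```

In this toy model, iterating on the service transfers a single new path while the shared dependencies stay where they are, which is the property I would like to keep when running services across a cluster.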

BTW systemd already has lots of primitives that could be used for this. I think the biggest challenge here would be to unify networking between services.

What are your thoughts, and vision, about this possible line of work?

Candidates I'd like to get an answer from

No response

Reminder of the Q&A rules

Please adhere to the Q&A guidelines and rules

cafkafk commented 1 month ago

This is something I really want to see. Like REALLY. I also know it would be incredibly hard. I'm currently gently dogfooding some very primitive concepts for potentially creating some sort of nix-rebuild for a "NixOS cluster", but those are efforts I consider outside of the scope of the steering committee.

Also purely on the technical side of this:

"One can still build the stuff that will be run with Nix to OCI containers but would need to manage and define services using some form of YAML, or HCL"

The way we're currently doing it at DBC is that we have 3 datacenters with NixOS hosts that serve as container nodes and control planes. We specify manifests in the Nix language with our tool kubernixos, and Docker container images in the Nix language as well, served by another tool, wharfix, which mimics a Docker registry and makes Nix expressions available as images. All of this is deployed with the Nix deployment tool morph.
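As a rough illustration of the general idea of generating manifests from a programming language instead of hand-writing YAML, here is a minimal Python sketch. It is not the kubernixos or wharfix API, and it is in Python rather than Nix; the function name, the service name, and the registry host are hypothetical. It relies only on the fact that Kubernetes accepts manifests as JSON as well as YAML.

```python
import json
from typing import Any, Dict


def deployment_manifest(name: str, image: str, replicas: int = 1) -> Dict[str, Any]:
    """Build an apps/v1 Deployment as plain data (hypothetical helper)."""
    labels = {"app": name}
    return {
        "apiVersion": "apps/v1",
        "kind": "Deployment",
        "metadata": {"name": name, "labels": labels},
        "spec": {
            "replicas": replicas,
            "selector": {"matchLabels": labels},
            "template": {
                "metadata": {"labels": labels},
                "spec": {"containers": [{"name": name, "image": image}]},
            },
        },
    }


if __name__ == "__main__":
    # "registry.example.org" stands in for whatever registry serves the images.
    manifest = deployment_manifest(
        "my-service", "registry.example.org/my-service:latest", replicas=3
    )
    print(json.dumps(manifest, indent=2))
```

The point of the sketch is only that the manifest becomes ordinary data that can be composed, parameterized, and checked by the language, rather than a template that is edited by hand.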

Personally, I could imagine a future where we or another company would use microvm.nix to create hosts as a replacement for HCL (this is also something I'm very slowly and humbly working on dogfooding privately). By combining the creation of VMs, deployment of hosts (with e.g. iPXE and diskos and a bit of magic), and NixOS containers with something like wharfix and kubernixos, a deployment tool could exist that deploys an entire Kubernetes cluster to hybrid-cloud platforms. And hey, perhaps one day such a tool could absorb the orchestration aspects of such a cluster, replacing kubectl as well as maybe... one day... Kubernetes...

But as a steering committee member, these wouldn't be areas of focus. I think governance issues are much more pressing, and technical developments like these are much better left to the bottom-up efforts of those pushing the commits. The steering committee would best serve its role as a guiding and mediating force, and as one that makes technical assistance available in cases of disputes and for concerns about integration into the wider ecosystem.

mschwaig commented 1 month ago

I agree with @cafkafk; I also see this as an issue that is outside the scope of the steering committee.

Such deeply technical work can only be done in the community and not on the SC level, as the SC should focus on governance. The SC can start looking at things like this once they have already gained support in the community.

I'm happy to see people working on these kinds of issues, though.

tomberek commented 1 month ago

I like the idea. I suspect it would take the form of congruence at the infra-layer and convergence at the system layer. There are some experiments along these lines already that people should be familiar with, especially the Disnix family of tools: https://github.com/svanderburg/disnix

From what I can gather, I think the proposed system could be implemented in the NixOps4 work-in-progress.

roberth commented 1 month ago

This is a rather technical matter, but I like it, and I believe this can and should be done when we have better CI and testing infrastructure to pick and choose NixOS module sets in more than the (more or less) one traditional way.

Perhaps the SC could serve a small role by polling interest and connecting people, but ideally we enable everyone to organize these things themselves instead; make that really easy.

proofconstruction commented 1 month ago

This is an important concern. Some of us run Kubernetes on NixOS hosts for some workloads, but this necessarily means dealing with bad formats like YAML. Projects like kubenix are attempting to address this, but there are very few industrial cluster operators in the NixOS community, so the adoption pace is glacial at best. I have some (unpublished, sorry) work in this direction but frankly it's a very low priority currently. K8s-on-NixOS marries the NixOS we love with the k8s the broader industry is familiar with, so it's Good Enough For Now.