typhon-ci / typhon

Nix-based continuous integration
https://typhon-ci.org/
GNU Affero General Public License v3.0

Support remote-building #47

Open maralorn opened 8 months ago

maralorn commented 8 months ago

Even in moderately small-scale setups, building everything on the machine hosting typhon might not be desirable.

nix build already has built-in support for distributing builds to remote builders. I hope enabling this feature in typhon will be relatively straightforward.
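As a rough illustration of the built-in mechanism, the sketch below assembles a `nix build` command line that passes a list of remote machines through the `--builders` option. The host names `builder1` and `builder2`, the installable `.#default`, and the helper `builders_arg` are assumptions made up for this example; nothing here reflects how typhon itself invokes Nix.

```python
def builders_arg(machines):
    """Format (store URI, system) pairs as a value for Nix's `builders` setting.

    Entries are separated by ';'. Only the first two fields of each entry
    (URI and platform) are filled in here.
    """
    return " ; ".join(f"{uri} {system}" for uri, system in machines)


# Hypothetical remote builders, for illustration only.
machines = [
    ("ssh://builder1", "x86_64-linux"),
    ("ssh://builder2", "aarch64-linux"),
]

# Command line that could be handed to subprocess.run(); it is not executed here.
argv = ["nix", "build", ".#default", "--builders", builders_arg(machines)]
```

The same list could instead live in the machines file referenced by the `builders` setting, which would keep the CI service itself unaware of the remote hosts.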

adamcstephens commented 8 months ago

Nix distributed builds would be a good improvement indeed, though I consider the native capability quite inefficient (lots of extra copying) and not very resilient.

It would be much more involved for typhon to handle this on its own with a dedicated agent, but I think it may provide for a better experience in the end.

pnmadelaine commented 7 months ago

@adamcstephens we are in a constant state of debate with @W95Psp on the matter of having a dedicated agent for Nix builds!

Currently we actually do have one, which computes the dependency graph of every job and calls nix build individually on every node: https://github.com/typhon-ci/typhon/blob/main/typhon-core/src/build_manager.rs

This is because it is the only way I found around a problem I have with the Nix CLI: a failure in any dependency of a build will result in the cancellation of all dependencies, even if other builds are waiting for them too. This can be frustrating if several builds depend on the same derivation but one of them also depends on a failing derivation: the working derivation can end up being built several times.
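The idea of building each node of the dependency graph individually can be sketched as follows. This is a simplified, sequential illustration and not the code in build_manager.rs: the function `run_jobs`, the `build` callback (a stand-in for one `nix build` call on a single derivation), and the derivation names are all hypothetical.

```python
from graphlib import TopologicalSorter


def run_jobs(deps, jobs, build):
    """Build every derivation at most once, in dependency order.

    deps:  maps a derivation to the set of its direct dependencies
    jobs:  maps a job name to its top-level derivation
    build: callable(drv) -> bool, stand-in for building one derivation
    """
    status = {}
    # static_order() yields dependencies before the nodes that need them.
    for drv in TopologicalSorter(deps).static_order():
        if all(status[d] for d in deps.get(drv, ())):
            status[drv] = build(drv)
        else:
            # A dependency failed, so this node is never attempted.
            status[drv] = False
    return {job: status[drv] for job, drv in jobs.items()}


# Two jobs share one dependency; job-b also depends on a failing derivation.
deps = {
    "job-a": {"shared"},
    "job-b": {"shared", "broken"},
    "shared": set(),
    "broken": set(),
}
jobs = {"a": "job-a", "b": "job-b"}

calls = []


def fake_build(drv):
    calls.append(drv)
    return drv != "broken"


results = run_jobs(deps, jobs, fake_build)
```

In this sketch the failure of `broken` only affects `job-b`; `shared` is built a single time and `job-a` still succeeds.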

The current code for the agent is arguably very ugly, and is certainly one of our bottlenecks. Problems with the native remote building capabilities could be an argument to commit to our own agent, and to look into making it more efficient and more resilient.

adamcstephens commented 7 months ago

Have you tried using the --keep-going CLI flag to allow nix to process as much as it can even in the case of one branch of builds failing?

pnmadelaine commented 7 months ago

That would lead to jobs taking more time to fail, and to dependencies being built even though no succeeding job requires them. A dedicated agent allows for finer control, where a derivation keeps building until no job requires it anymore. Maybe this is unnecessary, but I like it :)
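One way to read the rule "a derivation keeps building until no job requires it anymore" is sketched below: a derivation stays wanted as long as it belongs to the closure of at least one job that has not been doomed by a failure. The functions `closure` and `still_wanted` and the derivation names are hypothetical and only illustrate the policy; they are not taken from typhon.

```python
def closure(deps, root):
    """Return root together with all of its transitive dependencies."""
    seen, stack = set(), [root]
    while stack:
        drv = stack.pop()
        if drv not in seen:
            seen.add(drv)
            stack.extend(deps.get(drv, ()))
    return seen


def still_wanted(deps, jobs, failed):
    """Derivations required by at least one job that can still succeed."""
    wanted = set()
    for root in jobs.values():
        needed = closure(deps, root)
        if not (needed & failed):  # no failure inside this job's closure
            wanted |= needed
    return wanted


deps = {
    "job-a": {"shared"},
    "job-b": {"shared", "broken", "slow"},
}
jobs = {"a": "job-a", "b": "job-b"}

before = still_wanted(deps, jobs, failed=set())
after = still_wanted(deps, jobs, failed={"broken"})
```

Once `broken` fails, `slow` is no longer in the wanted set and could be cancelled right away, while `shared` stays wanted because `job-a` still needs it. With `--keep-going`, by contrast, `slow` would be built to completion.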