NixOS / hydra

Hydra, the Nix-based continuous build system
http://nixos.org/hydra
GNU General Public License v3.0

Hydra doesn't build queued jobs, episode 4 #568

Open pikajude opened 6 years ago

pikajude commented 6 years ago

My private Hydra server shows 7 queued jobs, all of which are listed as x86_64-linux, none of which are running (queued 13 hours ago).

Based on reading previous issues along these lines, I tried creating nix.buildMachines in my system configuration and specifying supportedFeatures. While this does make localhost show up on the Hydra "Build machines" admin page, it doesn't affect builds running or not.

I'm using https://github.com/nixos/nixpkgs/commit/3688ab8f5d524cd54fd3f4e3d820d82f77f2940d without any hydra-specific overrides in my nixops system expression. My system is running nix-daemon with nix 2.0.4 and hydra-2017-11-21.

Here's the log output from hydra-queue-runner:

Jun 28 12:35:28 web hydra-queue-runner[2391]: loading build 521 (participle:mvc:build.ghc802.x86_64-linux)
Jun 28 12:35:28 web hydra-queue-runner[2391]: loading build 522 (participle:mvc:build.ghc822.x86_64-linux)
Jun 28 12:35:28 web hydra-queue-runner[2391]: loading build 523 (jude-web:master:build.ghc822.x86_64-linux)
Jun 28 12:35:28 web hydra-queue-runner[2391]: loading build 524 (jude-web:master:build.ghc843.x86_64-linux)
Jun 28 12:35:28 web hydra-queue-runner[2391]: loading build 525 (jude-web:master:build.ghc802.x86_64-linux)
Jun 28 12:40:28 web hydra-queue-runner[2391]: status: {"status":"up","time":1530214828,"uptime":300,"pid":2391,"nrQueuedBuilds":7,"nrUnfinishedSteps":518,"nrRunnableSteps":50,"nrActiveSteps":0,"nrStepsBuilding":0,"nrStepsCopyingTo":0,"nrStepsCopyingFrom":0,"nrStepsWaiting":0,"bytesSent":0,"bytesReceived":0,"nrBuildsRead":7,"buildReadTimeMs":613,"buildReadTimeAvgMs":87.5714,"nrBuildsDone":0,"nrStepsStarted":0,"nrStepsDone":0,"nrRetries":0,"maxNrRetries":0,"nrQueueWakeups":0,"nrDispatcherWakeups":48,"dispatchTimeMs":0,"dispatchTimeAvgMs":0,"nrDbConnections":2,"nrActiveDbUpdates":0,"memoryTokensInUse":0,"nrNotificationsDone":0,"nrNotificationsFailed":0,"nrNotificationsInProgress":0,"nrNotificationsPending":0,"nrNotificationTimeMs":0,"nrNotificationTimeAvgMs":0,"machines":{"localhost":{"enabled":true,"currentJobs":0,"idleSince":0,"nrStepsDone":0,"disabledUntil":0,"lastFailure":0,"consecutiveFailures":0}},"jobsets":{"jude-web:master":{"shareUsed":0,"seconds":0},"participle:.jobsets":{"shareUsed":0,"seconds":0},"participle:mvc":{"shareUsed":2.5,"seconds":5}},"machineTypes":{"builtin:local":{"runnable":50,"running":0,"waitTime":14950,"lastActive":0}},"store":{"narInfoRead":0,"narInfoReadAverted":0,"narInfoMissing":0,"narInfoWrite":0,"narInfoCacheSize":0,"narRead":0,"narReadBytes":0,"narReadCompressedBytes":0,"narWrite":0,"narWriteAverted":0,"narWriteBytes":0,"narWriteCompressedBytes":0,"narWriteCompressionTimeMs":0,"narCompressionSavings":0,"narCompressionSpeed":0}}

What's the next step to debug this?
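One place to start is the status line itself: it reports 50 runnable steps and 0 active ones, and its machineTypes section groups the waiting steps by the kind of machine they need. The following is a minimal Python sketch, not part of Hydra, that pulls out the machine types with runnable work but nothing running. The JSON is a trimmed copy of the status line above; the helper names are made up for illustration, and the "system:features" reading of the machineTypes key is an assumption based on how the key looks here.

```python
import json

# Trimmed copy of the status line from the log above; only the fields used here are kept.
STATUS_LINE = '''{"nrQueuedBuilds":7,"nrRunnableSteps":50,"nrActiveSteps":0,
"machines":{"localhost":{"enabled":true,"currentJobs":0}},
"machineTypes":{"builtin:local":{"runnable":50,"running":0,"waitTime":14950,"lastActive":0}}}'''


def stalled_machine_types(status):
    """Return machine types that have runnable steps but nothing running."""
    stalled = []
    for name, info in status.get("machineTypes", {}).items():
        if info.get("runnable", 0) > 0 and info.get("running", 0) == 0:
            stalled.append(name)
    return sorted(stalled)


def split_machine_type(name):
    """Split a machineTypes key into (system, [features]).

    Assumption: the key has the form "<system>:<feature>,<feature>".
    """
    system, _, features = name.partition(":")
    return system, [f for f in features.split(",") if f]


status = json.loads(STATUS_LINE)
for name in stalled_machine_types(status):
    system, features = split_machine_type(name)
    print(f"{name}: system={system!r} features={features} has runnable steps but none running")
```

For this log the only entry flagged is builtin:local, which suggests the waiting steps want a machine whose systems list includes builtin rather than x86_64-linux.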

dtzWill commented 6 years ago

This isn't really a solution but have you tried restarting hydra-queue-runner (systemctl restart hydra-queue-runner)? For whatever reason I need to do this every now and then when it gets "stuck". Attempts at isolating what causes such a state haven't panned out, unfortunately. Might not help but if you haven't tried it you definitely should :).

pikajude commented 6 years ago

Nope, restarting hydra-queue-runner doesn't do anything. The original post has the output that I see every time I restart it.

kquick commented 6 years ago

Try adding the following supportedFeatures line in your hydra master's /etc/nixos/configuration.nix:

nix.buildMachines = [
  {
    hostName = "localhost";
    systems = [ "builtin" "x86_64-linux" "i686-linux" ];
    supportedFeatures = [ "nixos-test" "benchmark" ];
  }
];

pikajude commented 6 years ago

Looks like adding builtin to the systems line was actually what fixed it here. Do you know why that's a requirement?

kquick commented 6 years ago

Not precisely. There were days of mystery, much searching and nixos-rebuilding, and eventually I stumbled on the above. :-) I can't remember the details, but this is an area that could use a little more illumination in hydra, as echoed by some of the other issues. I'm glad this was helpful to you as well.

BTW, you may be interested in my (currently underdocumented) vernix tool (https://github.com/kquick/vernix) which has a --hydra option that generates the hydra declarative configuration information. If you do try it out, I'd be appreciative of any feedback.

bjornfor commented 6 years ago

Looks like adding builtin to the systems line was actually what fixed it here.

Same here. So far I've been using system = "x86_64-linux";, since that's what's documented in man configuration.nix under "nix.buildMachines". But for the past few months (I think) my hydra has been broken because of this apparent need for systems = [ "builtin" ... ], which I had no idea about. (I tried to debug it once, but gave up. Too many bugs, too little time.)

sorki commented 6 years ago

Already fixed by 5a1f2a5; I suggest running the latest hydra if possible.

pikajude commented 6 years ago

@sorki I'd love to be able to, but generally the Hydra release that makes it into nixpkgs is the only one that ever works reliably. About half the time it fails to compile with the version of nix that's in my nixpkgs, and the other half of the time the server itself doesn't run (this time specifically, there's a Perl package missing from the release build).