Closed dasJ closed 2 years ago
This issue has been mentioned on NixOS Discourse. There might be relevant details there:
https://discourse.nixos.org/t/zero-hydra-failures-22-05/19051/1
Perhaps we should tag PRs that fix Hydra failures with 0.kind: build failure to encourage reviewing? https://github.com/NixOS/nixpkgs/pulls?q=is%3Aopen+is%3Apr+label%3A%220.kind%3A+build+failure%22
Edit by @dasJ: I added the instruction to the issue description.
Some packages I maintain fail on Hydra with the error OSError: Too many open files, but they build fine locally on nixpkgs master.
Example: https://hydra.nixos.org/build/175654425/nixlog/1/tail
I'm not sure what I can do on my end.
I restarted some, but that scipy build failed many times in a row, so restarting it again doesn't seem to make sense. I'd suggest trying to skip the tests that cause such problems. EDIT: https://github.com/NixOS/nixpkgs/issues/170143
For Haskell, please remember to target any PRs to the haskell-updates branch!
Edit by @dasJ: I added this hint to the issue description.
Since we've already marked (most) failures as broken, you need to check manually if your favorite package still works, instead of looking at failed builds on Hydra.
Additionally, here is a list of the more prominent problems (among Haskell packages exposed via top-level pkgs) to look into. Note that some of these are unmaintained and probably not worth fixing, or should be removed in the long run.
hyper-haskell (a problem here is also the electron version used)
jl (https://github.com/NixOS/nixpkgs/issues/168256)
hasura-graphql-engine (mostly blocked on upstream)
krank
cedille
diagrams-builder
glirc
hedgewars (exception: fix can go to master)
icepeak
madlang
nix-delegate
nix-deploy
stack2nix
stutter
tweet-hs
Most Octave packages are broken because of a change in the way Octave handles its packages. #168943 has further discussion of this issue.
https://zh.fail/ is cool but I think we may be needing a logarithmic y-axis before long...
I've created an upstream PR for jl.
a logarithmic y-axis
That won't work: when we get to 0, the graph will be at negative infinity. We would need a symmetric log plot.
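To make the difference concrete: a plain log scale is undefined at zero failures, while a symmetric log transform stays finite there. The sketch below is only an illustration. The name symlog and the linthresh parameter are assumptions and are not taken from zh.fail.

```javascript
// Hypothetical helper, not part of zh.fail: a symmetric log transform.
// Small values (|x| well below linthresh) are mapped almost linearly,
// large values logarithmically, and 0 maps to 0 instead of -Infinity.
function symlog(x, linthresh = 1) {
  return Math.sign(x) * Math.log10(1 + Math.abs(x) / linthresh);
}

console.log(Math.log10(0)); // -Infinity: a plain log axis breaks at zero failures
console.log(symlog(0));     // 0: the symmetric variant stays finite
console.log(symlog(5000));  // large counts are still compressed logarithmically
```

A larger linthresh widens the near-linear region around zero, which is the usual knob on this kind of scale.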
If/when we get to 0 I'm perfectly happy for the graph to explode, in fact it would be a fitting celebration.
staging-next merged now.
Looks like the pandas fix (#173177) missed that staging-next merge? Does that mean it won't make the release?
As a trivial fix to a few darwin non-builds with no linux rebuilds, it should have gone straight to master.
If I'm reading the release schedule correctly, there's still a staging-next iteration left before branch-off (assuming we're late on the schedule and not early).
jl is now fixed (#168256)
tinygo should be fixed with #157129
Filed an upstream bug as https://sourceware.org/PR29162 for the gnat / glibc incompatibility.
I don’t see a filter option on the hydra page. How can I filter for failed jobs on my system?
@schuelermine: that's the "search jobs by name" field. (you could directly edit the URL, too)
What’s the syntax for filters?
None, AFAIK. It's contiguous substring matching, or whatever you'd call it.
Oh, that’s unintuitive. I would expect “search jobs by name” to search by name only
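Since the filter is, per the replies above, a plain substring match that can also be set by editing the URL, here is a minimal sketch of building such a link. The URL shape mirrors the example link given in the issue description. The function name hydraEvalUrl and its parameters are made up for illustration and are not a Hydra API.

```javascript
// Sketch: build a Hydra eval URL with the filter field pre-filled.
// The query parameters (filter, compare, full) and the tab anchor are copied
// from the example link in the issue description; everything else is hypothetical.
function hydraEvalUrl(evalId, { filter = "", compare, tab = "tabs-still-fail" } = {}) {
  const url = new URL(`https://hydra.nixos.org/eval/${evalId}`);
  url.searchParams.set("filter", filter); // substring to match, e.g. a system name
  if (compare !== undefined) {
    url.searchParams.set("compare", String(compare)); // eval to compare against
  }
  url.searchParams.set("full", "");
  url.hash = tab; // which result tab to open
  return url.toString();
}

console.log(hydraEvalUrl(1719540, { filter: "x86_64-linux", compare: 1719463 }));
```

Called with the eval numbers from the issue description, this should reproduce the example link given there.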
release-22.05 has been branched off, so remember to also add the backport release-22.05 label to your pull requests :tada:
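For quick reference, the branch rules from the issue description (master by default, staging for roughly 500+ rebuilds, haskell-updates for Haskell fixes, and release-22.05 or staging-22.05 for backports) can be restated as a small lookup. This is only a sketch. The function names are hypothetical, and the exact cutoff of 500 is an assumption, since the description only says "around 500+".

```javascript
// Hypothetical helpers restating the branch rules from the issue description.
// Which branch should the original fix target?
function targetBranch({ rebuilds = 0, haskell = false } = {}) {
  if (haskell) return "haskell-updates"; // Haskell fixes go here instead
  return rebuilds >= 500 ? "staging" : "master"; // "around 500+" in the source
}

// Which branch should the backport target after branch-off?
// The description only covers fixes that landed in master or staging.
const BACKPORT_TARGETS = new Map([
  ["master", "release-22.05"],
  ["staging", "staging-22.05"],
]);

function backportBranch(landedIn) {
  return BACKPORT_TARGETS.get(landedIn); // undefined if the source gives no rule
}

console.log(targetBranch({ rebuilds: 12 }));  // small fix
console.log(targetBranch({ rebuilds: 800 })); // mass rebuild
console.log(backportBranch("staging"));
```

Per-package exceptions mentioned in the thread are deliberately not modelled here.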
I've put together a rough list of packages broken by the stdenv update (#zhfff): https://gist.github.com/cab404/96259f25450d778e744108c0ea9bfaa8
It's parsed from the Hydra output with something like this:
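// Run in the browser console on the Hydra eval page.
[...document.querySelector("#tabs-now-fail > table:nth-child(1) > tbody:nth-child(2)").children]
  // keep only rows whose status icon reads "Failed"
  .filter((row) => row.getElementsByClassName("build-status")[0].getAttribute("alt") === "Failed")
  // keep only rows whose sixth cell is x86_64-linux
  .filter((row) => row.children[5].textContent === "x86_64-linux")
  // return the text of the third cell of each remaining row
  .map((row) => row.children[2].textContent)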
The list only includes packages that were broken in eval 1756238 and are still broken in eval 1763443.
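The "broken in one eval and still broken in a later one" step amounts to intersecting two lists of failing job names. A minimal sketch, assuming the two lists have already been scraped. The function name and the job names in the example are made up for illustration.

```javascript
// Sketch: keep the jobs that fail in both evals.
// Both arguments are arrays of job names, however they were obtained.
function stillFailing(failedInOlder, failedInNewer) {
  const newer = new Set(failedInNewer); // set lookup avoids a nested scan
  return failedInOlder.filter((name) => newer.has(name));
}

// Made-up job names, purely for illustration.
const older = ["foo.x86_64-linux", "bar.x86_64-linux", "baz.x86_64-linux"];
const newer = ["bar.x86_64-linux", "baz.x86_64-linux", "qux.x86_64-linux"];
console.log(stillFailing(older, newer));
```

Jobs that only started failing in the newer eval are dropped, so the result stays scoped to breakage that was already present in the older one.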
Mission
Every time we branch off a release, we stabilize the release branch. Our goal here is to have as few jobs as possible failing on the trunk/master jobsets. We call this effort "Zero Hydra Failures". I'd like to emphasize that, while it's great to focus on zero as our goal, it's essential that all deliverables that worked in the previous release also work here.
Please note the changes included in RFC 85.
Most significantly, branch off will occur on 2022 May 22; prior to that date, ZHF will be conducted on master; after that date, ZHF will be conducted on the release channel using a backport workflow similar to previous ZHFs.
Jobsets
trunk Jobset (includes linux, darwin, and aarch64-linux builds)
nixos/combined Jobset (includes many nixos tests)
How to help (textual)
Select an evaluation of the trunk jobset
Find a failed job ❌️. You can use the filter field to scope packages to your platform, or search for packages that are relevant to you. Note: you can filter by architecture, e.g.: https://hydra.nixos.org/eval/1719540?filter=x86_64-linux&compare=1719463&full=#tabs-still-fail
Check whether a PR is already open for the package. If there is one, please help review it.
If there is no open PR, troubleshoot why it's failing and fix it.
Create a Pull Request with the fix targeting master, and wait for it to be merged. If your PR causes around 500+ rebuilds, it's preferred to target staging to avoid compute and storage churn. If your PR is fixing Haskell packages, target the haskell-updates branch instead.
(after 2022 May 22) Please follow backporting steps and target the release-22.05 branch if the original PR landed in master, or staging-22.05 if the PR landed in staging. Be sure to do git cherry-pick -x <rev> on the commits that landed in unstable. @jonringer created a video covering the backport process.
Always reference this issue in the body of your PR:
Please ping @NixOS/nixos-release-managers on the PR and add the 0.kind: build failure label to the pull request. If you're unable to because you're not a member of the NixOS org, please ping @dasJ, @tomberek, @jonringer, @Mic92.
How can I easily check packages that I maintain?
I have created an experimental website that automatically crawls Hydra and lists packages by maintainer and lists the most important dependencies (failing packages with the most dependants). You can reach it here: https://zh.fail
If you prefer the command-line way, you can also check failing packages that you maintain by running:
New to nixpkgs?
Packages that don't get fixed
The remaining packages will be marked as broken before the release (on the failing platforms). You can do this like:
Closing
This is a great way to help NixOS, and it is a great time for new contributors to start their nixpkgs adventure. :partying_face:
As with the feature freeze issue, please keep discussion here to a minimum so you don't ping all maintainers (although relevant comments can of course be added here if they are directly ZHF-related), and ping me or the release managers team in the respective issues.
cc @NixOS/nixpkgs-committers @NixOS/nixpkgs-maintainers @NixOS/release-engineers
Related Issues