Closed: jonringer closed this issue 4 years ago
This issue has been mentioned on NixOS Discourse. There might be relevant details there:
https://discourse.nixos.org/t/nixos-20-09-zero-hydra-failures/8928/1
It can also be useful to refer to https://hydra.nixos.org/eval/1611944, a master
evaluation taken around the time of the branchoff, which will have a much longer history you can use to dig back to a package's last successful build and get an indication of the changes which might have caused it to break.
Also, people can use @samueldr's nix-review-tools to create a report showing which packages are causing the most failures.
I usually do something like:
mkdir cache
cd cache
../eval-report $evaluation_id > index.html
xdg-open index.html
The nearest NixOS:trunk-combined Hydra evaluation is 1611942 at commit 6152513, 6 commits after the last shared commit 53ce0bf.
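A figure like the "6 commits after the last shared commit" above can be checked by hand in a nixpkgs checkout. This is only a minimal sketch: the function name `commits_after_shared` is made up for illustration, and it assumes both branches are fetched locally. It relies on the standard `git merge-base` and `git rev-list --count` commands.

```shell
# commits_after_shared COMMIT BRANCH_A BRANCH_B
# Prints how many commits COMMIT is ahead of the last commit
# shared by BRANCH_A and BRANCH_B (hypothetical helper name).
commits_after_shared() {
  # Last commit both branches have in common (the branch-off point).
  base=$(git merge-base "$2" "$3") || return 1
  # Count commits reachable from COMMIT but not from that shared commit.
  git rev-list --count "$base..$1"
}
```

For the evaluation mentioned above, the call would look like `commits_after_shared 6152513 master release-20.09`, assuming those refs exist in your clone.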
However, our darwin story isn't as good [...]
It's blown up by ghc timing out. That has been occasionally happening and it should go away after some restart(s).
@vcunat should we pass "big-parallel" for the ghc build so that it has more cores?
Since last time this was useful to many, here's one way to build all packages which have you as maintainer:
* Pull from master or release-20.09 so you have the latest build.nix mentioned below
* run `nix-build maintainers/scripts/build.nix --argstr maintainer fgaz` (adjust the maintainer name to yourself)
We cannot ping @NixOS/nixos-release-managers as required in the issue if we’re not part of the organization, is it possible to fix that?
I know there's a more limited number of people who use nix on macOS, so I'd like to focus my efforts there. Is there a way of getting a list of jobs that are succeeding on nixpkgs/staging-20.09 and failing on nixpkgs/nixpkgs-20.09-darwin?
@uri-canva that is possible by opening an evaluation of nixpkgs/nixpkgs-20.09-darwin; then you have to hit the Compare to button and select nixpkgs/staging-20.09. All "Newly failing" builds are fine on staging-20.09, but fail on nixpkgs-20.09-darwin.
However it should be sufficient to filter for jobs matching x86_64-darwin in Search jobs by name, right?
Yes, but that would also show derivations that are failing for the same reason on linux and darwin, and would be fixed by fixing the linux derivations anyway.
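The distinction made above (failing on darwin only, rather than for a reason shared with linux) can also be approximated offline. This is a minimal sketch, not a Hydra feature. It assumes you have saved the names of failing jobs, one per line, in the `attr.system` form Hydra uses (for example bash.x86_64-linux). The function name `darwin_only_failures` and the input file are hypothetical.

```shell
# darwin_only_failures [FILE...]
# Reads failing job names (one per line, e.g. "foo.x86_64-darwin") and
# prints the attributes that fail on x86_64-darwin but not on x86_64-linux.
darwin_only_failures() {
  awk '
    # Remember attributes failing on linux, with the system suffix stripped.
    /\.x86_64-linux$/  { sub(/\.x86_64-linux$/, "");  linux[$0] = 1; next }
    # Remember attributes failing on darwin the same way.
    /\.x86_64-darwin$/ { sub(/\.x86_64-darwin$/, ""); darwin[$0] = 1 }
    # Keep only the darwin failures with no matching linux failure.
    END { for (p in darwin) if (!(p in linux)) print p }
  ' "$@" | sort
}
```

Usage would be something like `darwin_only_failures failing-jobs.txt`, or piping the list on stdin. Jobs for other systems (such as aarch64-linux) are ignored.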
Is there any way to get a list of my (currently broken) derivations like in some previous ZHF sprints?
@makefu you could try to build them with the command i posted in this thread
As an aside, I wish we could opt-in to failed build notification emails from hydra. I didn't know some of my packages were broken :-/
I actually tried that, but it looks like the process fails before building anything. I am now in the process of trying to fix all evaluation errors
~/nixpkgs git reset --hard upstream/master
HEAD is now at 5d131d33268 Merge pull request #97540 from danieldk/fix-clpeak
~/nixpkgs nix-build maintainers/scripts/build.nix --argstr maintainer makefu --show-trace --keep-going
error: while evaluating the attribute 'drvPath' at /home/makefu/nixpkgs/lib/customisation.nix:163:7:
while evaluating the attribute 'buildInputs' of the derivation 'python2.7-aresponses-2.0.0' at /home/makefu/nixpkgs/pkgs/development/interpreters/python/mk-python-derivation.nix:108:5:
while evaluating 'getOutput' at /home/makefu/nixpkgs/lib/attrsets.nix:464:23, called from undefined position:
while evaluating anonymous function at /home/makefu/nixpkgs/pkgs/stdenv/generic/make-derivation.nix:143:17, called from undefined position:
while evaluating 'callPackageWith' at /home/makefu/nixpkgs/lib/customisation.nix:117:35, called from /home/makefu/nixpkgs/pkgs/top-level/python-packages.nix:5446:20:
while evaluating 'makeOverridable' at /home/makefu/nixpkgs/lib/customisation.nix:67:24, called from /home/makefu/nixpkgs/lib/customisation.nix:121:8:
while evaluating anonymous function at /home/makefu/nixpkgs/pkgs/development/python-modules/pytest-asyncio/default.nix:1:1, called from /home/makefu/nixpkgs/lib/customisation.nix:69:16:
while evaluating 'makeOverridablePythonPackage' at /home/makefu/nixpkgs/pkgs/top-level/python-packages.nix:36:37, called from /home/makefu/nixpkgs/pkgs/development/python-modules/pytest-asyncio/default.nix:2:1:
while evaluating 'makeOverridable' at /home/makefu/nixpkgs/lib/customisation.nix:67:24, called from /home/makefu/nixpkgs/pkgs/top-level/python-packages.nix:38:12:
while evaluating anonymous function at /home/makefu/nixpkgs/pkgs/development/interpreters/python/mk-python-derivation.nix:31:1, called from /home/makefu/nixpkgs/lib/customisation.nix:69:16:
pytest-asyncio-0.14.0 not supported for interpreter python2.7
I'll be back once I've finished that and maybe encountered some actual build errors.
EDIT: is it preferable to pool all the py3k-related evaluation fixes ( disabled = !isPy3k; ) or should i create a PR for each package?
Since last time this was useful to many, here's one way to build all packages which have you as maintainer:
* ~apply #97514~ Pull from master so you have the latest build.nix mentioned below
* run `nix-build maintainers/scripts/build.nix --argstr maintainer fgaz` (adjust the maintainer name to yourself)
@fgaz won't this build on master? Don't we want to test the nixos-20.09 branch?
@bbigras well we're trying to fix both! But yes, I see what you mean, and I guess that pr needs a backport. I'll open another pr. edit: done & merged
@makefu I got a fix for you #97571
I ran into the same issue with python packages where unsupported interpreters throw an error
I've tried fixing springLobby but it fails to find libcurl. Did something change about that around 15th August?
EDIT: seems like a CMAKE bump
Thank you for clearly describing the contrib process for ZHF. I drive-by fixed a package I happened to use, but wouldn't have bothered contributing to ZHF if it wasn't so easy.
I found a few cases in perlPackages builds on Darwin where there's an issue with ld which is fixed by adding "export LD=$CC", which several packages appear to have done already. Some predicate on i686 or Darwin, and some do it unconditionally.
Is it worth opening an issue and trying to find a more general fix for all perlPackages, or should I just do one-off PRs for each one I find like this?
Or maybe do them both so there's at least a fix for 20.09?
@volth, do you know the best way forward for @treed?
- nix-build maintainers/scripts/build.nix --argstr maintainer fgaz
Is there a way to get past evaluation errors if there are some? (I get an error: bcrypt-3.2.0 not supported for interpreter python2.7)
What is to be done if a package is broken in release-20.09, but works in master? My concrete case is kmime; it was broken by commit ce4eb0b79b3b8e830d40345fb6457fac6ca9a9ec and still is in release-20.09, but is fine in master.
A backport I would guess.
I'd use `nix show-derivation -f. kmime > kmime.$(git describe HEAD).json` when the master and the release branch are checked out, and diff the outputs.
Is there a way to get past evaluation errors if there are some?
@knedlsepp either #97571 or #97647
@loewenheim see comments in #97242 for why kmime is failing
@thoughtpolice in january you said (in https://github.com/NixOS/nixpkgs/pull/77985#pullrequestreview-344972610):
I would also be fine with dropping FoundationDB 5.x builds, too, since they're pretty heavy to build and I imagine most people are running 6.x at this point.
Do you think now might be the time to do that?
pythonPackages.gipc looks like it'll be stuck waiting on https://github.com/jgehrcke/gipc/issues/103 (good thing I tried enabling the tests to discover that)
I was thinking about going through and disabling all the failing python27Packages that have explicitly dropped python 2 upstream, but I see this https://github.com/NixOS/nixpkgs/pull/92099. Is there any point in bothering? @jonringer
I've been somewhat doing that while reviewing other packages https://github.com/NixOS/nixpkgs/pulls?q=is%3Apr+author%3Ajonringer+disable+is%3Aclosed+python
The discussion you linked to is a bit more involved.
Not recursing into the attr set is the "easiest" solution, but will not be perfect for users that still need some python2 packages
I am a bit late and don't know if this is the right place as I am new to ZHF, but I had a look at some darwin packages with the help of nixpkgs-review-tools:
| Name | Count | Builds Locally | Builds on Hydra | Action Needed |
|---|---|---|---|---|
| python38-curio-1.2 | 112 | :heavy_check_mark: | :heavy_check_mark: | :white_check_mark: #98927 |
| qtbase-5.14.2 | 95 | :x: | :x: Cannot find feature sdk | @eqyiel https://github.com/eqyiel/nixpkgs/commit/50c2b5fd030459ff9508f65e9ffdebad0de36a63 has a WIP that removes the usr/bin/xcodebuild impurities. Unfortunately it needs at least the 10.13 sdk. See https://github.com/NixOS/nixpkgs/issues/95199 for more details. @matthewbauer Would now be a good time to upgrade the sdk? Probably that is too much work but is it possible to have multiple apple sdk versions? |
| qtbase-5.15.0 | 47 | :x: | use of undeclared identifier | not looked at; maybe the 5.14.2 actions fix this too |
| python3.7-notebook-6.1.3 | 40 | :heavy_check_mark: | :heavy_check_mark: | :white_check_mark: #98621 |
| python3.8-notebook-6.1.3 | 38 | :heavy_check_mark: | :heavy_check_mark: | :white_check_mark: #98621 |
| python3.8-fsspec-0.7.4 | 27 | :heavy_check_mark: | :x: def test_modified fails | I disabled the one failing test in this backport #98987 for review |
| python3.8-fs-2.4.11 | 18 | :x: | :x: resource tracker cannot free resource | Seems to be a multiprocessing issue with python 3.8.5 on macosx as it works with python37. I opened an issue here https://github.com/PyFilesystem/pyfilesystem2/issues/430 and disabled the tests in this pr https://github.com/NixOS/nixpkgs/pull/98619 which is open for review too. |
@tricktron re curio, I think it would be perfectly acceptable to disable these two tests on darwin given the impact. It's clearly not an issue with the library.
Also, notebook backport has been approved.
I also keep finding someone has got ahead of me discovering upstream issues (frequently @jonringer) so I think it might be useful to link to upstream blocker issues here to save people digging down to the same issue.
pythonPackages.localzone is held up on https://github.com/ags-slc/localzone/issues/1 (there's also a chance the author seeing the issue referenced might get spurred into action...)
Samba build is fixed on darwin in 4.13+ by https://gitlab.com/samba-team/devel/samba/-/commit/847208cd8ac68c4c7d1dae63767820db1c69292b which has also been backported in 4.12.6. I also backported the patch to apply cleanly to the current 4.12.5 to avoid rebuilds, but would it be preferable to just update to 4.12.6? (New to ZHF so I don't know what impact that would have)
@risicle Ok pr opened here https://github.com/NixOS/nixpkgs/pull/98875
Thanks
@r-burns I'd personally go for the bump.
~git-annex-adapter is stuck waiting on https://github.com/alpernebbi/git-annex-adapter/issues/13 and https://github.com/alpernebbi/git-annex-adapter/issues/14~
~(after getting past the immediate trivial build-stopper by passing cacert to checkInputs)~
I can't tell my branches apart.
/me coughs in the direction of #98688
PRs still needing review in this thread:
vcunat EDIT: all done now
yea, sorry, mixture of burnout and ck3 has led me to neglecting this
This issue has been mentioned on NixOS Discourse. There might be relevant details there:
https://discourse.nixos.org/t/nixos-weekly-08-nixos-weekly/9431/1
@jonringer went through all of open ones and merged most of them :)
Many others also helped, I'll do a proper thanks in the release post :)
The x86_64 build of itk-5.1.1 is broken on Hydra with an "Illegal instruction" error, but building locally succeeds (and Hydra builds master), so I'm not exactly sure how to bisect/troubleshoot this.
"Illegal instruction" means a program uses an instruction that is unsupported by the CPU it runs on. Hydra builders have different CPUs, so certain instructions are supported only by some machines.
This happens because ITK is built with -march=corei7 (https://github.com/InsightSoftwareConsortium/ITK/blob/4b48c9025f66d179d7b134999e2398f5924093b4/CMake/ITKSetStandardCompilerFlags.cmake#L245). -march=corei7 means that the program is compiled only for CPUs that support MMX, SSE, SSE2, SSE3, SSSE3, SSE4.1 and SSE4.2.
I believe we may want to remove -march=corei7 (because it's not correct for NixOS; currently the distribution doesn't assume any particular CPU) and -mtune=native (because reproducibility).
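To see whether a particular x86 Linux machine lacks any of the features listed above, one rough approach is to compare that list against the flags line of /proc/cpuinfo. This is a minimal sketch with a made-up function name (`missing_corei7_flags`); note that /proc/cpuinfo reports SSE3 as `pni`, and that the feature list is simply the one quoted in this comment.

```shell
# missing_corei7_flags "FLAGS"
# FLAGS is a space-separated CPU flag list in /proc/cpuinfo spelling.
# Prints the features from the -march=corei7 list above that are absent.
missing_corei7_flags() {
  flags=" $1 "   # pad with spaces so each flag can be matched as a whole word
  missing=""
  # /proc/cpuinfo names: pni = SSE3, sse4_1 = SSE4.1, sse4_2 = SSE4.2
  for f in mmx sse sse2 pni ssse3 sse4_1 sse4_2; do
    case "$flags" in
      *" $f "*) ;;                     # feature present
      *) missing="$missing $f" ;;      # feature absent
    esac
  done
  echo "${missing# }"
}
```

On a builder this could be called as `missing_corei7_flags "$(grep -m1 '^flags' /proc/cpuinfo | cut -d: -f2)"`; empty output would suggest the machine supports everything in the list.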
What's the plan for theano? @twhitehead had several suggestions in https://github.com/NixOS/nixpkgs/pull/99516#issuecomment-703289554 but which do we want to take forward for release-20.09?
Hmm... I quite need #99587 merged to be able to backport it to 20.09, but its dependencies are now broken on master.
UPDATE after initial 20.09 Release: Fixing broken packages will always be possible throughout the lifetime of the 20.09 release. However, you may need to remove the broken = true; attr on the package. Otherwise please follow normal back-port conventions. :)
Old Post: Jobsets:
Mission
Every time we branch off a release we stabilize the release branch. Our goal here is to get as few jobs as possible failing on the release-20.09 jobsets. I'd like to highlight that, while it's great to focus on zero as our goal, it's essential that all deliverables that worked in the previous release also work here.
How many failing jobs are there?
At the opening of this issue we have the main x86_64-linux jobset at 1153 failing jobs, x86_64-darwin at >7130, and aarch64-linux at 7573+.
Previous releases' first evals
19.09 had 1654 failing jobs, 20.03 had 1204 failing jobs, and 20.09 had 1153 failing jobs, so we're actually getting better at maintaining a more stable "unstable" channel.
However, our darwin story isn't as good (we need more darwin reviewers/contributors): 20.03 had 1384 failing jobs, 20.09 had >7130 failing jobs.
How to help (textual)
Select an evaluation of the release-20.09 jobset by #id
Find a failed job ❌️
Troubleshoot why it's failing and fix it
Create a Pull Request with the fix targeting master, wait for it to be merged. Generally the job fails on master also; you can verify that on Hydra - example URL: https://hydra.nixos.org/job/nixpkgs/trunk/bash.x86_64-linux. That means most PRs should target the master branch. However, if your PR causes around 500+ rebuilds, it's preferred to target staging to avoid compute and storage churn.
Always reference this issue in the body of your PR:
Please ping @NixOS/nixos-release-managers on the PR. If you're unable to because you're not a member of the NixOS org please ping @jonringer and @worldofpeace (the same people in the team).
How to help video tutorial
@jonringer has made a video on YouTube to guide anyone through what fixing something for ZHF looks like: https://www.youtube.com/watch?v=4Zb3GpIc6vk&
New to nixpkgs?
@jonringer created some videos to help get people started with nixpkgs:
Also be sure to check out other resources at: https://github.com/nix-community/awesome-nix
Packages that don't get fixed
The remaining packages will be marked as broken before the release (on the failing platforms). You can do this like:
These are the utility flags used to test the type of platform.
Closing
This is a great way to help NixOS, and it was some of my earliest contributions. Let's go ✌️
✨️ @worldofpeace and @jonringer
cc @NixOS/nixpkgs-committers @NixOS/nixpkgs-maintainers
Related Issues