Closed the-kenny closed 6 years ago
Note that master looks fixed for erlangR20 only. nix-build . -A beam.packages.erlang.rebar still fails.
On release-18.03 both erlangR20.rebar and erlang.rebar fail.
@dtzWill This seems somehow related to the busybox changes in #36919 - any idea how these can degrade the stability of the erlang builds?
I found that boot scripts were missing from the erlang bin directory after installing. I don't know what the root cause is, but this workaround resolved it for me:
diff --git a/pkgs/development/interpreters/erlang/generic-builder.nix b/pkgs/development/interpreters/erlang/generic-builder.nix
index 1d2b79074fb..e337c8a4041 100644
--- a/pkgs/development/interpreters/erlang/generic-builder.nix
+++ b/pkgs/development/interpreters/erlang/generic-builder.nix
@@ -91,6 +91,8 @@ in stdenv.mkDerivation ({
${postInstall}
ln -s $out/lib/erlang/lib/erl_interface*/bin/erl_call $out/bin/erl_call
+
+ cp $out/lib/erlang/releases/*/start_*.boot $out/lib/erlang/bin/
'';
# Some erlang bin/ scripts run sed and awk
I tried using the cdv script (ended up using one from 17.09's erlangR20) and it too complained about missing "start.boot" or so. But... why did this start happening? :(
Bit more info, but still investigating:
So we're most likely looking at some sort of impurity here? If so, the next step would be investigating what Erlang does in the installPhase, where it's supposed to copy start_*.boot, as manually copying seems to work fine.
Alright so I got it. Short version: it's builders that haven't updated to use fixed /bin/sh. Good call folks :).
Longer version: Looking at success/fail builds on Hydra, the corresponding erlang builds used are these:
/nix/store/pxxiimf801v5hf3fxf5k12ygf54p1z28-erlang-19.3.6.4
/nix/store/m8ylg6j3hb8npqc908hqavw53mhr338v-erlang-19.3.6.4
diff'ing with:
$ diff -u <(nix log /nix/store/pxxiimf801v5hf3fxf5k12ygf54p1z28-erlang-19.3.6.4) <(nix log /nix/store/m8ylg6j3hb8npqc908hqavw53mhr338v-erlang-19.3.6.4)
Produces this: https://gist.github.com/dtzWill/d4cad2e4f8087699383f169e5681fdaa
In particular:
https://gist.github.com/dtzWill/d4cad2e4f8087699383f169e5681fdaa#file-good-vs-bad-diff-L114-L124
Lacking support for "command" corresponds to needing to update to use fixed sh :).
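For anyone wanting to check a builder by hand, here is a minimal sketch of a probe for the missing builtin. It assumes the symptom is simply that the shell fails on command -v; the helper name shell_supports_command is made up for illustration and is not part of Nix or nixpkgs. To say anything about the sandbox shell it would have to run inside the build sandbox (e.g. from a throwaway derivation), not in a normal login session.

```sh
#!/bin/sh
# Hypothetical helper (not from nixpkgs): succeeds if the given shell
# can run the POSIX `command` builtin, fails otherwise.
shell_supports_command() {
    candidate_shell="$1"
    # `cd` is a builtin in every POSIX shell, so `command -v cd` should
    # succeed whenever `command` itself is supported.
    "$candidate_shell" -c 'command -v cd >/dev/null 2>&1' 2>/dev/null
}

if shell_supports_command /bin/sh; then
    echo "/bin/sh: command builtin available"
else
    echo "/bin/sh: command builtin missing, builder probably needs the fixed sh"
fi
```

A non-zero result from a shell path that does not exist at all looks the same as a shell without the builtin, so treat a failure as a hint rather than proof.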
Could it be that the following simple fix is enough? This just makes sure that we run patchShebangs before we run the rest of postPatch (also links run_erl before running the rest of postInstall).
diff --git a/pkgs/development/interpreters/erlang/generic-builder.nix b/pkgs/development/interpreters/erlang/generic-builder.nix
index 1d2b79074fb..6ea3ac73a4b 100644
--- a/pkgs/development/interpreters/erlang/generic-builder.nix
+++ b/pkgs/development/interpreters/erlang/generic-builder.nix
@@ -65,9 +65,9 @@ in stdenv.mkDerivation ({
'';
postPatch = ''
- ${postPatch}
-
patchShebangs make
+
+ ${postPatch}
'';
preConfigure = ''
@@ -88,9 +88,9 @@ in stdenv.mkDerivation ({
# (PDFs are generated only when fop is available).
postInstall = ''
- ${postInstall}
-
ln -s $out/lib/erlang/lib/erl_interface*/bin/erl_call $out/bin/erl_call
+
+ ${postInstall}
'';
# Some erlang bin/ scripts run sed and awk
Note the above patch also fixes https://github.com/NixOS/nixpkgs/issues/37638.
To see if this fixes things you need to do so on a builder that fails as-is. Yours shouldn't, mine doesn't, etc. Right now anything that convinces Nix to build (on your builder) is enough to fix it.
Does your proposed change help a broken builder? If so that'd be great!
Just pushed 3e61f3b911c to master. Now waiting for Hydra (& reports in #36823 and #37638).
If it works out fine we should cherry-pick it to release-18.03.
I can still produce non-functioning Erlang builds with those changes applied. As @kamilchm points out, it seems to come down to a few missing .boot files. An example difference between a good and a bad build:
--- tree1 2018-03-24 14:12:40.295145433 +0000
+++ tree2 2018-03-24 14:12:51.304084842 +0000
@@ -5,7 +5,7 @@
│ ├── epmd -> ../lib/erlang/bin/epmd
│ ├── erl -> ../lib/erlang/bin/erl
│ ├── erlc -> ../lib/erlang/bin/erlc
-│ ├── erl_call -> /nix/store/f0ncniw2g22k381ba74b8yp22j115skp-erlang-19.3.6.4/lib/erlang/lib/erl_interface-3.9.3/bin/erl_call
+│ ├── erl_call -> /nix/store/9llmw5m5mp1n524mq75dph1hlrgy56vs-erlang-19.3.6.4/lib/erlang/lib/erl_interface-3.9.3/bin/erl_call
│ ├── escript -> ../lib/erlang/bin/escript
│ ├── run_erl -> ../lib/erlang/bin/run_erl
│ ├── to_erl -> ../lib/erlang/bin/to_erl
@@ -22,10 +22,7 @@
│ │ ├── no_dot_erlang.boot
│ │ ├── run_erl
│ │ ├── start
-│ │ ├── start.boot
-│ │ ├── start_clean.boot
│ │ ├── start_erl
-│ │ ├── start_sasl.boot
│ │ ├── start.script
│ │ ├── to_erl
│ │ └── typer
@@ -7724,4 +7721,4 @@
└── nix-support
└── setup-hook
-559 directories, 7165 files
+559 directories, 7162 files
Both store paths should be cached on Hydra and can be retrieved with nix-store -r /nix/store/{f0ncniw2g22k381ba74b8yp22j115skp-erlang-19.3.6.4,9llmw5m5mp1n524mq75dph1hlrgy56vs-erlang-19.3.6.4}. I can't find the build links right now but IIRC they were produced by different builders.
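A quick way to tell a good output from a bad one without diffing whole trees could look like the sketch below. It assumes the three files from the diff above are the ones that matter and that they live under lib/erlang/bin of the output (the directory the bin/ symlinks point into); missing_boot_files is a hypothetical helper name.

```sh
#!/bin/sh
# Hypothetical helper: print the name of each expected .boot file that is
# absent from <erlang output>/lib/erlang/bin. Prints nothing for a good build.
missing_boot_files() {
    erlang_out="$1"
    for boot_name in start.boot start_clean.boot start_sasl.boot; do
        if [ ! -e "$erlang_out/lib/erlang/bin/$boot_name" ]; then
            echo "$boot_name"
        fi
    done
}
```

Calling missing_boot_files with a realised store path as its only argument should list the missing files for a broken build and print nothing for a working one.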
The missing boot files seem to be due to what's observed in https://github.com/NixOS/nixpkgs/issues/36823#issuecomment-372294868 - in short, in my setup the sandbox /bin/sh can't expand certain globs, and those globs are needed to copy the .boot files. As I understand it some Hydra builders might still have this issue (EC2 builders specifically is my own personal speculation).
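To make the glob problem testable, here is a rough sketch. It assumes a start_*.boot pattern is representative of the globs that fail (the exact pattern Erlang's install script uses is not quoted in this thread), and shell_expands_boot_glob is a made-up helper name. As with any such probe, it only tells you about the sandbox shell if it is run inside the sandbox.

```sh
#!/bin/sh
# Hypothetical helper: succeeds if the given shell expands start_*.boot
# to the two files created in a scratch directory.
shell_expands_boot_glob() {
    candidate_shell="$1"
    scratch_dir="$(mktemp -d)" || return 2
    : > "$scratch_dir/start_clean.boot"
    : > "$scratch_dir/start_sasl.boot"
    # A shell with working globbing prints both file names; one without
    # prints the literal pattern instead.
    expanded="$(cd "$scratch_dir" && "$candidate_shell" -c 'echo start_*.boot' 2>/dev/null)"
    rm -rf "$scratch_dir"
    [ "$expanded" = "start_clean.boot start_sasl.boot" ]
}
```

The comparison relies on glob results being sorted, which puts start_clean.boot before start_sasl.boot.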
Locally applying
diff --git a/pkgs/development/interpreters/erlang/generic-builder.nix b/pkgs/development/interpreters/erlang/generic-builder.nix
index 6ea3ac73a4b..cf8fe1f6e56 100644
--- a/pkgs/development/interpreters/erlang/generic-builder.nix
+++ b/pkgs/development/interpreters/erlang/generic-builder.nix
@@ -68,6 +68,9 @@ in stdenv.mkDerivation ({
patchShebangs make
${postPatch}
+
+ substituteInPlace erts/etc/unix/Install.src \
+ --replace "#!/bin/sh" "${stdenv.shell}"
'';
preConfigure = ''
seems to produce more friendly Erlang (R19) that can build rebar, couchdb, and rabbitmq_server. I can't rebuild all Erlang things right now.
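For experimenting with the same idea outside a Nix build, a rough plain-shell equivalent is sketched below. rewrite_shebang is a hypothetical helper, not something from nixpkgs; unlike substituteInPlace it only touches the first line, it writes the replacement with the #! prefix, and it assumes the new shell path contains no | or & characters (they would confuse sed).

```sh
#!/bin/sh
# Hypothetical helper: point a script's "#!/bin/sh" shebang at another shell.
# Usage: rewrite_shebang <script file> <absolute path of replacement shell>
rewrite_shebang() {
    script_file="$1"
    new_shell="$2"
    tmp_file="$(mktemp)" || return 1
    # Only line 1 is rewritten, and only if it starts with #!/bin/sh.
    if ! sed "1s|^#!/bin/sh|#!$new_shell|" "$script_file" > "$tmp_file"; then
        rm -f "$tmp_file"
        return 1
    fi
    # Write back through a redirection so the script keeps its mode bits.
    cat "$tmp_file" > "$script_file"
    rm -f "$tmp_file"
}
```

Any arguments after the interpreter (for example -e) are left in place, since only the /bin/sh prefix is replaced.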
Hmm, something is wrong still :-(, elixir won't build..
runs patchShebangs over elixir source ...
Yep, that did it.
Looks like it's specifically bin/elixir that needs patching (similar globbing issue it seems). Everything else seems to build fine.
Issue description
nixpkgs.rebar fails to build with a cryptic error message, seemingly caused by a totally unrelated change to buildRubyGem in fced35fa44098be0296d8b42166583bd5e505141:
git-bisect points to fced35fa44098be0296d8b42166583bd5e505141 as the cause, which seems unrelated. However, reverting this commit fixes the build.
@aneeshusa investigated this some more in https://github.com/NixOS/nixpkgs/commit/fced35fa44098be0296d8b42166583bd5e505141#commitcomment-27975737 - here is a copy:
@aneeshusa's investigation:
@the-kenny I did some digging but didn't find anything concrete. I was able to reproduce your git bisect result. Looking at the rebar -> ruby dependency chain, this is what I see:
All of these build except the final rebar3, and ronn is a manual building tool, so I'm not really sure how the ruby change broke rebar3. I also tried a couple of other erlang releases (R19, etc.) which seemed to get past the error the R20 release encountered very early on.
I also did nix-shell --pure -A beam.packages.erlangR20.rebar and didn't see anything ruby or ronn related. I do have sandboxing turned on.
I added a trace to this file to print out the gemName on each instantiation; this is the output when instantiating R20 rebar:
This is also the same trace for -A ronn, so these all seem strictly ronn-related.
My only wild remaining guesses are:
Hopefully this helps a bit.
Steps to reproduce
nix build nixpkgs.rebar
git revert fced35fa44098be0296d8b42166583bd5e505141
nix build nixpkgs.rebar
Technical details
Similar Issues
#36823 looks similar, but is caused by another (also seemingly unrelated) commit.