Closed by rowanG077 4 months ago
The disassembly of that function is very weird. Is it a miscompile?
[nix-shell:/var/lib/systemd/coredump]$ gdb -batch -ex 'file /run/opengl-driver/lib/dri/apple_dri.so' -ex 'disassemble driCreateNewScreen2'
Dump of assembler code for function driCreateNewScreen2:
0x00000000000708e0 <+0>: udf #0
0x00000000000708e4 <+4>: udf #0
0x00000000000708e8 <+8>: udf #0
0x00000000000708ec <+12>: udf #0
0x00000000000708f0 <+16>: udf #0
...
0x0000000000070b3c <+604>: udf #0
0x0000000000070b40 <+608>: udf #0
0x0000000000070b44 <+612>: udf #0
0x0000000000070b48 <+616>: udf #0
0x0000000000070b4c <+620>: udf #0
0x0000000000070b50 <+624>: udf #0
End of assembler dump.
I have no clue what causes this.
The encoding of udf #0 is 0x00000000. Did something go wrong with the file system? Can you verify the relevant store paths have the right hash? Try nix-store --verify --check-contents.
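Since udf #0 encodes as four zero bytes, a function that disassembles entirely to udf #0 suggests that this range of the file is zero-filled. Below is a minimal sketch for checking that directly, assuming GNU cmp and dd are available. is_zero_range is a hypothetical helper, not part of any tool mentioned here. The offset passed to it must be a file offset, which equals the address gdb prints only if that segment's file offset matches its virtual address.

```shell
# is_zero_range FILE OFFSET LENGTH
# Hypothetical helper: succeeds (exit 0) only if the LENGTH bytes starting at
# byte OFFSET of FILE exist and are all zero.
is_zero_range() {
    # dd extracts the byte range; cmp compares it against the same number of
    # zero bytes from /dev/zero. A short read makes cmp hit EOF and fail.
    dd if="$1" bs=1 skip="$2" count="$3" 2>/dev/null |
        cmp -s -n "$3" - /dev/zero
}
```

For the dump above, something like is_zero_range apple_dri.so $((0x708e0)) 628 would test the 628 bytes gdb showed (offsets +0 through +624, four bytes per instruction), under the offset assumption stated above.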
Doesn't seem to be an issue. After some thinking it came back with:
> nix-store --verify --check-contents
reading the Nix store...
checking path existence...
checking link hashes...
checking store hashes.
>
Yeah, that means the store is okay. Doesn’t exclude the possibility something went wrong building. What Nixpkgs hash are you on?
nixos-unstable channel at commit 97b17f32362e
not using flakes for my system config.
Perhaps we need to explicitly set withLibunwind to false?
https://github.com/NixOS/nixpkgs/commit/73f6621a3713fc01c09d2e237bf1cc877be45cfe
libunwind is already set to disabled in the override.
I just tested the most recent unstable channel: https://releases.nixos.org/nixos/unstable/nixos-24.05pre579329.e92b60158819
It has the same problem.
I'm also having mesa issues, specifically when in overlay mode:
https://github.com/tpwrules/nixos-apple-silicon/issues/152
Given the bizarre disassembly above, I wonder if this is more evidence for my llvmPackages theory?
I'm using replace mode.
According to Janne Grunau in Matrix chat, Fedora Asahi now builds mesa+asahi with libunwind, whereas ALARM built mesa+asahi without it.
This is the build specification file for mesa in Fedora Asahi, which I admit I find confusing:
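I'm bisecting using this shell:
let
  # Overlay from this repo that provides mesa-asahi-edge.
  asahi = import ./apple-silicon-support/packages/overlay.nix;
  # Local nixpkgs checkout; a different commit is checked out for each bisect step.
  pkgs = import /home/rowan.goemans/Documents/engineering/nixpkgs {
    overlays = [ asahi ];
  };
in pkgs.mkShell {
  # On shell entry, disassemble the suspect function from the freshly built driver.
  shellHook = ''
    ${pkgs.gdb}/bin/gdb -batch -ex 'file ${pkgs.mesa-asahi-edge.drivers}/lib/dri/apple_dri.so' -ex 'disassemble driCreateNewScreen2'
  '';
}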
This will relatively quickly tell me when the function broke.
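The manual check can be turned into a pass/fail test, for example to drive git bisect run. This is a rough sketch, assuming the gdb output format shown earlier in this thread. all_udf is a hypothetical helper that reads a disassembly on stdin and succeeds only if there is at least one instruction line and every instruction line is udf #0.

```shell
# all_udf: read `gdb ... -ex 'disassemble FUNC'` output on stdin.
# Hypothetical helper: exit 0 if every instruction line is `udf #0`,
# exit 1 otherwise (including when no instruction lines are found).
all_udf() {
    awk '
        # Instruction lines look like: 0x00000000000708e0 <+0>: udf #0
        /^[ \t]*0x[0-9a-f]+ <[+][0-9]+>:/ {
            total++
            if ($0 !~ /udf[ \t]+#0[ \t]*$/) other++
        }
        END {
            if (total > 0 && other == 0) exit 0
            exit 1
        }
    '
}
```

In a bisect script the result would need inverting, since git bisect run treats exit 0 as good: a build where all_udf succeeds is the broken one. A failed build produces no disassembly at all, so it may be worth detecting that case separately and exiting with 125 to skip the commit.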
I still think something machine-specific and probably ephemeral went wrong when building and the file ended up corrupt. I set my system to use the latest main commit of this repo (6e324ab) and the nixpkgs you mention (97b17f32362e). I cannot replicate this issue, nor the faulty disassembly, on my system.
I examined store path /nix/store/yj2pxkiwyf64xi50y6zy7zpmfa705wd6-mesa-24.0.0-drivers/ which came from derivation /nix/store/6qamlsrgfqzgl5k134bvvr6xic4lmzhw-mesa-24.0.0.drv. Unfortunately it seems the mesa build is not fully deterministic so it's a little hard to figure out exactly what went wrong.
You can fix this by remounting /nix/store read-write, nuking the store path (and the main output /nix/store/9qx6h9jhxczji2f94rdn4bjlzyw7mjb6-mesa-24.0.0/), then running nix-build --repair on the derivation. It might also be smart to back the paths up first for examination later. Make sure you have the derivation beforehand. Once that's done, reboot and make sure it all works, then run nix-store --verify --check-contents for good measure.
Alternately, and more safely, you could make some trivial change to the Mesa derivation in this repo (e.g. add a fooAttr = "bar"; attribute to the overrides) to force a rebuild.
Can confirm that nuking the nix store paths and just rebuilding fixed it. I have been using nix for ages and never seen anything like this. Thanks for helping me debug this!
I had the same issue just now! weston was crashing with illegal instructions.
I removed the mesa folders from the /nix/store/ path (I made a backup first, just in case) and then ran sudo nix-store --repair --verify --check-contents, which successfully rebuilt the mesa driver from source.
I could go back to a graphical system right after.
I noticed a lot of people are having issues related to GPU in the repo. I wonder how many could be affected by this issue? Maybe it could be helpful for this trick to be in the maintenance/repair instructions.
It is quite frankly bizarre and scary that this issue appeared again. Is it in the same file? Is a compressed version of those store paths small enough to attach to this issue? I can't promise to provide any input but it would be good to have.
I will have to examine the other issues more carefully but I don't think they are related.
It is not the same mesa; it's the newer 24.0.1.
Attached below are the store paths: broken-mesa.zip
I upgraded my nix channel and switched to my new configuration. Now everything that requires graphics gives an Illegal instruction core dumped error. The source is in the asahi mesa driver: