Open MilesCranmer opened 1 year ago
This is the C code where it crashes:
size_t world = jl_atomic_load_acquire(&jl_world_counter);
ct->world_age = world;
if (!has_defs && jl_get_module_infer(m) != 0) {
    (void)jl_type_infer(mfunc, world, 0);
}
result = jl_invoke(/*func*/NULL, /*args*/NULL, /*nargs*/0, mfunc); // crashes
ct->world_age = last_age;
https://github.com/JuliaLang/julia/blob/36034abf26062acad4af9dcec7c4fc53b260dbb4/src/toplevel.c#L897
The last PR to change this line where it segfaulted was https://github.com/JuliaLang/julia/pull/31984. @vtjnash @JeffBezanson any advice for how I could debug this? Or is this line unrelated?
We are trying to call into the JIT there, so perhaps LLVM is computing the jump address incorrectly? The stack trace does not make clear which value it crashed on. LLVM is planning some fixes for this on AArch64 in JITLink in the upcoming release, though.
Thanks. Should I raise an issue on the main Julia repo or LLVM?
Here's a minimal dockerfile which gives the same error:
FROM julia:1.8.2
RUN julia -e 'using Pkg; Pkg.add("Conda"); Pkg.build("Conda")'
Another interesting clue is that I can actually build this just fine on my ARM-based laptop (M1). It's only when I try to build the `arm64` architecture from an `amd64` system (i.e., through docker/QEMU) that this error comes up. Does that offer any insight?
To reproduce this, you can build it locally on an x86_64 system using `docker build --platform=linux/arm64 -t test .`.
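For the local route, here is a rough sketch of a wrapper that only sets up emulation when the host and target architectures differ. The function names (`docker_arch`, `needs_emulation`, `reproduce`) are made up for illustration, and I'm assuming the `tonistiigi/binfmt` image is an acceptable way to register the QEMU handlers on your host; any other binfmt setup should work the same way.

```shell
#!/bin/sh
# Hypothetical helpers, not part of Docker or Julia.

# Map `uname -m` style names onto Docker's architecture names.
docker_arch() {
    case "$1" in
        x86_64|amd64) echo amd64 ;;
        aarch64|arm64) echo arm64 ;;
        *) echo "$1" ;;
    esac
}

# Exit status 0 when building target platform $2 (e.g. linux/arm64)
# on host machine $1 (e.g. x86_64) would go through emulation.
needs_emulation() {
    host=$(docker_arch "$1")
    target=${2#linux/}
    [ "$host" != "$target" ]
}

# Build the Dockerfile in the current directory for linux/arm64,
# registering QEMU binfmt handlers first if the host is not arm64.
reproduce() {
    target=linux/arm64
    if needs_emulation "$(uname -m)" "$target"; then
        # Assumption: tonistiigi/binfmt is used to install the handlers.
        docker run --privileged --rm tonistiigi/binfmt --install arm64
    fi
    docker build --platform="$target" -t test .
}
```

After sourcing the file, run `reproduce` from the directory containing the Dockerfile. On an M1 host `needs_emulation` is false, which matches the observation that the native build works and only the emulated one segfaults.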
Alternatively, you can create a GitHub action. First, create a Dockerfile in the root directory containing the above. Then, create a workflow file:
name: Docker test
on:
  push:
    branches:
      - "**"
jobs:
  docker:
    runs-on: ubuntu-latest
    strategy:
      matrix:
        arch: [linux/amd64, linux/arm64]
    steps:
      - name: Checkout
        uses: actions/checkout@v3
      - name: Set up QEMU
        uses: docker/setup-qemu-action@v2
      - name: Set up Docker Buildx
        uses: docker/setup-buildx-action@v2
      - name: Build and push
        uses: docker/build-push-action@v3
        with:
          context: .
          platforms: ${{ matrix.arch }}
          push: false
(This could be combined with https://github.com/csexton/debugger-action to interact with it after failure.)
The equivalent issue for M1 was fixed for arm64-darwin in the previous (old) release of LLVM, so that would make sense. You would likely need to get a version of LLVM master working with Julia master before reporting it.
I experience the same segfault when simply precompiling the `TimeZones` package.
Same setup: multi-architecture build from amd64 host to arm64 target using QEMU emulation.
@vtjnash can you point to further issues that could help solve this?
I'm trying to build docker images for PySR (which is built on PyJulia), and the arm64 jobs fail consistently because of a segmentation fault when building Conda.jl. The `amd64` jobs are fine. Here's the traceback:
```
#15 69.87 Building Conda ─→ `~/.julia/scratchspaces/44cfe95a-1eb2-52ea-b672-e2afdf69b78f/6e47d11ea2776bc5627421d59cdcc1296c058071/build.log`
#15 84.11 ERROR: LoadError: Error building `Conda`:
#15 94.97
#15 94.97 signal (11): Segmentation fault
#15 94.97 in expression starting at /root/.julia/packages/Conda/x2UxR/deps/build.jl:106
#15 94.97 top-level scope at /root/.julia/packages/Conda/x2UxR/deps/build.jl:106
#15 94.97 jl_toplevel_eval_flex at /cache/build/default-armageddon-0/julialang/julia-release-1-dot-8/src/toplevel.c:897
#15 94.97 jl_toplevel_eval_flex at /cache/build/default-armageddon-0/julialang/julia-release-1-dot-8/src/toplevel.c:850
#15 94.97 ijl_toplevel_eval_in at /cache/build/default-armageddon-0/julialang/julia-release-1-dot-8/src/toplevel.c:965
#15 94.97 eval at ./boot.jl:368 [inlined]
#15 94.97 include_string at ./loading.jl:1428
#15 94.97 _jl_invoke at /cache/build/default-armageddon-0/julialang/julia-release-1-dot-8/src/gf.c:2367 [inlined]
#15 94.97 ijl_apply_generic at /cache/build/default-armageddon-0/julialang/julia-release-1-dot-8/src/gf.c:2549
#15 94.97 _include at ./loading.jl:1488
#15 94.97 include at ./client.jl:476
#15 94.97 unknown function (ip: 0x55170ff553)
#15 94.97 _jl_invoke at /cache/build/default-armageddon-0/julialang/julia-release-1-dot-8/src/gf.c:2367 [inlined]
#15 94.97 ijl_apply_generic at /cache/build/default-armageddon-0/julialang/julia-release-1-dot-8/src/gf.c:2549
#15 94.97 jl_apply at /cache/build/default-armageddon-0/julialang/julia-release-1-dot-8/src/julia.h:1839 [inlined]
#15 94.97 do_call at /cache/build/default-armageddon-0/julialang/julia-release-1-dot-8/src/interpreter.c:126
#15 94.97 eval_value at /cache/build/default-armageddon-0/julialang/julia-release-1-dot-8/src/interpreter.c:215
#15 94.97 eval_stmt_value at /cache/build/default-armageddon-0/julialang/julia-release-1-dot-8/src/interpreter.c:166 [inlined]
#15 94.97 eval_body at /cache/build/default-armageddon-0/julialang/julia-release-1-dot-8/src/interpreter.c:612
#15 94.97 jl_interpret_toplevel_thunk at /cache/build/default-armageddon-0/julialang/julia-release-1-dot-8/src/interpreter.c:750
#15 94.97 jl_toplevel_eval_flex at /cache/build/default-armageddon-0/julialang/julia-release-1-dot-8/src/toplevel.c:906
#15 94.97 jl_toplevel_eval_flex at /cache/build/default-armageddon-0/julialang/julia-release-1-dot-8/src/toplevel.c:850
#15 94.97 ijl_toplevel_eval_in at /cache/build/default-armageddon-0/julialang/julia-release-1-dot-8/src/toplevel.c:965
#15 94.97 eval at ./boot.jl:368 [inlined]
#15 94.97 exec_options at ./client.jl:276
#15 94.97 _start at ./client.jl:522
#15 94.97 jfptr__start_49479 at /opt/julia/lib/julia/sys.so (unknown line)
#15 94.97 _jl_invoke at /cache/build/default-armageddon-0/julialang/julia-release-1-dot-8/src/gf.c:2367 [inlined]
#15 94.97 ijl_apply_generic at /cache/build/default-armageddon-0/julialang/julia-release-1-dot-8/src/gf.c:2549
#15 94.97 jl_apply at /cache/build/default-armageddon-0/julialang/julia-release-1-dot-8/src/julia.h:1839 [inlined]
#15 94.97 true_main at /cache/build/default-armageddon-0/julialang/julia-release-1-dot-8/src/jlapi.c:575
#15 94.97 jl_repl_entrypoint at /cache/build/default-armageddon-0/julialang/julia-release-1-dot-8/src/jlapi.c:719
#15 94.97 main at /cache/build/default-armageddon-0/julialang/julia-release-1-dot-8/cli/loader_exe.c:59
#15 94.97 __libc_start_main at /lib/aarch64-linux-gnu/libc.so.6 (unknown line)
#15 94.97 _start at /opt/julia/bin/julia (unknown line)
#15 94.97 _start at /opt/julia/bin/julia (unknown line)
#15 94.97 Allocations: 873483 (Pool: 872903; Big: 580); GC: 1
#15 94.99 Stacktrace:
#15 94.99 [1] pkgerror(msg::String)
#15 95.36 @ Pkg.Types /opt/julia/share/julia/stdlib/v1.8/Pkg/src/Types.jl:67
#15 95.49 [2] (::Pkg.Operations.var"#66#73"{Bool, Pkg.Types.Context, String, Pkg.Types.PackageSpec, String})()
#15 95.67 @ Pkg.Operations /opt/julia/share/julia/stdlib/v1.8/Pkg/src/Operations.jl:1060
#15 95.67 [3] withenv(::Pkg.Operations.var"#66#73"{Bool, Pkg.Types.Context, String, Pkg.Types.PackageSpec, String}, ::Pair{String, String}, ::Vararg{Pair{String}})
#15 96.24 @ Base ./env.jl:172
#15 96.25 [4] (::Pkg.Operations.var"#107#112"{String, Bool, Bool, Bool, Pkg.Operations.var"#66#73"{Bool, Pkg.Types.Context, String, Pkg.Types.PackageSpec, String}, Pkg.Types.PackageSpec})()
#15 96.25 @ Pkg.Operations /opt/julia/share/julia/stdlib/v1.8/Pkg/src/Operations.jl:1619
#15 96.25 [5] with_temp_env(fn::Pkg.Operations.var"#107#112"{String, Bool, Bool, Bool, Pkg.Operations.var"#66#73"{Bool, Pkg.Types.Context, String, Pkg.Types.PackageSpec, String}, Pkg.Types.PackageSpec}, temp_env::String)
#15 96.25 @ Pkg.Operations /opt/julia/share/julia/stdlib/v1.8/Pkg/src/Operations.jl:1493
#15 96.25 [6] (::Pkg.Operations.var"#105#110"{Dict{String, Any}, Bool, Bool, Bool, Pkg.Operations.var"#66#73"{Bool, Pkg.Types.Context, String, Pkg.Types.PackageSpec, String}, Pkg.Types.Context, Pkg.Types.PackageSpec, String, Pkg.Types.Project, String})(tmp::String)
#15 96.25 @ Pkg.Operations /opt/julia/share/julia/stdlib/v1.8/Pkg/src/Operations.jl:1582
#15 96.25 [7] mktempdir(fn::Pkg.Operations.var"#105#110"{Dict{String, Any}, Bool, Bool, Bool, Pkg.Operations.var"#66#73"{Bool, Pkg.Types.Context, String, Pkg.Types.PackageSpec, String}, Pkg.Types.Context, Pkg.Types.PackageSpec, String, Pkg.Types.Project, String}, parent::String; prefix::String)
#15 96.26 @ Base.Filesystem ./file.jl:764
#15 96.26 [8] mktempdir(fn::Function, parent::String) (repeats 2 times)
#15 96.26 @ Base.Filesystem ./file.jl:760
#15 96.26 [9] sandbox(fn::Function, ctx::Pkg.Types.Context, target::Pkg.Types.PackageSpec, target_path::String, sandbox_path::String, sandbox_project_override::Pkg.Types.Project; preferences::Dict{String, Any}, force_latest_compatible_version::Bool, allow_earlier_backwards_compatible_versions::Bool, allow_reresolve::Bool)
#15 96.27 @ Pkg.Operations /opt/julia/share/julia/stdlib/v1.8/Pkg/src/Operations.jl:1540
#15 96.27 [10] build_versions(ctx::Pkg.Types.Context, uuids::Set{Base.UUID}; verbose::Bool)
#15 96.27 @ Pkg.Operations /opt/julia/share/julia/stdlib/v1.8/Pkg/src/Operations.jl:1041
#15 96.27 [11] build_versions
#15 96.27 @ /opt/julia/share/julia/stdlib/v1.8/Pkg/src/Operations.jl:956 [inlined]
#15 96.27 [12] add(ctx::Pkg.Types.Context, pkgs::Vector{Pkg.Types.PackageSpec}, new_git::Set{Base.UUID}; preserve::Pkg.Types.PreserveLevel, platform::Base.BinaryPlatforms.Platform)
#15 96.28 @ Pkg.Operations /opt/julia/share/julia/stdlib/v1.8/Pkg/src/Operations.jl:1286
#15 96.29 [13] add(ctx::Pkg.Types.Context, pkgs::Vector{Pkg.Types.PackageSpec}; preserve::Pkg.Types.PreserveLevel, platform::Base.BinaryPlatforms.Platform, kwargs::Base.Pairs{Symbol, Base.PipeEndpoint, Tuple{Symbol}, NamedTuple{(:io,), Tuple{Base.PipeEndpoint}}})
#15 96.58 @ Pkg.API /opt/julia/share/julia/stdlib/v1.8/Pkg/src/API.jl:275
#15 96.59 [14] add(pkgs::Vector{Pkg.Types.PackageSpec}; io::Base.PipeEndpoint, kwargs::Base.Pairs{Symbol, Union{}, Tuple{}, NamedTuple{(), Tuple{}}})
#15 96.74 @ Pkg.API /opt/julia/share/julia/stdlib/v1.8/Pkg/src/API.jl:156
#15 96.74 [15] add(pkgs::Vector{Pkg.Types.PackageSpec})
#15 96.75 @ Pkg.API /opt/julia/share/julia/stdlib/v1.8/Pkg/src/API.jl:145
#15 96.75 [16] #add#27
#15 96.75 @ /opt/julia/share/julia/stdlib/v1.8/Pkg/src/API.jl:144 [inlined]
#15 96.75 [17] add
#15 96.75 @ /opt/julia/share/julia/stdlib/v1.8/Pkg/src/API.jl:144 [inlined]
#15 96.75 [18] #add#26
#15 96.75 @ /opt/julia/share/julia/stdlib/v1.8/Pkg/src/API.jl:143 [inlined]
#15 96.75 [19] add(pkg::String)
#15 96.75 @ Pkg.API /opt/julia/share/julia/stdlib/v1.8/Pkg/src/API.jl:143
#15 96.75 [20] top-level scope
#15 96.75 @ /usr/local/lib/python3.10/site-packages/julia/install.jl:118
#15 96.75 in expression starting at /usr/local/lib/python3.10/site-packages/julia/install.jl:73
#15 96.81 Traceback (most recent call last):
#15 96.81 File "
```
Here's the job result, the dockerfile, and the action file. This same error occurs every time I run the job.
`ubuntu-latest`
`python:latest` (`platform=linux/arm64`)
The line it's getting a segfault on in build.jl: https://github.com/JuliaPy/Conda.jl/blob/8f7133206f3efb6308dff5a2b09393d10e6cc122/deps/build.jl#L106
Any idea what this is? @mkitti would you happen to know?
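Since every line of the traceback carries the `#15 94.97`-style step and timestamp prefix from the Docker build output, a small filter can make the crash easier to read. This is only a sketch: `strip_buildkit_prefix` and `crash_summary` are hypothetical names, and the prefix pattern is an assumption based on the lines shown above (`#<step> <seconds>`).

```shell
#!/bin/sh
# Hypothetical helpers for reading a Docker build log on stdin.

# Remove a leading "#<step> <seconds> " prefix from each line.
strip_buildkit_prefix() {
    sed -E 's/^#[0-9]+ [0-9]+\.[0-9]+ ?//'
}

# Print the "signal (11)" line and the N lines that follow it
# (N defaults to 5), with the prefixes removed.
crash_summary() {
    strip_buildkit_prefix | grep -A "${1:-5}" 'signal (11)'
}
```

For example, `crash_summary 3 < build.log` (where `build.log` is a saved copy of the job output) would print the segfault line followed by the expression location and the first native frame in `toplevel.c`.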