JuliaLang / julia

The Julia Programming Language
https://julialang.org/
MIT License

Julia 1.11.0-1.11.1 hangs while testing/precompiling a basic Test project with `julia_args = ["--threads=auto"]` #56458

Open stemann opened 3 weeks ago

stemann commented 3 weeks ago

Scheduled testing of a very basic Test project with `--threads=auto` started hanging three weeks ago, on Oct. 8, when Julia 1.11 was released.

The scheduled job uses the `julia:1` container image (which became synonymous with `julia:1.11` on Oct. 8), and the hang occurs while precompiling the test project (when running tests).

The scheduled testing is simply running Pkg.test with --threads=auto for a project with only Test as a dependency: https://gitlab.com/stemann/julia-gitlab-ci-templates/-/tree/master/examples/Sample

Without --threads=auto (using Pkg.test(; coverage = true)), the testing completes without issues (of course): https://gitlab.com/stemann/julia-gitlab-ci-templates/-/jobs/8234833097

"Stack trace" (Julia-style):


* [stemann/julia-gitlab-ci-templates: templates.gitlab-ci.yaml#L197-258](https://gitlab.com/stemann/julia-gitlab-ci-templates/-/blob/master/templates.gitlab-ci.yaml#L197-258)
* [stemann/julia-gitlab-ci-templates: test.needs_build.gitlab-ci.yaml](https://gitlab.com/stemann/julia-gitlab-ci-templates/-/blob/master/jobs/test.needs_build.gitlab-ci.yaml)
* [stemann/julia-gitlab-ci-templates: .gitlab-ci.yml#L95-L111](https://gitlab.com/stemann/julia-gitlab-ci-templates/-/blob/master/.gitlab-ci.yml#L95-L111)

stemann commented 3 weeks ago

Vaguely related to #56345

giordano commented 3 weeks ago

> Vaguely related to https://github.com/JuliaLang/julia/issues/56345

Does that mean this is fixed on master and in #56228 (but that issue was using Distributed, not threads)?

stemann commented 3 weeks ago

It seems to be reproducible on an x86_64 macOS machine using Docker constrained to two CPU cores (similar to the GitLab CI SaaS agent):

$ docker run -it --rm -v $(pwd):/mnt -w /mnt julia:1 bash -c "while true; do date; sleep 60; done & julia --project -e '@show Sys.CPU_THREADS; using Pkg; Pkg.test(; coverage = true, julia_args = [\"--threads=auto\"])'"
Tue Nov  5 15:21:57 UTC 2024
Sys.CPU_THREADS = 2
  Installing known registries into `~/.julia`
       Added `General` registry to ~/.julia/registries
     Testing Sample
      Status `/tmp/jl_j26BPk/Project.toml`
  [a9065fac] Sample v0.1.0 `/mnt`
  [8dfed614] Test v1.11.0
      Status `/tmp/jl_j26BPk/Manifest.toml`
  [a9065fac] Sample v0.1.0 `/mnt`
  [2a0f44e3] Base64 v1.11.0
  [b77e0a4c] InteractiveUtils v1.11.0
  [56ddb016] Logging v1.11.0
  [d6f4376e] Markdown v1.11.0
  [9a3f8284] Random v1.11.0
  [ea8e919c] SHA v0.7.0
  [9e88b42a] Serialization v1.11.0
  [8dfed614] Test v1.11.0
Precompiling project for configuration --code-coverage=@/mnt --color=yes --check-bounds=yes --warn-overwrite=yes --depwarn=yes --inline=yes --startup-file=no --track-allocation=none --threads=auto...
Tue Nov  5 15:22:57 UTC 2024                         ]  0/1
Tue Nov  5 15:23:57 UTC 2024                         ]  0/1
Tue Nov  5 15:24:57 UTC 2024                         ]  0/1
  Progress [>                                        ]  0/1
  ◒ Sample
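
The timestamps interleaved with the progress bar above come from the background `while true; do date; sleep 60; done &` loop in the command, which overwrites part of the progress line once a minute. A minimal sketch of the same idea as a reusable function is shown below; `with_heartbeat` is a hypothetical helper name, not something from the templates repository, and it differs from the one-liner in that it stops the loop once the command exits.

```shell
# Hypothetical helper: print a timestamp every INTERVAL seconds while a
# command runs, then stop the timestamp loop and return the command's status.
with_heartbeat() {
  hb_interval="$1"; shift
  (
    while true; do
      date
      # Detach sleep from our output so a captured pipe is not held open
      # by a leftover sleep after the loop has been killed.
      sleep "$hb_interval" >/dev/null 2>&1
    done
  ) &
  hb_pid=$!
  "$@"                        # run the wrapped command in the foreground
  hb_status=$?
  kill "$hb_pid" 2>/dev/null  # stop the heartbeat loop
  wait "$hb_pid" 2>/dev/null  # reap it quietly
  return "$hb_status"
}

# Usage with the reproduction above (not run here):
# with_heartbeat 60 julia --project -e 'using Pkg; Pkg.test(; coverage = true, julia_args = ["--threads=auto"])'
```

With a hang like the one reported, the heartbeat keeps printing for as long as the wrapped command never returns, which makes the stall visible in a CI log.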

Also when constraining to 4 CPU cores:

$ docker run -it --rm -v $(pwd):/mnt -w /mnt julia:1 bash -c "while true; do date; sleep 60; done & julia --project -e '@show Sys.CPU_THREADS; using Pkg; Pkg.test(; coverage = true, julia_args = [\"--threads=auto\"])'"
Tue Nov  5 15:29:46 UTC 2024
Sys.CPU_THREADS = 4
  Installing known registries into `~/.julia`
       Added `General` registry to ~/.julia/registries
     Testing Sample
      Status `/tmp/jl_aowubp/Project.toml`
  [a9065fac] Sample v0.1.0 `/mnt`
  [8dfed614] Test v1.11.0
      Status `/tmp/jl_aowubp/Manifest.toml`
  [a9065fac] Sample v0.1.0 `/mnt`
  [2a0f44e3] Base64 v1.11.0
  [b77e0a4c] InteractiveUtils v1.11.0
  [56ddb016] Logging v1.11.0
  [d6f4376e] Markdown v1.11.0
  [9a3f8284] Random v1.11.0
  [ea8e919c] SHA v0.7.0
  [9e88b42a] Serialization v1.11.0
  [8dfed614] Test v1.11.0
Precompiling project for configuration --code-coverage=@/mnt --color=yes --check-bounds=yes --warn-overwrite=yes --depwarn=yes --inline=yes --startup-file=no --track-allocation=none --threads=auto...
Tue Nov  5 15:30:46 UTC 2024                         ]  0/1
Tue Nov  5 15:31:46 UTC 2024                         ]  0/1
Tue Nov  5 15:32:46 UTC 2024                         ]  0/1
Tue Nov  5 15:33:46 UTC 2024                         ]  0/1
Tue Nov  5 15:34:46 UTC 2024                         ]  0/1
  Progress [>                                        ]  0/1
  ◐ Sample

stemann commented 3 weeks ago

Problem seems to be resolved on master:

$ docker run -it --rm -v $(pwd):/mnt -w /mnt debian:bookworm bash -c "apt-get update; apt-get --yes install curl; curl -fsSL https://install.julialang.org | sh -s -- --yes --default-channel nightly; . /root/.bashrc; julia --version; while true; do date; sleep 60; done & julia --project -e '@show Sys.CPU_THREADS; using Pkg; Pkg.test(; coverage = true, julia_args = [\"--threads=auto\"])'"
# ...
julia version 1.12.0-DEV
Tue Nov  5 15:41:55 UTC 2024
Sys.CPU_THREADS = 4
  Installing known registries into `~/.julia`
       Added `General` registry to ~/.julia/registries
    Updating registry at `~/.julia/registries/General.toml`
    Updating `/mnt/Project.toml`
  [8dfed614] ~ Test ⇒ v1.11.0
    Updating `/mnt/Manifest.toml`
  [2a0f44e3] + Base64 v1.11.0
  [b77e0a4c] + InteractiveUtils v1.11.0
  [dc6e5ff7] + JuliaSyntaxHighlighting v1.12.0
  [56ddb016] + Logging v1.11.0
  [d6f4376e] + Markdown v1.11.0
  [9a3f8284] + Random v1.11.0
  [ea8e919c] + SHA v0.7.0
  [9e88b42a] + Serialization v1.11.0
  [f489334b] + StyledStrings v1.11.0
  [8dfed614] ~ Test ⇒ v1.11.0
     Testing Sample
      Status `/tmp/jl_ltFcmE/Project.toml`
  [a9065fac] Sample v0.1.0 `/mnt`
  [8dfed614] Test v1.11.0
      Status `/tmp/jl_ltFcmE/Manifest.toml`
  [a9065fac] Sample v0.1.0 `/mnt`
  [2a0f44e3] Base64 v1.11.0
  [b77e0a4c] InteractiveUtils v1.11.0
  [dc6e5ff7] JuliaSyntaxHighlighting v1.12.0
  [56ddb016] Logging v1.11.0
  [d6f4376e] Markdown v1.11.0
  [9a3f8284] Random v1.11.0
  [ea8e919c] SHA v0.7.0
  [9e88b42a] Serialization v1.11.0
  [f489334b] StyledStrings v1.11.0
  [8dfed614] Test v1.11.0
Precompiling for configuration --code-coverage=@/mnt --color=yes --check-bounds=yes --warn-overwrite=yes --depwarn=yes --inline=yes --startup-file=no --track-allocation=none --threads=auto
Precompiling packages finished.
  1 dependency successfully precompiled in 1 seconds. 8 already precompiled.
     Testing Running tests...
Test Summary: | Pass  Total  Time
Sample        |    1      1  1.3s
     Testing Sample tests passed

stemann commented 3 weeks ago

Alright, I managed to narrow it down a bit more: the issue does not seem to be related to the number of CPU cores available.

The simple invocation works:

julia --threads=auto --project -e "using Pkg; Pkg.test(; coverage = true)"

But the "start Julia without threads, and then ask Pkg.test to run a Julia session with threads" approach hangs (on Julia 1.11.0-1.11.1):

julia --project -e "using Pkg; Pkg.test(; coverage = true, julia_args = [\"--threads=auto\"])"
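
To compare the two invocations above without waiting on a hang indefinitely, each one could be run under a time limit. The following is only a sketch: `try_variant` is a hypothetical helper, the 300-second limit in the usage comments is arbitrary, and it assumes a `timeout` command (GNU coreutils) that exits with status 124 when the limit is reached. A timeout cannot tell a true hang from a slow run, so the limit needs to be well above the normal test duration.

```shell
# Hypothetical helper: run one variant under a time limit and print a verdict.
# Assumes `timeout` from GNU coreutils (exit status 124 = limit reached).
try_variant() {
  tv_label="$1"; tv_limit="$2"; shift 2
  timeout "$tv_limit" "$@" >/dev/null 2>&1
  tv_status=$?
  case "$tv_status" in
    0)   echo "$tv_label: ok" ;;
    124) echo "$tv_label: timed out after $tv_limit" ;;
    *)   echo "$tv_label: failed with status $tv_status" ;;
  esac
}

# Usage against the two invocations above (not run here):
# try_variant "threads on parent" 300 \
#   julia --threads=auto --project -e 'using Pkg; Pkg.test(; coverage = true)'
# try_variant "threads via julia_args" 300 \
#   julia --project -e 'using Pkg; Pkg.test(; coverage = true, julia_args = ["--threads=auto"])'
```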
stemann commented 3 weeks ago

Using

rm Manifest.toml; docker run -it --rm -v $(pwd):/mnt -w /mnt debian:bookworm bash -c "apt-get update; apt-get --yes install curl; curl -fsSL https://install.julialang.org | sh -s -- --yes --default-channel pr55704; . /root/.bashrc; julia --version; julia -e 'using InteractiveUtils; @show versioninfo()'; while true; do date; sleep 60; done & julia --project -e '@show Sys.CPU_THREADS; using Pkg; Pkg.test(; coverage = true, julia_args = [\"--threads=auto\"])'"

stemann commented 1 week ago

Precompilation during testing still hangs with PR #56228: tested just now, after the merge of #56228 (using the juliaup channel pr56228).