JuliaMolSim / Molly.jl

Molecular simulation in Julia

KernelAbstractions support #147

Open leios opened 1 year ago

leios commented 1 year ago

The KernelAbstractions branch now compiles, so I thought I would put forward a quick draft PR while I figure out all the runtime bugs.

Notes:

  1. This builds off of #99 and should replace it entirely.
  2. CUDA has been removed and replaced with KernelAbstractions and GPUArrays. As an important note here, GPUArrays is not strictly necessary except to replicate the behavior of the boolean GPU flag (i.e. isa(a, AbstractGPUArray)).
  3. If this is merged, other GPU types (Metal, AMD, Intel) will also be supported, but I can only test on AMD (and maybe Metal if I can get someone to try it with a Mac).
  4. I need to add in the changes from #133. If there is something we are missing on the KernelAbstractions side, I can try to add it in, but I think we are good to go.

leios commented 1 year ago

Ah, I guess while I'm here, I'll briefly explain the differences with CUDA syntactically:

  1. Indexing is easier: @index(Global / Group / Local, Linear / NTuple / CartesianIndex) vs. (blockIdx().x - 1) * blockDim().x + threadIdx().x for CUDA
  2. Kernels run over an ndrange, the range of elements to process (OpenCL-inspired syntax)
  3. Launching kernels requires configuration with a backend; see: https://github.com/leios/Molly.jl/blob/KA_support/src/kernels.jl#L21
  4. Certain functions now take the backend as an argument: CUDA.zeros(...) -> zeros(backend, args...)
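
For readers who have not used KernelAbstractions, the four points above can be sketched roughly as follows. This is a minimal illustration, not code from this PR: the kernel name scale_add!, its arguments, and the workgroup size of 64 are made up for the example, and it runs on the CPU backend because it uses plain Arrays.

```julia
using KernelAbstractions

# Generic kernel: each work-item handles the element picked by its global linear index.
@kernel function scale_add!(out, @Const(a), @Const(b), alpha)
    i = @index(Global, Linear)
    @inbounds out[i] = alpha * a[i] + b[i]
end

a = rand(Float32, 1024)
b = rand(Float32, 1024)

# The backend is inferred from the array type; for plain Arrays this is CPU().
backend = get_backend(a)

# Allocation goes through the backend rather than through CUDA.zeros(...).
out = KernelAbstractions.zeros(backend, Float32, length(a))

# Instantiate the kernel for a backend and workgroup size, then launch it over an ndrange.
kernel! = scale_add!(backend, 64)
kernel!(out, a, b, 2.0f0; ndrange = length(out))
KernelAbstractions.synchronize(backend)
```

Passing GPU arrays (for example from CUDA.jl or AMDGPU.jl) instead of Arrays should select the matching backend through get_backend without changes to the kernel body.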

The tricky thing about this PR was removing the CUDA dependency outside of the kernels. There is still one call in zygote.jl I gotta figure out: https://github.com/leios/Molly.jl/blob/KA_support/src/zygote.jl#L698

jgreener64 commented 1 year ago

Great work so far. Making the code compatible with generic array types is a nice change, and having the kernels work on different devices would be a selling point of the package.

I would be interested to see the performance of the kernels compared to the CUDA versions. Also whether it plays nicely with Enzyme. Good luck with the runtime errors.

leios commented 1 year ago

I think I can finish this up today or else early next week (emphasis on think), but to quickly answer the questions:

  1. KA essentially just writes vendor-specific code (i.e. CUDA) from the generic code input, so if we don't have identical performance to CUDA, then that's a bug. I'll do the performance testing similar to https://github.com/JuliaMolSim/Molly.jl/pull/133 once the code is cleaned up.
  2. Enzyme should also not be an issue; however, there are some reports of error handling being an issue: https://github.com/EnzymeAD/Enzyme.jl/issues/365

jgreener64 commented 1 year ago

Great. Not urgent, but how well does KernelAbstractions.jl deal with warp-level code, e.g. warpsize() and sync_warp()?

leios commented 1 year ago

That's a good question. We can probably expose the APIs available from CUDA, but I am not sure how AMDGPU deals with these. We would also just need to figure out what that corresponds to on parallel CPU.

I think these are the tools we need: https://rocm.docs.amd.com/projects/rocPRIM/en/latest/warp_ops/index.html So they are available, it's just a matter of exposing them in KA and figuring out what it corresponds to for different backends.

Ah, as an important note (that I somehow failed to mention before), an advantage of KA is that it also provides a parallel CPU implementation, so the kernels can be written once and executed everywhere. I didn't do that in this PR because that brings up design questions related to Molly internals.

jgreener64 commented 1 year ago

> I didn't do that in this PR because that brings up design questions related to Molly internals.

Yeah we can discuss that after this PR. I would be okay with switching if there was no performance hit.

Judging from discussion on the linked PR there is not currently warp support in KA. It may be necessary to leave that CUDA kernel in and have a separate KA kernel for other backends until warp support comes to KA.

leios commented 1 year ago

Ok, so a couple of quick notes here:

  1. There are a few host calls that are not yet supported by AMDGPU (such as findall). My understanding was that such calls would eventually be ported to GPUArrays, but I don't think that has happened yet. Note that some of the stalling here is because we are waiting to get KA into GPUArrays (https://github.com/JuliaGPU/GPUArrays.jl/pull/451). At least for findall, the kernel is not that complex: https://github.com/JuliaGPU/CUDA.jl/blob/master/src/indexing.jl#L23, so we could put it into AMDGPU or something for now; however, we are stuck on an older version of AMDGPU due to some package conflicts. The quick fix would be to do it the ol' fashioned way and just stick the necessary kernels in Molly under a file like kernel_hacks.jl or something. Such issues were also what stalled #99.
  2. #133 seems to only use warpsize and sync_warp for warp-level semantics. The KA kernel would probably get the warpsize on the host and then pass it in as a parameter. sync_warp is a bit more interesting because, well, at least in the old days warps didn't need any synchronizing. It seems that things changed in Volta and most people missed the memo. Because of this, the easiest thing to do would be to keep the CUDA dependency for that one kernel. We could also add in warp-level semantics to KA, but that would take some time to propagate to all the independent GPU APIs and (as mentioned in 1) we are kinda stuck on older versions of AMDGPU and CUDA because of compatibility with other packages.

  3. I am realizing that there is a greater conflict with this PR. Namely, I don't know if I have the bandwidth to do any sort of maintenance on Molly after this PR is in. I don't know if it's fair to ask you to merge 1000 lines of code with a new API and then leave. On the other hand, getting this to work on AMD would be great and really useful. Let me think on that.

jgreener64 commented 1 year ago

> Because of this, the easiest thing to do would be to keep the CUDA dependency for that one kernel.

That is okay.

> I don't know if it's fair to ask you to merge 1000 lines of code with a new API and then leave.

I wouldn't worry about this. Currently I only merge stuff that I am able to maintain, or where I think I can skill up to the point of maintaining it. The changes here seem reasonable and worth merging once any errors and performance regressions are fixed. There is a wider question about whether KernelAbstractions.jl will continue to be maintained compared to CUDA.jl, but it seems to have decent traction now.

leios commented 1 year ago

Yeah, the plan is for KA to be used even within GPUArrays, so it's not going anywhere anytime soon. Speaking of which, the "correct" course of action for KA in Molly would be to get KA into GPUArrays first and then use that to implement any missing features at the GPUArrays level.

Would it be better for me to separate this PR then? Maybe one doing the generic Array stuff and then another with the KA support?

jgreener64 commented 1 year ago

I would try and get this PR working as is. Only if that becomes difficult would it be worth splitting out and merging the generic array support.

If KA is here for the long haul then there is a benefit to switching the kernels even if only CUDA works currently. Because then when changes happen elsewhere, AMDGPU will work without any changes required in Molly.

codecov[bot] commented 3 months ago

Codecov Report

Attention: Patch coverage is 12.82660% with 367 lines in your changes missing coverage. Please review.

Project coverage is 69.40%. Comparing base (18a1991) to head (06de12a).

| Files with missing lines | Patch % | Lines |
|---|---|---|
| src/kernels.jl | 1.07% | 184 Missing |
| src/chain_rules.jl | 0.99% | 100 Missing |
| src/interactions/implicit_solvent.jl | 5.26% | 36 Missing |
| ext/MollyCUDAExt.jl | 0.00% | 12 Missing |
| src/neighbors.jl | 28.57% | 10 Missing |
| src/spatial.jl | 33.33% | 8 Missing |
| src/zygote.jl | 0.00% | 8 Missing |
| ext/MollyPythonCallExt.jl | 0.00% | 2 Missing |
| src/energy.jl | 33.33% | 2 Missing |
| src/force.jl | 33.33% | 2 Missing |

... and 2 more
Additional details and impacted files

```diff
@@            Coverage Diff             @@
##           master     #147      +/-   ##
==========================================
- Coverage   71.68%   69.40%   -2.29%
==========================================
  Files          37       38       +1
  Lines        5549     5726     +177
==========================================
- Hits         3978     3974       -4
- Misses       1571     1752     +181
```


leios commented 1 day ago

Getting around to this and noticed a bunch of segfaults in the CPU tests. I then found that there's a strange conflict between AMDGPU and Molly. Even on the master branch, this script will create a segfault:

using Molly

n_atoms = 100
atom_mass = 10.0f0u"g/mol"
boundary = CubicBoundary(2.0f0u"nm")
temp = 100.0f0u"K"
cpu_coords = place_atoms(n_atoms, boundary; min_dist=0.3u"nm")
cpu_atoms = Array([Atom(mass=atom_mass, σ=0.3f0u"nm", ϵ=0.2f0u"kJ * mol^-1") for
 i in 1:n_atoms])
cpu_velocities = Array([random_velocity(atom_mass, temp) for i in 1:n_atoms])
cpu_simulator = VelocityVerlet(dt=0.002f0u"ps")

cpu_sys = System(
    atoms=cpu_atoms,
    coords=cpu_coords,
    boundary=boundary,
    velocities=cpu_velocities,
    pairwise_inters=(LennardJones(),),
    loggers=(
        temp=TemperatureLogger(typeof(1.0f0u"K"), 10),
        coords=CoordinateLogger(typeof(1.0f0u"nm"), 10),
    ),
)

simulate!(deepcopy(cpu_sys), cpu_simulator, 20) # Compile function
simulate!(cpu_sys, cpu_simulator, 2000)

But only if AMDGPU is loaded before include("cpu.jl"). Not sure how to go about debugging this one, but writing it down so it is documented somewhere. The segfault:


julia> include("tmp/cpu.jl")
System with 100 atoms, boundary CubicBoundary{Quantity{Float32, 𝐋, Unitful.FreeUnits{(nm,), 𝐋, nothing}}}(Quantity{Float32, 𝐋, Unitful.FreeUnits{(nm,), 𝐋, nothing}}[2.0f0 nm, 2.0f0 nm, 2.0f0 nm])

julia> 
[leios@noema Molly.jl]$ julia --project -t 12
               _
   _       _ _(_)_     |  Documentation: https://docs.julialang.org
  (_)     | (_) (_)    |
   _ _   _| |_  __ _   |  Type "?" for help, "]?" for Pkg help.
  | | | | | | |/ _` |  |
  | | |_| | | | (_| |  |  Version 1.10.2 (2024-03-01)
 _/ |\__'_|_|_|\__'_|  |  Official https://julialang.org/ release
|__/                   |

julia> using AMDGPU

julia> include("tmp/cpu.jl")
[1727708527.809644] [noema:37885:0]        spinlock.c:29   UCX  WARN  ucs_recursive_spinlock_destroy() failed: busy
[1727708527.809644] [noema:37885:1]        spinlock.c:29   UCX  WARN  ucs_recursive_spinlock_destroy() failed: busy
[noema:37885:0:37894] Caught signal 11 (Segmentation fault: invalid permissions for mapped object at address 0x756d3781c008)
[noema:37885:1:37897] Caught signal 11 (Segmentation fault: invalid permissions for mapped object at address 0x756d3781c008)
[1727708527.809644] [noema:37885:3]        spinlock.c:29   UCX  WARN  ucs_recursive_spinlock_destroy() failed: busy
[noema:37885:3:37893] Caught signal 11 (Segmentation fault: invalid permissions for mapped object at address 0x756d3781c008)
[1727708527.809649] [noema:37885:2]           debug.c:1297 UCX  WARN  ucs_debug_disable_signal: signal 1 was not set in ucs
[noema:37885:2:37892] Caught signal 11 (Segmentation fault: invalid permissions for mapped object at address 0x756d3781c008)
[1727708527.809644] [noema:37885:4]        spinlock.c:29   UCX  WARN  ucs_recursive_spinlock_destroy() failed: busy
[noema:37885:4:37889] Caught signal 11 (Segmentation fault: invalid permissions for mapped object at address 0x756d3781c008)
[noema:37885:6:37890] Caught signal 11 (Segmentation fault: invalid permissions for mapped object at address 0x756d3781c008)
[1727708527.809728] [noema:37885:0]        spinlock.c:29   UCX  WARN  ucs_recursive_spinlock_destroy() failed: busy
[noema:37885:7:37888] Caught signal 11 (Segmentation fault: invalid permissions for mapped object at address 0x756d3781c008)
[noema:37885:8:37891] Caught signal 11 (Segmentation fault: invalid permissions for mapped object at address 0x756d3781c008)
[noema:37885:9:37895] Caught signal 11 (Segmentation fault: invalid permissions for mapped object at address 0x756d3781c008)
[1727708527.809730] [noema:37885:1]        spinlock.c:29   UCX  WARN  ucs_recursive_spinlock_destroy() failed: busy
[1727708527.809735] [noema:37885:5]        spinlock.c:29   UCX  WARN  ucs_recursive_spinlock_destroy() failed: busy
[noema:37885:5:37898] Caught signal 11 (Segmentation fault: invalid permissions for mapped object at address 0x756d3781c008)
[1727708527.809741] [noema:37885:3]        spinlock.c:29   UCX  WARN  ucs_recursive_spinlock_destroy() failed: busy
[noema:37885:10:37896] Caught signal 11 (Segmentation fault: invalid permissions for mapped object at address 0x756d3781c008)
==== backtrace (tid:  37894) ====
 0 0x000000000004d212 ucs_event_set_fd_get()  ???:0
 1 0x000000000004d3dd ucs_event_set_fd_get()  ???:0
 2 0x000000000003d1d0 __sigaction()  ???:0
 3 0x00000000000845d4 ijl_process_events()  /cache/build/builder-amdci5-1/julialang/julia-release-1-dot-10/src/jl_uv.c:277
 4 0x0000000000097f8d ijl_task_get_next()  /cache/build/builder-amdci5-1/julialang/julia-release-1-dot-10/src/partr.c:524
 5 0x0000000001cb0bd8 julia_poptask_75383()  ./task.jl:985
 6 0x0000000001cb0bd8 julia_poptask_75383()  ./task.jl:987
 7 0x0000000000997f72 julia_wait_74665()  ./task.jl:994
 8 0x0000000000962c1c julia_task_done_hook_75296()  ./task.jl:675
 9 0x0000000001443a97 jfptr_task_done_hook_75297.1()  :0
10 0x0000000000046a0e _jl_invoke()  /cache/build/builder-amdci5-1/julialang/julia-release-1-dot-10/src/gf.c:2894
11 0x0000000000069c17 jl_apply()  /cache/build/builder-amdci5-1/julialang/julia-release-1-dot-10/src/julia.h:1982
12 0x0000000000069d9e start_task()  /cache/build/builder-amdci5-1/julialang/julia-release-1-dot-10/src/task.c:1249
=================================

[37885] signal (11.-6): Segmentation fault
in expression starting at /home/leios/projects/CESMIX/Molly.jl/tmp/cpu.jl:25
ijl_process_events at /cache/build/builder-amdci5-1/julialang/julia-release-1-dot-10/src/jl_uv.c:277
ijl_task_get_next at /cache/build/builder-amdci5-1/julialang/julia-release-1-dot-10/src/partr.c:524
poptask at ./task.jl:985
wait at ./task.jl:994
task_done_hook at ./task.jl:675
jfptr_task_done_hook_75297.1 at /home/leios/builds/julia-1.10.2/lib/julia/sys.so (unknown line)
_jl_invoke at /cache/build/builder-amdci5-1/julialang/julia-release-1-dot-10/src/gf.c:2894 [inlined]
ijl_apply_generic at /cache/build/builder-amdci5-1/julialang/julia-release-1-dot-10/src/gf.c:3076
jl_apply at /cache/build/builder-amdci5-1/julialang/julia-release-1-dot-10/src/julia.h:1982 [inlined]
jl_finish_task at /cache/build/builder-amdci5-1/julialang/julia-release-1-dot-10/src/task.c:320
start_task at /cache/build/builder-amdci5-1/julialang/julia-release-1-dot-10/src/task.c:1249
Allocations: 39536048 (Pool: 39464158; Big: 71890); GC: 46
Segmentation fault (core dumped)

st:

(Molly) pkg> st
Project Molly v0.21.1
Status `~/projects/CESMIX/Molly.jl/Project.toml`
  [a9b6321e] Atomix v0.1.0
⌅ [a963bdd2] AtomsBase v0.3.5
  [a3e0e189] AtomsCalculators v0.2.2
  [de9282ab] BioStructures v4.2.0
⌃ [052768ef] CUDA v5.4.3
  [69e1c6dd] CellListMap v0.9.6
  [082447d4] ChainRules v1.71.0
  [d360d2e6] ChainRulesCore v1.25.0
  [46823bd8] Chemfiles v0.10.41
  [861a8166] Combinatorics v1.0.2
  [864edb3b] DataStructures v0.18.20
  [b4f34e82] Distances v0.10.11
  [31c24e10] Distributions v0.25.112
⌅ [7da242da] Enzyme v0.12.36
  [8f5d6c58] EzXML v1.2.0
  [cc61a311] FLoops v0.2.2
  [f6369f11] ForwardDiff v0.10.36
  [86223c79] Graphs v1.12.0
  [5ab0869b] KernelDensity v0.6.9
  [b8a86587] NearestNeighbors v0.4.20
  [7b2266bf] PeriodicTable v1.2.1
  [189a3867] Reexport v1.2.2
⌅ [64031d72] SimpleCrystals v0.2.0
  [90137ffa] StaticArrays v1.9.7
  [1986cc42] Unitful v1.21.0
  [a7773ee8] UnitfulAtomic v1.0.0
  [f31437dd] UnitfulChainRules v0.1.2
  [d80eeb9a] UnsafeAtomicsLLVM v0.2.1
  [e88e6eb3] Zygote v0.6.71
  [37e2e46d] LinearAlgebra
  [9a3f8284] Random
  [2f01184e] SparseArrays v1.10.0
  [10745b16] Statistics v1.10.0
Info Packages marked with ⌃ and ⌅ have new versions available. Those with ⌃ may be upgradable, but those with ⌅ are restricted by compatibility constraints from upgrading. To see why use `status --outdated`

Note that using a single thread "fixes" the issue. It seems to be a UCX / MPI issue, but I am not loading either of them and neither is in the Manifest.

vchuravy commented 1 day ago

This looks exactly like https://juliaparallel.org/MPI.jl/stable/knownissues/#Multi-threading-and-signal-handling

What is st -m?

leios commented 1 day ago

The fix mentioned there seems to work:

[leios@noema Molly.jl]$ export UCX_ERROR_SIGNALS="SIGILL,SIGBUS,SIGFPE"
[leios@noema Molly.jl]$ julia --project -t 12
               _
   _       _ _(_)_     |  Documentation: https://docs.julialang.org
  (_)     | (_) (_)    |
   _ _   _| |_  __ _   |  Type "?" for help, "]?" for Pkg help.
  | | | | | | |/ _` |  |
  | | |_| | | | (_| |  |  Version 1.10.2 (2024-03-01)
 _/ |\__'_|_|_|\__'_|  |  Official https://julialang.org/ release
|__/                   |

julia> using AMDGPU

julia> include("tmp/cpu.jl")
System with 100 atoms, boundary CubicBoundary{Quantity{Float32, 𝐋, Unitful.FreeUnits{(nm,), 𝐋, nothing}}}(Quantity{Float32, 𝐋, Unitful.FreeUnits{(nm,), 𝐋, nothing}}[2.0f0 nm, 2.0f0 nm, 2.0f0 nm])

julia> 
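
The same workaround can presumably be applied from inside Julia instead of the shell, as long as it runs before whatever loads UCX. The sketch below is an assumption-laden illustration: the helper name set_ucx_error_signals! is hypothetical, and it assumes UCX reads the variable when its library is first loaded. The signal list is the one from the export above; per the linked MPI.jl page, the point is to leave SIGSEGV out because multithreaded Julia uses that signal internally.

```julia
# Hypothetical helper: set UCX_ERROR_SIGNALS unless the user already chose a value.
# `env` defaults to the process environment; any dictionary works for testing.
function set_ucx_error_signals!(env = ENV; signals = "SIGILL,SIGBUS,SIGFPE")
    if !haskey(env, "UCX_ERROR_SIGNALS")
        env["UCX_ERROR_SIGNALS"] = signals
    end
    return env["UCX_ERROR_SIGNALS"]
end

# Must run before any package that may pull in UCX (assumed here to be AMDGPU).
set_ucx_error_signals!()
# using AMDGPU   # load GPU packages only after the variable is set
```

Putting this at the top of a test script or in startup.jl would avoid relying on every shell having the export in place.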

Note that st -m shows no MPI or UCX:

(Molly) pkg> st -m
Project Molly v0.21.1
Status `~/projects/CESMIX/Molly.jl/Manifest.toml`
  [621f4979] AbstractFFTs v1.5.0
  [7d9f7c33] Accessors v0.1.38
  [79e6a3ab] Adapt v4.0.4
  [66dad0bd] AliasTables v1.1.3
  [dce04be8] ArgCheck v2.3.0
  [ec485272] ArnoldiMethod v0.4.0
  [a9b6321e] Atomix v0.1.0
⌅ [a963bdd2] AtomsBase v0.3.5
  [a3e0e189] AtomsCalculators v0.2.2
  [13072b0f] AxisAlgorithms v1.1.0
  [ab4f0b2a] BFloat16s v0.5.0
  [198e06fe] BangBang v0.4.3
  [9718e550] Baselet v0.1.1
  [47718e42] BioGenerics v0.1.5
  [de9282ab] BioStructures v4.2.0
  [3c28c6f8] BioSymbols v5.1.3
  [fa961155] CEnum v0.5.0
⌃ [052768ef] CUDA v5.4.3
  [1af6417a] CUDA_Runtime_Discovery v0.3.5
  [69e1c6dd] CellListMap v0.9.6
  [082447d4] ChainRules v1.71.0
  [d360d2e6] ChainRulesCore v1.25.0
  [46823bd8] Chemfiles v0.10.41
  [944b1d66] CodecZlib v0.7.6
  [3da002f7] ColorTypes v0.11.5
  [5ae59095] Colors v0.12.11
  [861a8166] Combinatorics v1.0.2
  [bbf7d656] CommonSubexpressions v0.3.1
  [34da2185] Compat v4.16.0
  [a33af91c] CompositionsBase v0.1.2
  [187b0558] ConstructionBase v1.5.8
  [6add18c4] ContextVariablesX v0.1.3
  [a8cc5b0e] Crayons v4.1.1
  [9a962f9c] DataAPI v1.16.0
  [a93c6f00] DataFrames v1.7.0
  [864edb3b] DataStructures v0.18.20
  [e2d170a0] DataValueInterfaces v1.0.0
  [244e2a9f] DefineSingletons v0.1.2
  [163ba53b] DiffResults v1.1.0
  [b552c78f] DiffRules v1.15.1
  [b4f34e82] Distances v0.10.11
  [31c24e10] Distributions v0.25.112
  [ffbed154] DocStringExtensions v0.9.3
⌅ [7da242da] Enzyme v0.12.36
⌅ [f151be2c] EnzymeCore v0.7.8
  [e2ba6199] ExprTools v0.1.10
  [8f5d6c58] EzXML v1.2.0
  [7a1cc6ca] FFTW v1.8.0
  [cc61a311] FLoops v0.2.2
  [b9860ae5] FLoopsBase v0.1.1
  [1a297f60] FillArrays v1.13.0
  [53c48c17] FixedPointNumbers v0.8.5
  [1fa38f19] Format v1.3.7
  [f6369f11] ForwardDiff v0.10.36
  [0c68f7d7] GPUArrays v10.3.1
  [46192b85] GPUArraysCore v0.1.6
⌅ [61eb1bfa] GPUCompiler v0.26.7
  [86223c79] Graphs v1.12.0
  [34004b35] HypergeometricFunctions v0.3.24
  [7869d1d1] IRTools v0.4.14
  [d25df0c9] Inflate v0.1.5
  [22cec73e] InitialValues v0.3.1
  [842dd82b] InlineStrings v1.4.2
  [a98d9a8b] Interpolations v0.15.1
  [3587e190] InverseFunctions v0.1.17
  [41ab1584] InvertedIndices v1.3.0
  [92d709cd] IrrationalConstants v0.2.2
  [82899510] IteratorInterfaceExtensions v1.0.0
  [692b3bcd] JLLWrappers v1.6.0
  [b14d175d] JuliaVariables v0.2.4
⌃ [63c18a36] KernelAbstractions v0.9.26
  [5ab0869b] KernelDensity v0.6.9
⌅ [929cbde3] LLVM v8.1.0
  [8b046642] LLVMLoopInfo v1.0.0
  [b964fa9f] LaTeXStrings v1.3.1
  [2ab3a3ac] LogExpFunctions v0.3.28
  [d8e11817] MLStyle v0.4.17
  [1914dd2f] MacroTools v0.5.13
  [128add7d] MicroCollections v0.2.0
  [e1d29d7a] Missings v1.2.0
  [5da4648a] NVTX v0.3.4
  [77ba4419] NaNMath v1.0.2
  [71a1bf82] NameResolution v0.1.5
  [b8a86587] NearestNeighbors v0.4.20
  [d8793406] ObjectFile v0.4.2
  [6fe1bfb0] OffsetArrays v1.14.1
  [bac558e1] OrderedCollections v1.6.3
  [90014a1f] PDMats v0.11.31
  [d96e819e] Parameters v0.12.3
  [7b2266bf] PeriodicTable v1.2.1
  [2dfb63ee] PooledArrays v1.4.3
  [aea7be01] PrecompileTools v1.2.1
  [21216c6a] Preferences v1.4.3
  [8162dcfd] PrettyPrint v0.2.0
  [08abe8d2] PrettyTables v2.4.0
  [92933f4c] ProgressMeter v1.10.2
  [43287f4e] PtrArrays v1.2.1
  [1fd47b50] QuadGK v2.11.1
  [74087812] Random123 v1.7.0
  [e6cf234a] RandomNumbers v1.6.0
  [c84ed2f1] Ratios v0.4.5
  [c1ae055f] RealDot v0.1.0
  [3cdcf5f2] RecipesBase v1.3.4
  [189a3867] Reexport v1.2.2
  [ae029012] Requires v1.3.0
  [79098fc4] Rmath v0.8.0
  [6c6a2e73] Scratch v1.2.1
  [91c51154] SentinelArrays v1.4.5
  [efcf1570] Setfield v1.1.1
⌅ [64031d72] SimpleCrystals v0.2.0
  [699a6c99] SimpleTraits v0.9.4
  [a2af1166] SortingAlgorithms v1.2.1
  [dc90abb0] SparseInverseSubset v0.1.2
  [276daf66] SpecialFunctions v2.4.0
  [171d559e] SplittablesBase v0.1.15
  [90137ffa] StaticArrays v1.9.7
  [1e83bf80] StaticArraysCore v1.4.3
  [82ae8749] StatsAPI v1.7.0
  [2913bbd2] StatsBase v0.34.3
  [4c63d2b9] StatsFuns v1.3.2
  [892a3eda] StringManipulation v0.4.0
  [09ab397b] StructArrays v0.6.18
  [53d494c1] StructIO v0.3.1
  [3783bdb8] TableTraits v1.0.1
  [bd369af6] Tables v1.12.0
  [1c621080] TestItems v1.0.0
  [a759f4b9] TimerOutputs v0.5.24
  [3bb67fe8] TranscodingStreams v0.11.2
  [28d57a85] Transducers v0.4.82
  [3a884ed6] UnPack v1.0.2
  [1986cc42] Unitful v1.21.0
  [a7773ee8] UnitfulAtomic v1.0.0
  [f31437dd] UnitfulChainRules v0.1.2
  [013be700] UnsafeAtomics v0.2.1
  [d80eeb9a] UnsafeAtomicsLLVM v0.2.1
  [efce3f68] WoodburyMatrices v1.0.0
  [e88e6eb3] Zygote v0.6.71
  [700de1a5] ZygoteRules v0.2.5
⌅ [4ee394cb] CUDA_Driver_jll v0.9.2+0
⌅ [76a88914] CUDA_Runtime_jll v0.14.1+0
  [78a364fa] Chemfiles_jll v0.10.4+0
⌅ [7cc45869] Enzyme_jll v0.0.148+0
  [f5851436] FFTW_jll v3.3.10+1
  [1d5cc7b8] IntelOpenMP_jll v2024.2.1+0
  [9c1d0b0a] JuliaNVTXCallbacks_jll v0.2.1+0
⌅ [dad2f222] LLVMExtra_jll v0.0.31+0
  [94ce4f54] Libiconv_jll v1.17.0+0
  [856f044c] MKL_jll v2024.2.0+0
  [e98f9f5b] NVTX_jll v3.1.0+2
  [efe28fd5] OpenSpecFun_jll v0.5.5+0
  [f50d1b31] Rmath_jll v0.5.1+0
  [02c8fc9c] XML2_jll v2.13.3+0
  [1317d2d5] oneTBB_jll v2021.12.0+0
  [0dad84c5] ArgTools v1.1.1
  [56f22d72] Artifacts
  [2a0f44e3] Base64
  [ade2ca70] Dates
  [8ba89e20] Distributed
  [f43a241f] Downloads v1.6.0
  [7b1f6079] FileWatching
  [9fa8497b] Future
  [b77e0a4c] InteractiveUtils
  [4af54fe1] LazyArtifacts
  [b27032c2] LibCURL v0.6.4
  [76f85450] LibGit2
  [8f399da3] Libdl
  [37e2e46d] LinearAlgebra
  [56ddb016] Logging
  [d6f4376e] Markdown
  [a63ad114] Mmap
  [ca575930] NetworkOptions v1.2.0
  [44cfe95a] Pkg v1.10.0
  [de0858da] Printf
  [3fa0cd96] REPL
  [9a3f8284] Random
  [ea8e919c] SHA v0.7.0
  [9e88b42a] Serialization
  [1a1011a3] SharedArrays
  [6462fe0b] Sockets
  [2f01184e] SparseArrays v1.10.0
  [10745b16] Statistics v1.10.0
  [4607b0f0] SuiteSparse
  [fa267f1f] TOML v1.0.3
  [a4e569a6] Tar v1.10.0
  [8dfed614] Test
  [cf7118a7] UUIDs
  [4ec0a83e] Unicode
  [e66e0078] CompilerSupportLibraries_jll v1.1.0+0
  [deac9b47] LibCURL_jll v8.4.0+0
  [e37daf67] LibGit2_jll v1.6.4+0
  [29816b5a] LibSSH2_jll v1.11.0+1
  [c8ffd9c3] MbedTLS_jll v2.28.2+1
  [14a3606d] MozillaCACerts_jll v2023.1.10
  [4536629a] OpenBLAS_jll v0.3.23+4
  [05823500] OpenLibm_jll v0.8.1+2
  [bea87d4a] SuiteSparse_jll v7.2.1+1
  [83775a58] Zlib_jll v1.2.13+1
  [8e850b90] libblastrampoline_jll v5.8.0+1
  [8e850ede] nghttp2_jll v1.52.0+1
  [3f19e933] p7zip_jll v17.4.0+2
Info Packages marked with ⌃ and ⌅ have new versions available. Those with ⌃ may be upgradable, but those with ⌅ are restricted by compatibility constraints from upgrading. To see why use `status --outdated -m`

vchuravy commented 9 hours ago

Wild... What is Libc.Libdl.dllist()? Who loads this darn library?

leios commented 9 hours ago

julia> Libc.Libdl.dllist()
32-element Vector{String}:
 "linux-vdso.so.1"
 "/usr/lib/libdl.so.2"
 "/usr/lib/libpthread.so.0"
 "/usr/lib/libc.so.6"
 "/home/leios/builds/julia-1.10.2/bin/../lib/libjulia.so.1.10"
 "/lib64/ld-linux-x86-64.so.2"
 "/home/leios/builds/julia-1.10.2/bin/../lib/julia/libgcc_s.so.1"
 "/home/leios/builds/julia-1.10.2/bin/../lib/julia/libopenlibm.so"
 "/usr/lib/libstdc++.so.6"
 "/usr/lib/libm.so.6"
 "/home/leios/builds/julia-1.10.2/bin/../lib/julia/libjulia-internal.so.1.10"
 "/home/leios/builds/julia-1.10.2/bin/../lib/julia/libunwind.so.8"
 "/usr/lib/librt.so.1"
 "/home/leios/builds/julia-1.10.2/bin/../lib/julia/libz.so.1"
 "/home/leios/builds/julia-1.10.2/bin/../lib/julia/libatomic.so.1"
 "/home/leios/builds/julia-1.10.2/bin/../lib/julia/libjulia-codegen.so.1.10"
 "/home/leios/builds/julia-1.10.2/bin/../lib/julia/libLLVM-15jl.so"
 "/home/leios/builds/julia-1.10.2/lib/julia/sys.so"
 "/home/leios/builds/julia-1.10.2/bin/../lib/julia/libpcre2-8.so"
 "/home/leios/builds/julia-1.10.2/bin/../lib/julia/libgmp.so.10"
 "/home/leios/builds/julia-1.10.2/bin/../lib/julia/libmpfr.so.6"
 "/home/leios/builds/julia-1.10.2/bin/../lib/julia/libgfortran.so.5"
 "/home/leios/builds/julia-1.10.2/bin/../lib/julia/libquadmath.so.0"
 "/home/leios/builds/julia-1.10.2/bin/../lib/julia/libopenblas64_.so"
 "/home/leios/builds/julia-1.10.2/bin/../lib/julia/libblastrampoline.so.5"
 "/home/leios/builds/julia-1.10.2/bin/../lib/julia/libmbedcrypto.so.7"
 "/home/leios/builds/julia-1.10.2/bin/../lib/julia/libmbedtls.so.14"
 "/home/leios/builds/julia-1.10.2/bin/../lib/julia/libmbedx509.so.1"
 "/home/leios/builds/julia-1.10.2/bin/../lib/julia/libssh2.so.1"
 "/home/leios/builds/julia-1.10.2/bin/../lib/julia/libgit2.so.1.6"
 "/home/leios/builds/julia-1.10.2/bin/../lib/julia/libnghttp2.so.14"
 "/home/leios/builds/julia-1.10.2/bin/../lib/julia/libcurl.so.4"

Is it a Linux thing like libpthread?
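
To narrow down who loads the library, one option is to filter the dllist() output for suspicious names in the same session, after using AMDGPU. This is only a sketch: the helper name loaded_libs_matching is hypothetical, and the name fragments in the pattern are guesses at how UCX and MPI libraries are usually named, so an empty result does not prove they are absent.

```julia
using Libdl

# Return the loaded shared libraries whose path matches `pattern`.
# `libs` defaults to everything currently loaded into this process.
function loaded_libs_matching(pattern::Regex; libs = Libdl.dllist())
    return filter(path -> occursin(pattern, path), libs)
end

# Guessed name fragments (libucs, libucp, libuct, libucx, libmpi, ...), case-insensitive.
suspects = loaded_libs_matching(r"ucx|ucs|ucp|uct|mpi"i)
foreach(println, suspects)
```

Comparing the result before and after using AMDGPU would show whether the library arrives with that package or was already present.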