JuliaLang / Pkg.jl

Pkg - Package manager for the Julia programming language
https://pkgdocs.julialang.org

Adding packages is thread-unsafe? #2219

Open giordano opened 3 years ago

giordano commented 3 years ago

In Yggdrasil, where we happen to install the same packages in parallel, we often have issues with files that disappear. The latest example:

ERROR: LoadError: Error when installing package SuiteSparse32_jll:
IOError: unlink: no such file or directory (ENOENT)
Stacktrace:
 [1] uv_error
   @ /buildworker/worker/package_linux64/build/usr/share/julia/stdlib/v1.6/Pkg/src/Operations.jl:802 [inlined]
 [9] (::Pkg.Operations.var"#62#65"{Bool, Pkg.Types.Context, Dict{Base.UUID, Vector{String}}, Channel{Any}, Channel{Tuple{Pkg.Types.PackageSpec, String}}})()
   @ Pkg.Operations ./task.jl:395
Stacktrace:
  [1] pkgerror(::String, ::Vararg{String, N} where N)
    @ Pkg.Types /buildworker/worker/package_linux64/build/usr/share/julia/stdlib/v1.6/Pkg/src/Types.jl:52
  [2] macro expansion
    @ /buildworker/worker/package_linux64/build/usr/share/julia/stdlib/v1.6/Pkg/src/Operations.jl:819 [inlined]
  [3] macro expansion
    @ ./task.jl:371 [inlined]
  [4] download_source(ctx::Pkg.Types.Context, pkgs::Vector{Pkg.Types.PackageSpec}, urls::Dict{Base.UUID, Vector{String}}; readonly::Bool)
    @ Pkg.Operations /buildworker/worker/package_linux64/build/usr/share/julia/stdlib/v1.6/Pkg/src/Operations.jl:775
  [5] #download_source#58
    @ /buildworker/worker/package_linux64/build/usr/share/julia/stdlib/v1.6/Pkg/src/Operations.jl:750 [inlined]
  [6] download_source
    @ /buildworker/worker/package_linux64/build/usr/share/julia/stdlib/v1.6/Pkg/src/Operations.jl:748 [inlined]
  [7] add(ctx::Pkg.Types.Context, pkgs::Vector{Pkg.Types.PackageSpec}, new_git::Vector{Base.UUID}; preserve::Pkg.Types.PreserveLevel, platform::Platform)
    @ Pkg.Operations /buildworker/worker/package_linux64/build/usr/share/julia/stdlib/v1.6/Pkg/src/Operations.jl:1225
  [8] add(ctx::Pkg.Types.Context, pkgs::Vector{Pkg.Types.PackageSpec}; preserve::Pkg.Types.PreserveLevel, platform::Platform, kwargs::Base.Iterators.Pairs{Symbol, Base.TTY, Tuple{Symbol}, NamedTuple{(:io,), Tuple{Base.TTY}}})
    @ Pkg.API /buildworker/worker/package_linux64/build/usr/share/julia/stdlib/v1.6/Pkg/src/API.jl:194
  [9] (::BinaryBuilderBase.var"#58#64"{Bool, Prefix, Vector{Pkg.Types.PackageSpec}, Platform, Vector{String}})()
    @ BinaryBuilderBase /depot/packages/BinaryBuilderBase/66EAL/src/Prefix.jl:436
 [10] activate(f::BinaryBuilderBase.var"#58#64"{Bool, Prefix, Vector{Pkg.Types.PackageSpec}, Platform, Vector{String}}, new_project::String)
    @ Pkg.API /buildworker/worker/package_linux64/build/usr/share/julia/stdlib/v1.6/Pkg/src/API.jl:1351
 [11] setup_dependencies(prefix::Prefix, dependencies::Vector{Pkg.Types.PackageSpec}, platform::Platform; verbose::Bool)
    @ BinaryBuilderBase /depot/packages/BinaryBuilderBase/66EAL/src/Prefix.jl:429
 [12] autobuild(dir::AbstractString, src_name::AbstractString, src_version::VersionNumber, sources::Vector{var"#s1129"} where var"#s1129"<:BinaryBuilderBase.AbstractSource, script::AbstractString, platforms::Vector{T} where T, products::Vector{var"#s1128"} where var"#s1128"<:Product, dependencies::Vector{var"#s824"} where var"#s824"<:BinaryBuilderBase.AbstractDependency; verbose::Bool, debug::Bool, skip_audit::Bool, ignore_audit_errors::Bool, autofix::Bool, code_dir::Union{Nothing, String}, require_license::Bool, kwargs::Any)
    @ BinaryBuilder /depot/packages/BinaryBuilder/LCVcc/src/AutoBuild.jl:651
 [13] build_tarballs(ARGS::Any, src_name::Any, src_version::Any, sources::Any, script::Any, platforms::Any, products::Any, dependencies::Any; kwargs::Any)
    @ BinaryBuilder /depot/packages/BinaryBuilder/LCVcc/src/AutoBuild.jl:264
 [14] top-level scope
    @ /agent/_work/1/s/S/Sundials/Sundials32@5/build_tarballs.jl:116
in expression starting at /agent/_work/1/s/S/Sundials/Sundials32@5/build_tarballs.jl:116

So Pkg.add deletes files?

StefanKarpinski commented 3 years ago

I don't think any effort has gone into making package operations thread-safe. It would probably make sense to just have a "global package lock" that prevents more than one package operation from being in progress at a time.
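
A minimal sketch of the "global package lock" idea, assuming a single process with several threads. This is not Pkg code: Python is used only for illustration, and the names `_PKG_LOCK`, `package_operation`, `resolve` and `add` are hypothetical. The lock is reentrant so that one guarded operation can call another without deadlocking.

```python
import functools
import threading

# Hypothetical process-wide lock shared by every package operation.
# RLock is reentrant: the thread that holds it may acquire it again.
_PKG_LOCK = threading.RLock()


def package_operation(func):
    """Decorator (hypothetical) that serializes calls to public entry points."""
    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        with _PKG_LOCK:
            return func(*args, **kwargs)
    return wrapper


@package_operation
def resolve(env):
    # Stand-in for dependency resolution: return the environment, sorted.
    return sorted(env)


@package_operation
def add(env, name):
    # Stand-in for installing a package into a shared environment.
    env.add(name)
    # Nested call: re-acquires the same lock in the same thread.
    return resolve(env)
```

A lock like this only covers threads or tasks inside one process; it does nothing for two separate processes working on the same depot.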

fonsp commented 3 years ago

This is an issue for Pluto users, since starting two notebooks at the same time is fairly common. This also means that we can't run notebooks in parallel inside GitHub Actions.

I can implement this lock in the future built-in Pkg stuff, but users who manage their environment manually (by calling Pkg.activate) will still be affected.

StefanKarpinski commented 3 years ago

Are the notebooks run in the same process?

fonsp commented 3 years ago

Thanks for pointing that out. No, they run in separate processes, so my guess is that this issue is present in any form of parallelism on the same file system. Maybe the registry update is the culprit?

StefanKarpinski commented 3 years ago

There are two levels of potential synchronization needed: same-process and inter-process. Within one process, we can just put a global lock at the entrance to Pkg APIs. Locking between processes is much harder; we could use a mechanism like flock, but that doesn't work on all file systems (notoriously not on NFS, iirc). We also don't need to lock between processes all the time. Since many of the things that Pkg installs are immutable, it's often fine if two processes install them concurrently, as long as we use the pattern of creating a temp version in the same file system and only moving it into place at the last moment. That way one of the two processes "wins" by going last, but it doesn't matter since they both install identical content.
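
The temp-then-move pattern can be sketched as follows. This is an illustration only, not how Pkg is implemented: it is written in Python, handles a single file rather than a package directory, and `install_atomically` is a hypothetical helper. It relies on `os.replace`, which renames over an existing file and is atomic on POSIX file systems when source and destination are on the same file system.

```python
import os
import tempfile


def install_atomically(dest, content):
    """Hypothetical helper: write `content` (bytes) to `dest` so that other
    processes see either the old file, no file, or the complete new file."""
    parent = os.path.dirname(os.path.abspath(dest))
    os.makedirs(parent, exist_ok=True)

    # Create the temp file next to the destination so that the final
    # rename stays within one file system.
    fd, tmp = tempfile.mkstemp(dir=parent)
    try:
        with os.fdopen(fd, "wb") as f:
            f.write(content)
            f.flush()
            os.fsync(f.fileno())  # push the data to disk before publishing
        # Move into place at the last moment. If another process already
        # installed `dest`, this overwrites it with identical content.
        os.replace(tmp, dest)
    except BaseException:
        # Only our own temp file is cleaned up, never `dest` itself.
        try:
            os.unlink(tmp)
        except FileNotFoundError:
            pass
        raise
    return dest
```

Each process only ever deletes its own temp file, so a concurrent installer cannot have the destination removed from under it. Directories need more care, since renaming onto an existing non-empty directory fails rather than replacing it.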

fonsp commented 3 years ago

This package solves the issue and links to more discussions about it: https://github.com/simonbyrne/PkgLock.jl