JuliaLang / Pkg.jl

Pkg - Package manager for the Julia programming language
https://pkgdocs.julialang.org

Precompiling when using a single package causes unnecessary precompilation of other packages #3871

Open mkitti opened 5 months ago

mkitti commented 5 months ago
  1. Create an environment with ThreadsX and DataFrames
  2. Remove the files in ~/.julia/compiled/v#.#/DataValueInterfaces
  3. Activate the environment
  4. Run `using ThreadsX`

Running `using ThreadsX` causes DataFrames to be precompiled, even though DataFrames does not need to be precompiled in order to use ThreadsX.

Reproduced on Julia 1.10.2 and Julia 1.11.0-beta1.

mkitti commented 5 months ago
(jar) pkg> st
Status `~/jar/Project.toml`
  [a93c6f00] DataFrames v1.6.1
  [ac1d9e8a] ThreadsX v0.1.12

Project.toml:

[deps]
DataFrames = "a93c6f00-e57d-5684-b7b6-d8193f3e46c0"
ThreadsX = "ac1d9e8a-700a-412c-b207-f0111f4b6c0d"

] st -m for v1.10:

(jar) pkg> st -m
Status `~/jar/Manifest.toml`
  [7d9f7c33] Accessors v0.1.36
  [79e6a3ab] Adapt v4.0.4
  [dce04be8] ArgCheck v2.3.0
  [198e06fe] BangBang v0.4.1
  [9718e550] Baselet v0.1.1
  [34da2185] Compat v4.14.0
  [a33af91c] CompositionsBase v0.1.2
  [187b0558] ConstructionBase v1.5.5
  [a8cc5b0e] Crayons v4.1.1
  [9a962f9c] DataAPI v1.16.0
  [a93c6f00] DataFrames v1.6.1
  [864edb3b] DataStructures v0.18.19
  [e2d170a0] DataValueInterfaces v1.0.0
  [244e2a9f] DefineSingletons v0.1.2
  [22cec73e] InitialValues v0.3.1
  [842dd82b] InlineStrings v1.4.0
  [3587e190] InverseFunctions v0.1.13
  [41ab1584] InvertedIndices v1.3.0
  [82899510] IteratorInterfaceExtensions v1.0.0
  [b964fa9f] LaTeXStrings v1.3.1
  [1914dd2f] MacroTools v0.5.13
  [128add7d] MicroCollections v0.2.0
  [e1d29d7a] Missings v1.2.0
  [bac558e1] OrderedCollections v1.6.3
  [69de0a69] Parsers v2.8.1
  [2dfb63ee] PooledArrays v1.4.3
  [aea7be01] PrecompileTools v1.2.1
  [21216c6a] Preferences v1.4.3
  [08abe8d2] PrettyTables v2.3.1
  [189a3867] Reexport v1.2.2
  [42d2dcc6] Referenceables v0.1.3
  [ae029012] Requires v1.3.0
  [91c51154] SentinelArrays v1.4.1
  [efcf1570] Setfield v1.1.1
  [a2af1166] SortingAlgorithms v1.2.1
  [171d559e] SplittablesBase v0.1.15
  [1e83bf80] StaticArraysCore v1.4.2
  [892a3eda] StringManipulation v0.3.4
  [3783bdb8] TableTraits v1.0.1
  [bd369af6] Tables v1.11.1
  [ac1d9e8a] ThreadsX v0.1.12
  [28d57a85] Transducers v0.4.81
  [56f22d72] Artifacts
  [2a0f44e3] Base64
  [ade2ca70] Dates
  [8ba89e20] Distributed
  [9fa8497b] Future
  [b77e0a4c] InteractiveUtils
  [8f399da3] Libdl
  [37e2e46d] LinearAlgebra
  [56ddb016] Logging
  [d6f4376e] Markdown
  [de0858da] Printf
  [3fa0cd96] REPL
  [9a3f8284] Random
  [ea8e919c] SHA v0.7.0
  [9e88b42a] Serialization
  [6462fe0b] Sockets
  [2f01184e] SparseArrays v1.10.0
  [10745b16] Statistics v1.10.0
  [fa267f1f] TOML v1.0.3
  [8dfed614] Test
  [cf7118a7] UUIDs
  [4ec0a83e] Unicode
  [e66e0078] CompilerSupportLibraries_jll v1.1.0+0
  [4536629a] OpenBLAS_jll v0.3.23+4
  [bea87d4a] SuiteSparse_jll v7.2.1+1
  [8e850b90] libblastrampoline_jll v5.8.0+1

mkitti commented 5 months ago

Script to reproduce:

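using Pkg

# Create and activate a scratch environment containing both packages
envname = "Pkg_jl_PR3871"
mkpath(envname)
Pkg.activate(envname)
Pkg.add(["ThreadsX", "DataFrames"])

# Remove the precompile cache of DataValueInterfaces (step 2 above)
cachedir = joinpath(DEPOT_PATH[1], "compiled", "v$(VERSION.major).$(VERSION.minor)", "DataValueInterfaces")
rm(cachedir; recursive=true, force=true)  # force=true: no error if the cache is already gone

# Load only ThreadsX in a fresh interactive session and watch what gets precompiled
run(`$(Base.julia_cmd()) --project=$envname -i -e "using ThreadsX"`)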

IanButterworth commented 5 months ago

Can you add the output of `st --extensions`?

mkitti commented 5 months ago
(jar) pkg> st
Status `~/jar/Project.toml`
  [a93c6f00] DataFrames v1.6.1
  [ac1d9e8a] ThreadsX v0.1.12

(jar) pkg> st --extensions
Status `~/jar/Project.toml`

(jar) pkg> st -m --extensions
Status `~/jar/Manifest.toml`
  [7d9f7c33] Accessors v0.1.36
              ├─ AccessorsIntervalSetsExt [IntervalSets]
              ├─ AccessorsStructArraysExt [StructArrays]
              ├─ AccessorsStaticArraysExt [StaticArrays]
              ├─ AccessorsAxisKeysExt [AxisKeys]
              └─ AccessorsUnitfulExt [Unitful]
  [79e6a3ab] Adapt v4.0.4
              └─ AdaptStaticArraysExt [StaticArrays]
  [198e06fe] BangBang v0.4.1
              ├─ BangBangChainRulesCoreExt [ChainRulesCore]
              ├─ BangBangStaticArraysExt [StaticArrays]
              ├─ BangBangTypedTablesExt [TypedTables]
              ├─ BangBangStructArraysExt [StructArrays]
              ├─ BangBangDataFramesExt [DataFrames]
              └─ BangBangTablesExt [Tables]
  [34da2185] Compat v4.14.0
              └─ CompatLinearAlgebraExt [LinearAlgebra]
  [a33af91c] CompositionsBase v0.1.2
              └─ CompositionsBaseInverseFunctionsExt [InverseFunctions]
  [187b0558] ConstructionBase v1.5.5
              ├─ ConstructionBaseStaticArraysExt [StaticArrays]
              └─ ConstructionBaseIntervalSetsExt [IntervalSets]
  [3587e190] InverseFunctions v0.1.13
              └─ DatesExt [Dates]
  [28d57a85] Transducers v0.4.81
              ├─ TransducersDataFramesExt [DataFrames]
              ├─ TransducersOnlineStatsBaseExt [OnlineStatsBase]
              ├─ TransducersBlockArraysExt [BlockArrays]
              ├─ TransducersReferenceablesExt [Referenceables]
              └─ TransducersLazyArraysExt [LazyArrays]

IanButterworth commented 5 months ago

This happens because ThreadsX depends on Transducers, which has the extension TransducersDataFramesExt. When we collect all the deps of ThreadsX, we include any extensions that could be loaded in the env, and those extensions bring their trigger packages (here, DataFrames) into the list.
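
The collection step described above can be illustrated with a toy model. This is only a sketch under stated assumptions, not Pkg.jl's actual implementation: the `deps` and `exts` tables and the `precompile_targets` function are hypothetical names, and the graph is trimmed to the few packages named in this thread.

```julia
# Toy model of the behaviour described above. NOT Pkg.jl's real code:
# `deps`, `exts` and `precompile_targets` are hypothetical names.

# package => direct dependencies (heavily trimmed, for illustration only)
deps = Dict(
    "ThreadsX"    => ["Transducers"],
    "Transducers" => String[],
    "DataFrames"  => String[],
)

# parent package => [(extension name, trigger packages)]
exts = Dict("Transducers" => [("TransducersDataFramesExt", ["DataFrames"])])

function precompile_targets(root, deps, exts)
    targets = String[]
    stack = [root]
    while !isempty(stack)
        pkg = pop!(stack)
        pkg in targets && continue
        push!(targets, pkg)
        append!(stack, get(deps, pkg, String[]))
        for (ext, triggers) in get(exts, pkg, Tuple{String,Vector{String}}[])
            # An extension is included as soon as all of its triggers exist
            # in the environment, whether or not they are loaded.
            if all(t -> haskey(deps, t), triggers)
                push!(targets, ext)
                append!(stack, triggers)  # the triggers join the list too
            end
        end
    end
    return targets
end

targets = precompile_targets("ThreadsX", deps, exts)
println(targets)
```

In this model, DataFrames ends up in the list for ThreadsX only because it is a trigger of an extension of Transducers and is present in the environment.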

We could narrow it to just the extensions that would be loaded given the currently loaded modules. If DataFrames isn't currently loaded, we wouldn't precompile TransducersDataFramesExt, which means DataFrames wouldn't be added to the list.
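
A minimal sketch of that narrowing, assuming the decision can be made from the names of the loaded modules alone. `loaded_names` and `ext_needed` are hypothetical helpers, not Pkg.jl functions, and `Base.loaded_modules` is an internal, non-public table that may change between Julia versions.

```julia
# Hypothetical helpers illustrating the proposed narrowing; not Pkg.jl code.

# Names of the modules loaded in the current session.
# Base.loaded_modules is an internal Dict keyed by Base.PkgId.
loaded_names() = Set(id.name for id in keys(Base.loaded_modules))

# Precompile an extension only if every one of its triggers is already loaded.
ext_needed(triggers; loaded=loaded_names()) = all(t -> t in loaded, triggers)

# DataFrames not loaded: TransducersDataFramesExt would be skipped.
println(ext_needed(["DataFrames"]; loaded=Set(["ThreadsX", "Transducers"])))

# DataFrames loaded: the extension (and so DataFrames) stays in the list.
println(ext_needed(["DataFrames"]; loaded=Set(["ThreadsX", "Transducers", "DataFrames"])))
```

The `loaded` keyword is there so the predicate can be tried without installing any of the packages; by default it inspects the running session.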