JuliaLang / julia

The Julia Programming Language
https://julialang.org/
MIT License
45.68k stars 5.48k forks source link

cycle creation issue with new weakdeps code #48533

Closed vtjnash closed 1 year ago

vtjnash commented 1 year ago

Originally posted by me in https://github.com/JuliaLang/julia/issues/48513#issuecomment-1416774301. Now making this an issue, since I think this really will need to be release blocking, as seems like possibly a significant reliability issue and heisenbug that is creeping upon the ecosystem.

Originally, I had that as the implementation and you went via Parent.Other to get dependencies. That however excluded the case where you wanted to load a package in an extension that was not a trigger nor was it loaded into Parent. This seemed to be a quite common desire so I allowed that. -Kristoffer

We observed recently that this should never have been allowed. It would be very useful, if it was valid. But it creates a cycle in the loading graph, which leads to unpredictable deadlocks, and causing the extension to sometimes hang (v1.9) or error (master today). It turns out that we must forbid this (for now), to prevent such unreliable behavior from catching users unawares and making PkgEval unreliable. We can re-evaluate later if we want to design a solution for it to reallow it later. Of note, it also must be forbidden (aka strongly discouraged) from loading those extra, unexpected packages via other mechanisms (e.g. during __init__ or Requires.jl too) for the same reason that they will sometimes cause it to deadlock or error unpredictably.

KristofferC commented 1 year ago

We could identify possible cycles from looking at the environment and warn/error on that?

It's already possible to create cycles in the dependency graph today (which will deadlock) so what is special about this case? Is it the factt that it depends on the order of the packages getting loaded and which package ends up triggering the actual loading of the extension?

vtjnash commented 1 year ago

Yes, with other cycles, it will always deadlock (and now error), which prevents people from doing it previously accidentally. For weakdeps though, now I think it can corrupt the compilecache directory, breaking Pkg.precompile(), and some users who load the package will see errors. What is subtly different here is that we release the package_locks just before loading weakdeps. This obscures that edge, so that it does not get correctly detected by the cycle detection. We did not do that before, since we would run any user code related to the package before dropping that lock / edge, but now we run weakdeps after dropping that lock.

KristofferC commented 1 year ago

I wonder if the problems that we've seen hasn't been from https://github.com/JuliaLang/julia/pull/48558. Are there any known reproducers of issues now with that fixed?

vtjnash commented 1 year ago

Yes, we know conclusively #48535 is required

KristofferC commented 1 year ago

Is there a concrete example that shows this and shows it is worse for extensions over normal packages? It would be good to have to e.g. create a test. The original motivation from Oscar seems to have been fully fixed AFAIU.

If moving the extension into the parent has the same cycle issue, I don't see this as being worse than the normal cycles one can currently create.

So (looking at https://github.com/JuliaLang/julia/pull/48513#issue-1570364168) a cycle C -> A -> Bext -> C is the same as C -> A′ -> C where A′ = [A, Bext] (Bext moved from an extension into a submodule of A) which already gives you an issue.

oscardssmith commented 1 year ago

@KristofferC I think you are right. That said, I still think it is somewhat problematic that package extensions don't have a place to declare dependencies. It is very useful to make C -> A -> Bext -> C work, and it could if we knew ahead of time that Bext needs to load after C (without making Bext require C to trigger).

timholy commented 1 year ago

If you need a test case that triggers this with high reliability, I see this apparently every time with julia --project in .julia/dev/Cthulhu followed by

julia> using Cthulhu
pkg> activate --temp
pkg> add ApproxFun
julia> using ApproxFun
KristofferC commented 1 year ago

Cannot repro on master nor 1.9.0-rc1.

This is on rc1:

~/.julia/dev/Cthulhu.jl master
❯ julia --project 

julia> using Cthulhu

julia> Base.EXT_DORMITORY
Dict{Base.PkgId, Vector{Base.ExtensionId}}()

(Cthulhu) pkg> activate --temp
  Activating new project at `/tmp/jl_0Z0S2o`

(jl_0Z0S2o) pkg> add ApproxFun
    Updating registry at `~/.julia/registries/General.toml`
   Resolving package versions...
    Updating `/tmp/jl_0Z0S2o/Project.toml`
  [28f2ccd6] + ApproxFun v0.13.15
    Updating `/tmp/jl_0Z0S2o/Manifest.toml`
  [621f4979] + AbstractFFTs v1.3.1
...

julia> using ApproxFun

julia> Base.EXT_DORMITORY
Dict{Base.PkgId, Vector{Base.ExtensionId}} with 2 entries:
  InverseFunctions [3587e190-3f89-42d0-90ee-14403ec27112]   => [ExtensionId(LogExpFunctionsInverseFunctionsExt [1e5f9c58-a1…
  ChangesOfVariables [9e997f8a-9a97-42d5-a9f1-ce6bfc15e2c0] => [ExtensionId(LogExpFunctionsChangesOfVariablesExt [d47b78d0-…

julia> filter(x -> endswith(string(x[2]), "Ext"), Base.loaded_modules)
Dict{Base.PkgId, Module} with 4 entries:
  LogExpFunctionsChainRulesCoreExt [1bf5f11d-9a0a-5d25-85d0-d1d9a28a239c]  => LogExpFunctionsChainRulesCoreExt
  AbstractFFTsChainRulesCoreExt [07c0d231-1838-56d6-9ec8-835e5b9b958e]     => AbstractFFTsChainRulesCoreExt
  CompatLinearAlgebraExt [dbe5ba0b-aecc-598a-a867-79051b540f49]            => CompatLinearAlgebraExt
  SpecialFunctionsChainRulesCoreExt [9eb7bdd4-e44c-55fc-b9cc-1a32cb715188] => SpecialFunctionsChainRulesCoreExt

Does https://github.com/JuliaLang/julia/pull/48674 help in your case?

timholy commented 1 year ago

Hmm, interesting, I'm running 1.9rc1 and I got huge lists of

[ Info: Skipping precompilation since __precompile__(false). Importing FastTransforms [057dd010-8810-581a-b7be-e3fc3b93f78c].
[ Info: Precompiling ToeplitzMatrices [c751599d-da0a-543b-9d20-d0a503d91d24]
┌ Warning: Module DSP with build ID ffffffff-ffff-ffff-0000-04e276d461a9 is missing from the cache.
│ This may mean DSP [717857b8-e6f2-59f4-9121-6e50c889abd2] does not support precompilation but is imported by a module that does.
└ @ Base loading.jl:1752

(It goes on and on and on...)

Perhaps this is quite important:

tim@diva:~/.julia/dev/Cthulhu$ cat LocalPreferences.toml
[SnoopPrecompile]
skip_precompile = ["Cthulhu"]

Another thing that may be surely is very important: I noticed my Cthulhu Project.toml hadn't been resolved in a while:

(Cthulhu) pkg> st
Project Cthulhu v2.8.5
Status `~/.julia/dev/Cthulhu/Project.toml`
⌃ [da1fd8a2] CodeTracking v1.2.1
  [1eca21be] FoldingTrees v1.2.1
  [70703baa] JuliaSyntax v0.3.2
  [21216c6a] Preferences v1.3.0
  [66db9d55] SnoopPrecompile v1.0.3
  [d265eb64] TypedSyntax v1.0.8 `TypedSyntax`
  [b77e0a4c] InteractiveUtils
  [3fa0cd96] REPL
  [cf7118a7] UUIDs
  [4ec0a83e] Unicode
Info Packages marked with ⌃ have new versions available and may be upgradable.

(Cthulhu) pkg> resolve
    Updating `~/.julia/dev/Cthulhu/Project.toml`
  [d265eb64] ~ TypedSyntax v1.0.8 `TypedSyntax` ⇒ v1.0.12 `TypedSyntax`
    Updating `~/.julia/dev/Cthulhu/Manifest.toml`
  [d265eb64] ~ TypedSyntax v1.0.8 `TypedSyntax` ⇒ v1.0.12 `TypedSyntax`

After I did that, the next two times I tried it, I got no such issues.

Does https://github.com/JuliaLang/julia/pull/48674 help in your case?

I haven't yet tried, but I may get to it later today.

KristofferC commented 1 year ago

Hmm, so what made you think it was related to this issue?

timholy commented 1 year ago

A gremlin?

lmiq commented 4 months ago

What happened specifically here?

From this: https://github.com/JuliaLang/julia/issues/48533

It seems that one needs to load dependencies of the extension that are also dependencies of the main package with using, for example:

module Extension
    MainPkg.StaticArrays
    ...
end

Is that the case? There is nothing in the docs mentioning this specifically (not even in the dev docs), and not doing so doesn't raise any error or warnings.