JuliaGraphs / Graphs.jl

An optimized graphs package for the Julia programming language
http://juliagraphs.org/Graphs.jl/
Other
468 stars 93 forks source link

Remove dependence on `StaticArrays` #244

Open jlapeyre opened 1 year ago

jlapeyre commented 1 year ago

Almost 80% of the time spent loading Graphs.jl is for loading StaticArrays.

Manifest.toml shows that the dependency path is QuantumDAGs $\to$ Graphs $\to$ ArnoldiMethod $\to$ StaticArrays.

ArnoldiMethod is imported into Graphs with using ArnoldiMethod so it's hard to track down where it is used (There is probably an easy way to find which symbols have been imported from a module, but I don't recall.) I tried commenting out using Arnoldi in src/Graphs.jl, but all tests still pass. I did find uses of partialeigen, partialschur, and LM.

I guess that either ArnoldiMethod could be made a weak dependency of Graphs, or StaticArrays could be made a weak dependency of ArnoldiMethod. [EDIT: This is speculation, they could be used internally for performance]

This is breaking unfortunately so it would have to wait for version 2.0. ... OTOH Maybe keeping ArnoldiMethod a hard dependency, but autoloading it would be possible.

 @time @time_imports using Graphs
     12.2 ms  MacroTools
      2.1 ms  SimpleTraits
      2.6 ms  StaticArraysCore
    575.4 ms  StaticArrays
      3.1 ms  ArnoldiMethod
      0.1 ms  Compat
      0.1 ms  Compat → CompatLinearAlgebraExt
      0.5 ms  Inflate
      7.0 ms  OrderedCollections
     67.2 ms  DataStructures
     37.7 ms  Graphs
  0.741944 seconds (1.68 M allocations: 124.559 MiB, 9.88% gc time, 2.79% compilation time)

julia> 575  / 741
0.7759784075573549
simonschoelly commented 1 year ago

Interesting - as far as I know, ArnoldiMethod is mostly used here for spectral methods - i.e. calculating the eigenvalues of adjacency and laplacian matrices. If we had another (maybe more light weight) package for calculating eigenvalues of sparse matrices, then maybe we could replace the dependency.

Or we could also ask upstream (i.e. in the StaticArrays.jl repository) if there was a way to reduce the loading time.

But I am also wondering if the loading time of this package is really that bad - do you have a use case, where you would wish for faster loading times?

jlapeyre commented 1 year ago

I have an application that has quick loading dependencies. So this indirect dependency on StaticArrays, which may well be unnecessary, is still about 80% of the load time. But I'm not concerned here about my particular application, rather the entire Julia ecosystem. Load times are a big pain point in Julia and Graphs is a very common dependency. One might think 1/2 s is not so bad. But the load times for packages accumulate. As a result people have made a great effort to reduce load times. One might also think "but load times are getting better". But they are getting better precisely because people have been identifying and attacking issues such as this one.

One question is whether ArnoldiMethods is using StaticArrays in a subcomputation to increase performance of calculations on Matrix (dynamically allocated) input. Or is it using StaticArrays to support input of types in StaticArrays. I suspect it's the latter. In the latter case it's easier to separate this specialized code.

Fortunately, Graphs does not expose anything in ArnoldiMethods, so it may be possible to replace it without breaking the API.

Or we could also ask upstream (i.e. in the StaticArrays.jl repository) if there was a way to reduce the loading time.

StaticArrays has been a major offender in slow load times (the 1/2 s time I report is for Julia 1.9, which is a vast improvement over the past few point releases) There has been a lot of discussion and work on trying to deal with this. There are ways to reduce the loading time. For example StaticArraysCore.jl provides the definitions and loads in 3ms (on my platform). But ArnolidMethods does not take advantage of this. There is also the new [extensions] feature in Julia 1.9.

thchr commented 1 year ago

I looked at this before with the same idea: unfortunately, the StaticArrays.jl dependency of ArnoldiMethod.jl does not just appear to be a simple extension (and so cannot just be converted to a weak dependency). Specifically, it seems ArnoldiMethod.jl is using SMatrixes as an optimization in the generic partialschur function.

One example call chain is like this: partialschur(A; ...)_partialschursortschur!swap! → {swap12!, swap22!, etc.} → @SMatrix → linear algebra.

In principle, this optimization could be dropped in ArnoldiMethod.jl (presumably with a performance loss, I don't know) or the relevant bits of StaticArrays.jl that are used could be copied over to ArnoldiMethod.jl. Both those options appear quite bad though: e.g., in the latter case, it would entail a bunch of code-duplication and reduce maintainability. [StaticArraysCore.jl is not an option since it essentially just contains type definitions]

I think the best way to reduce the impact of sluggish StaticArrays.jl loading time is to improve the StaticArrays.jl loading time. There's some discussion in https://github.com/JuliaArrays/StaticArrays.jl/issues/1074 but it is not my impression that there was ever a very concerted effort to improve things. It would e.g. be interesting to see which parts of the StaticArrays.jl code is responsible for the majority of loading time.

jlapeyre commented 1 year ago

the StaticArrays.jl dependency of ArnoldiMethod.jl does not just appear to be a simple extension

I was afraid of that.

Regarding that issue, it looks like some invalidations disappeared do to external changes. But no one posted new snoop compile results. EDIT: I just checked, there are still a lot of invalidations.

julia> using SnoopCompileCore;

julia> invalidations = @snoopr using StaticArrays;

julia> using SnoopCompile;

julia> length(invalidations)
1153

julia> length(invalidation_trees(invalidations))
12

The load time of StaticArrays is much better than it was a year or so ago. Just now on v1.9-rc2 I get about 0.35s load time.

But, if StaticArrays were removed, Graphs.jl would load in about 0.1s. This would make it a much lighter-weight dependency.

jlapeyre commented 1 year ago

I think the best way to reduce the impact of sluggish StaticArrays.jl loading time is to improve the StaticArrays.jl loading time.

I opened this issue with the assumption that we are stuck with slow load times for StaticArrays.

It seems that StaticArrays.jl is very much on the radar for people trying to reduce latencies in Julia. I noticed that @thchr has been involved in discussion of attempts to fix this on slack. Although it may take some time, I now guess that the latency associated with StaticArrays will be greatly reduced some time before the very distant future.