JuliaLang / julia

The Julia Programming Language
https://julialang.org/
MIT License

v1.11 generates 50% larger cache files #53570

Open joa-quim opened 7 months ago

joa-quim commented 7 months ago

The compile time is quite variable, but the differences are comparable to this.

  | | |_| | | | (_| |  |  Version 1.10.2 (2024-03-01)
 _/ |\__'_|_|_|\__'_|  |  Official https://julialang.org/ release
|__/                   |

julia> cd("c:/v"); @time using GMT
Precompiling GMT
  1 dependency successfully precompiled in 48 seconds. 87 already precompiled.
 49.657821 seconds (4.51 M allocations: 328.234 MiB, 0.21% gc time, 1.84% compilation time)

  | | |_| | | | (_| |  |  Version 1.11.0-alpha1 (2024-03-01)
 _/ |\__'_|_|_|\__'_|  |  Official https://julialang.org/ release
|__/                   |

julia> cd("c:/v"); @time using GMT
Precompiling GMT
  1 dependency successfully precompiled in 60 seconds. 112 already precompiled.
 61.731323 seconds (4.22 M allocations: 263.147 MiB, 0.35% gc time, 1.69% compilation time: 15% of which was recompilation)

Cache file size: v1.11 -> ~89.5 MB, v1.10 -> ~59.5 MB
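For reference, the "50% larger" figure in the title follows directly from the two approximate sizes quoted above. A minimal sketch (the variable names are illustrative, and the inputs are the rounded values from this comment, not fresh measurements):

```julia
# Approximate cache file sizes in MB, as reported in this comment.
size_v110 = 59.5
size_v111 = 89.5

# Relative growth of the cache file going from v1.10 to v1.11.
growth = size_v111 / size_v110 - 1

println("cache growth: ", round(100 * growth; digits = 1), " %")
```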

Load times

  | | |_| | | | (_| |  |  Version 1.10.2 (2024-03-01)
 _/ |\__'_|_|_|\__'_|  |  Official https://julialang.org/ release
|__/                   |

julia> @time_imports using GMT
               ┌ 2.7 ms SuiteSparse_jll.__init__()
     25.6 ms  SuiteSparse_jll 85.64% compilation time
               ┌ 5.0 ms SparseArrays.CHOLMOD.__init__() 98.93% compilation time
    123.1 ms  SparseArrays 3.99% compilation time
      0.7 ms  Statistics
      0.2 ms  DataValueInterfaces
      0.6 ms  DataAPI
      0.2 ms  IteratorInterfaceExtensions
      0.2 ms  TableTraits
      6.3 ms  Tables
      0.2 ms  Reexport
     12.1 ms  Preferences
      0.3 ms  PrecompileTools
      5.7 ms  StringManipulation
     10.3 ms  Crayons
      0.6 ms  LaTeXStrings
     63.7 ms  PrettyTables
               ┌ 17.4 ms GMT.Gdal.__init__()
               ├ 16.6 ms GMT.__init__()
    319.5 ms  GMT

  | | |_| | | | (_| |  |  Version 1.11.0-alpha1 (2024-03-01)
 _/ |\__'_|_|_|\__'_|  |  Official https://julialang.org/ release
|__/                   |

julia> @time_imports using GMT
      0.7 ms  Statistics
      0.3 ms  DataValueInterfaces
      0.5 ms  DataAPI
      0.2 ms  IteratorInterfaceExtensions
      0.2 ms  TableTraits
      6.8 ms  Tables
      0.4 ms  Reexport
      8.9 ms  Preferences
      0.4 ms  PrecompileTools
      5.8 ms  StringManipulation
     11.7 ms  Crayons
      0.6 ms  LaTeXStrings
     69.4 ms  PrettyTables
               ┌ 18.5 ms GMT.Gdal.__init__()
               ├ 108.3 ms GMT.__init__() 88.15% compilation time (100% recompilation)
    814.4 ms  GMT 54.63% compilation time (59% recompilation)

JeffBezanson commented 7 months ago

Looks like, partly, new invalidations in GMT.__init__()? (and therefore possibly in other things too)

JeffBezanson commented 7 months ago

Maybe similar to #53511 ?

KristofferC commented 5 months ago

Would be interesting to put back all stdlibs into the sysimage and re-time this to see if the effect is purely for moving out stdlibs or if there are other reasons as well.

mkitti commented 5 months ago

> Would be interesting to put back all stdlibs into the sysimage and re-time this to see if the effect is purely for moving out stdlibs or if there are other reasons as well.

We should consider building this and releasing it as an artifact. It would be useful at least for testing, and for some load-time-sensitive applications as well.

jaakkor2 commented 5 months ago

Worst offender in the precompiled size I have seen is https://github.com/Gnimuc/GLTF.jl. v1.11.0-beta1 (211 MB) is about 5x bigger than v1.10.3 (43 MB).

KristofferC commented 5 months ago

As you showed in https://github.com/quinnj/JSON3.jl/issues/279, the issue there seems to be egregious use of `@inline` on large functions. It isn't obvious why that would change between 1.10 and 1.11, but one reason could be that we are better at precompiling now, so more of the functions (which are very big due to `@inline`) get saved to the image file.

KristofferC commented 5 months ago

The load time seems to be more or less fixed on the 1.11 backport branch. This is using the master branch of GMT.jl:

julia> @time using GMT
  0.638436 seconds (668.47 k allocations: 51.051 MiB, 5.54% gc time, 2.08% compilation time)

julia> VERSION
v"1.10.3"

julia> @time using GMT
  0.657463 seconds (730.44 k allocations: 49.958 MiB, 1.27% gc time, 1.48% compilation time)

julia> VERSION
v"1.11.0-beta1.40"

joa-quim commented 5 months ago

But the precompiled image is still ~50% larger.
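A minimal sketch for checking the on-disk cache size under each Julia version, assuming the standard depot layout `<depot>/compiled/vMAJOR.MINOR/<Package>` seen in the paths quoted in this thread. The helper names `cache_sizes` and `default_cache_dir` are hypothetical, not part of any package:

```julia
# List the files in a compiled-cache directory with their sizes in MiB,
# largest first. Only public Base functions are used.
function cache_sizes(dir::AbstractString)
    files = filter(isfile, readdir(dir; join = true))
    sizes = [basename(f) => filesize(f) / 2^20 for f in files]
    return sort(sizes; by = last, rev = true)
end

# Assumption: the package cache lives in the first depot, under
# compiled/v<major>.<minor>/<pkg>.
default_cache_dir(pkg::AbstractString) =
    joinpath(first(DEPOT_PATH), "compiled", "v$(VERSION.major).$(VERSION.minor)", pkg)

dir = default_cache_dir("GMT")
if isdir(dir)
    for (name, mib) in cache_sizes(dir)
        println(rpad(name, 30), round(mib; digits = 2), " MiB")
    end
else
    println("no cache directory found at ", dir)
end
```

Running the same snippet under 1.10 and 1.11 gives directly comparable numbers for both the `.ji` file and the native package image.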

mkitti commented 5 months ago

Try https://github.com/timholy/PkgCacheInspector.jl

joa-quim commented 5 months ago

> Try https://github.com/timholy/PkgCacheInspector.jl

I did some time ago and spent quite some effort trying to improve the situation (I have forgotten many of the details). But now I can't even use that package anymore.

  | | |_| | | | (_| |  |  Version 1.10.0 (2023-12-25)
 _/ |\__'_|_|_|\__'_|  |  Official https://julialang.org/ release
|__/                   |

julia> cd("c:/v"); @time using GMT
Precompiling GMT
  1 dependency successfully precompiled in 44 seconds. 87 already precompiled.
 46.397674 seconds (4.63 M allocations: 340.133 MiB, 0.28% gc time, 1.79% compilation time)

julia> using PkgCacheInspector

julia> info_cachefile("GMT")
ERROR: Error reading package image file.

  | | |_| | | | (_| |  |  Version 1.11.0-beta1 (2024-04-10)
 _/ |\__'_|_|_|\__'_|  |  Official https://julialang.org/ release
|__/                   |

julia> using PkgCacheInspector

julia> info_cachefile("GMT")
ERROR: MethodError: no method matching parse_cache_header(::IOStream)
The function `parse_cache_header` exists, but no method is defined for this combination of argument types.

But note that this cache size issue is common to other packages. For example, the Makie cache is 60% larger on 1.11 than on 1.10.

fatteneder commented 5 months ago

That last error is due to #49866, which changed the signature of an internal method that PkgCacheInspector.jl relies on. Should be straightforward to fix, will make a PR later.

KristofferC commented 5 months ago

1.11:

Contents of /Users/kristoffercarlsson/.julia/compiled/v1.11/GMT/EoU0j_u25Qm.dylib:
  modules: Any[GMT.Gdal, GMT.Drawing, GMT]
  init order: Any[GMT.Gdal, GMT]
  1503 external methods
  22730 new specializations of external methods (Base 80.0%, Base.Broadcast 14.5%, Base.Iterators 2.5%, ...)
  1371 external methods with new roots
  36342 external targets
  28454 edges
  file size:   92678656 (88.385 MiB)
  Segment sizes (bytes):
  system:      25084036 ( 29.27%)
  isbits:      57449436 ( 67.04%)
  symbols:       149474 (  0.17%)
  tags:          410413 (  0.48%)
  relocations:  2523569 (  2.94%)
  gvars:          44312 (  0.05%)
  fptrs:          33536 (  0.04%)

1.10:

Contents of /Users/kristoffercarlsson/.julia/compiled/v1.10/GMT/EoU0j_Vg0I0.dylib:
  modules: Any[GMT.Gdal, GMT.Drawing, GMT]
  init order: Any[GMT.Gdal, GMT]
  1503 external methods
  40014 new specializations of external methods (Base 72.8%, Base.Broadcast 14.0%, GMT 5.9%, ...)
  1157 external methods with new roots
  31428 external targets
  24264 edges
  file size:   56215744 (53.612 MiB)
  Segment sizes (bytes):
  system:      18509732 ( 36.03%)
  isbits:      30370140 ( 59.12%)
  symbols:       150178 (  0.29%)
  tags:          271567 (  0.53%)
  relocations:  1986075 (  3.87%)
  gvars:          50072 (  0.10%)
  fptrs:          30080 (  0.06%)

Haven't done any more analysis than that. Seems a bit strange that 1.11 has way fewer specializations but still larger file size.
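To make the comparison easier to read, here is a small sketch that works only from the byte counts in the two reports above (nothing is re-measured). It prints, per segment, the growth factor from 1.10 to 1.11 and how much of the total file growth that segment accounts for. The file size also includes content outside the listed segments, so the shares need not sum to 100%:

```julia
# Byte counts copied from the two PkgCacheInspector reports above.
v111 = (file = 92678656, system = 25084036, isbits = 57449436, relocations = 2523569)
v110 = (file = 56215744, system = 18509732, isbits = 30370140, relocations = 1986075)

# Total growth of the cache file in bytes.
total_growth = v111.file - v110.file

for seg in (:system, :isbits, :relocations)
    delta = v111[seg] - v110[seg]      # absolute growth of this segment
    ratio = v111[seg] / v110[seg]      # growth factor of this segment
    share = delta / total_growth       # fraction of the file growth it explains
    println(rpad(seg, 12), " x", round(ratio; digits = 2),
            "  (", round(100 * share; digits = 1), "% of file growth)")
end
```

On these numbers the isbits segment nearly doubles and accounts for most of the increase, even though the count of new specializations drops from 40014 to 22730.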

fatteneder commented 5 months ago

> Try https://github.com/timholy/PkgCacheInspector.jl
>
> I did some time ago and spent quite some effort trying to improve the situation (forgot many of the details). But now I can't even use that package anymore.

@joa-quim Please update PkgCacheInspector.jl to v1.0.1 and try again. https://github.com/JuliaRegistries/General/pull/106291

joa-quim commented 5 months ago

Thanks. It works now ... but only once. Anyway, I don't know how to use the info to understand what makes the cache file so big.

julia> using PkgCacheInspector

julia> x = info_cachefile("GMT")
Contents of C:\Users\j\.julia\compiled\v1.11\GMT\EoU0j_tYaDV.dll:
  modules: Any[GMT.Gdal, GMT.Drawing, GMT]
  init order: Any[GMT.Gdal, GMT]
  1503 external methods
  22890 new specializations of external methods (Base 80.4%, Base.Broadcast 14.0%, Base.Iterators 2.5%, ...)
  1384 external methods with new roots
  36769 external targets
  28713 edges
  file size:   102150656 (97.418 MiB)
  Segment sizes (bytes):
  system:      25201556 ( 29.22%)
  isbits:      57871532 ( 67.10%)
  symbols:       147304 (  0.17%)
  tags:          412695 (  0.48%)
  relocations:  2534014 (  2.94%)
  gvars:          44560 (  0.05%)
  fptrs:          33416 (  0.04%)

julia> x = info_cachefile("GMT")
ERROR: Error reading package image file.