JuliaLang / julia

The Julia Programming Language
https://julialang.org/
MIT License
45.66k stars 5.48k forks source link

Regression in startup time to full REPL #56268

Open knuesel opened 5 days ago

knuesel commented 5 days ago

The startup time to show a full REPL has significantly increased from 1.10 to 1.11.1:

$ time julia +1.10 --startup-file=no -e "using REPL; exit()"

________________________________________________________
Executed in  159.42 millis    fish           external
   usr time  101.52 millis    0.73 millis  100.79 millis
   sys time   63.40 millis    1.10 millis   62.30 millis

vs

$ time julia +1.11.1 --startup-file=no -e "using REPL; exit()"

________________________________________________________
Executed in  371.30 millis    fish           external
   usr time  638.88 millis    0.38 millis  638.50 millis
   sys time   81.87 millis    1.84 millis   80.03 millis

It's of course not a big problem, but I find the extra 200ms more annoying than expected. REPL startup used to be very snappy, now it feels a bit sluggish.

This is probably due to the REPL being out of the sysimage now. Related issues: https://github.com/julialang/julia/issues/56063 and https://github.com/julialang/julia/issues/51532

Could the latency be improved by adding a precompile statement somewhere? Currently I see this:

$ julia +1.11.1 --startup-file=no --trace-compile=stderr
               _
   _       _ _(_)_     |  Documentation: https://docs.julialang.org
  (_)     | (_) (_)    |
   _ _   _| |_  __ _   |  Type "?" for help, "]?" for Pkg help.
  | | | | | | |/ _` |  |
  | | |_| | | | (_| |  |  Version 1.11.1 (2024-10-16)
 _/ |\__'_|_|_|\__'_|  |  Official https://julialang.org/ release
|__/                   |

julia> <Ctrl+D>
precompile(Tuple{typeof(Base.print), Base.TTY, String})

(no precompilation is reported when doing the same with Julia 1.10)

My setup:

julia> versioninfo()
Julia Version 1.11.1
Commit 8f5b7ca12ad (2024-10-16 10:53 UTC)
Build Info:
  Official https://julialang.org/ release
Platform Info:
  OS: Linux (x86_64-linux-gnu)
  CPU: 8 × Intel(R) Core(TM) i7-10510U CPU @ 1.80GHz
  WORD_SIZE: 64
  LLVM: libLLVM-16.0.6 (ORCJIT, skylake)
Threads: 1 default, 0 interactive, 1 GC (on 8 virtual cores)
KristofferC commented 5 days ago

Could the latency be improved by adding a precompile statement somewhere?

Precompile statements improve execution speed, what you are measuring is loading speed.

IanButterworth commented 5 days ago

For completeness can you run this (which looks like this on my m2 macbook)

% time julia +1.11 --startup-file=no -e "using InteractiveUtils; @time @time_imports using REPL"
     33.4 ms  StyledStrings
      0.5 ms  Unicode
               ┌ 0.1 ms REPL.REPLCompletions.__init__()
               ├ 0.0 ms REPL.TerminalMenus.__init__()
               ├ 0.0 ms REPL.__init__()
     93.8 ms  REPL
  0.155336 seconds (220.77 k allocations: 13.221 MiB)
julia +1.11 --startup-file=no -e   1.16s user 0.27s system 387% cpu 0.368 total
knuesel commented 5 days ago

@KristofferC I'm confused... --trace-compile is documented as printing "precompile statements for methods compiled during execution". So the output precompile(Tuple{typeof(Base.print), Base.TTY, String}) means that Julia had to compile this method while loading my session. Presumably, it would load faster if the method was precompiled instead of being compiled during each startup?

@IanButterworth this is what I get:

$ time julia +1.11.1 --startup-file=no -e "using InteractiveUtils; @time @time_imports using REPL"
     44.2 ms  StyledStrings
      0.3 ms  Unicode
               ┌ 0.0 ms REPL.REPLCompletions.__init__()
               ├ 0.0 ms REPL.TerminalMenus.__init__()
               ├ 0.0 ms REPL.__init__()
    264.1 ms  REPL 40.88% compilation time
  0.348576 seconds (496.07 k allocations: 27.602 MiB, 40.24% compilation time)

________________________________________________________
Executed in  564.51 millis    fish           external
   usr time  817.70 millis    1.17 millis  816.52 millis
   sys time   96.13 millis    0.03 millis   96.11 millis

(I get the same when trying with 1.11 instead of 1.11.1)

IanButterworth commented 5 days ago

A lot of it does seem to be compilation time

264.1 ms  REPL 40.88% compilation time
KristofferC commented 5 days ago

So the output precompile(Tuple{typeof(Base.print), Base.TTY, String}) means that Julia had to compile this method while loading my session. Presumably, it would load faster if the method was precompiled instead of being compiled during each startup?

How much time does it take to compile that method?

knuesel commented 5 days ago

How much time does it take to compile that method?

Is there a better way to estimate this than the above? The >200 ms compilation time reported above seems to account for all the extra delay that I observe (edit: but only 40% of that is compilation).

IanButterworth commented 5 days ago

Can you try this with julia nightly (note the @trace_compile)

% time julia +nightly --startup-file=no -e "using InteractiveUtils; @time @time_imports @trace_compile using REPL"
    116.5 ms  Unicode
               ┌ 0.0 ms REPL.REPLCompletions.__init__()
               ├ 0.0 ms REPL.__init__()
    392.9 ms  REPL
  0.530033 seconds (164.78 k allocations: 9.284 MiB)
julia +nightly --startup-file=no -e   1.23s user 0.73s system 84% cpu 2.312 total
KristofferC commented 5 days ago

Is there a better way to estimate this than the above?

C:\Users\Kristoffer>julia +1.11 -e "@time precompile(Tuple{typeof(Base.print), Base.TTY, String})"
  0.002301 seconds (72 allocations: 2.859 KiB, 99.63% compilation time)
knuesel commented 5 days ago
$ time julia +nightly --startup-file=no -e "using InteractiveUtils; @time @time_imports @trace_compile using REPL"
#=    9.5 ms =# precompile(Tuple{typeof(Base.print), Base.TTY, String, String})
      0.2 ms  #=    3.1 ms =# precompile(Tuple{typeof(Base.print), Base.TTY, String})
Unicode#=   21.3 ms =# precompile(Tuple{typeof(Base.println), Base.TTY})

#=   71.2 ms =# precompile(Tuple{typeof(Core.kwcall), NamedTuple{(:bold, :italic, :underline, :blink, :reverse, :hidden, :color), Tuple{Bool, Bool, Bool, Bool, Bool, Bool, Symbol}}, typeof(Base.printstyled), Base.TTY, String})
               ┌ 0.0 ms REPL.REPLCompletions.__init__()
               ├ 0.0 ms REPL.__init__()
    273.7 ms  REPL 39.74% compilation time
  0.320709 seconds (454.99 k allocations: 24.627 MiB, 46.11% compilation time: 2% of which was recompilation)

________________________________________________________
Executed in  733.67 millis    fish           external
   usr time  973.57 millis    0.00 millis  973.57 millis
   sys time  108.50 millis    1.38 millis  107.11 millis

And running @KristofferC's command:

$ julia +1.11 -e "@time precompile(Tuple{typeof(Base.print), Base.TTY, String})"
  0.003671 seconds (72 allocations: 2.859 KiB, 99.57% compilation time)
PallHaraldsson commented 4 days ago

Does the have to do with this new REPL file https://github.com/JuliaLang/julia/blob/master/stdlib/REPL/src/Pkg_beforeload.jl

Note:

julia> @time using Pkg
  0.661828 seconds (414.84 k allocations: 28.534 MiB, 12.49% gc time, 0.92% compilation time)

vs in 1.10.5:
julia> @time using Pkg
  0.000432 seconds (494 allocations: 50.562 KiB)

..
julia> @time load_pkg()
  0.912341 seconds (562.16 k allocations: 34.953 MiB, 6.57% gc time, 0.71% compilation time)
REPLExt
IanButterworth commented 4 days ago

No, Pkg isn't loaded during startup

timholy commented 2 days ago

but only 40% of that is compilation

The remaining 60% is probably dominated by invalidation-checks upon loading. (When things are baked into the sysimage, you don't have to run those checks.) If you run a full profiler you should see much of the time spent in C functions like jl_verify_edges (in src/staticdata_utils.c).

knuesel commented 2 days ago

The remaining 60% is probably dominated by invalidation-checks upon loading

That seems right: running julia +1.11.1 --startup-file=no -e "using Profile; using InteractiveUtils; @profile using REPL; Profile.print(C=true, noisefloor=2)" gives the following profile:

Click to expand
Overhead ╎ [+additional indent] Count File:Line; Function
=========================================================
 ╎201 …pace/srcdir/glibc-2.17/csu/../sysdeps/x86_64/start.S:123;
 ╎ 201 /usr/lib/x86_64-linux-gnu/libc.so.6:?; __libc_start_main
 ╎  201 /usr/lib/x86_64-linux-gnu/libc.so.6:?;
 ╎   201 …-demeter6-6/julialang/julia-master/cli/loader_exe.c:58; main
 ╎    201 …lder-demeter6-6/julialang/julia-master/src/jlapi.c:1059; jl_repl_entrypoint
 ╎     201 …lder-demeter6-6/julialang/julia-master/src/jlapi.c:900; true_main
 ╎    ╎ 201 …der-demeter6-6/julialang/julia-master/src/julia.h:2157; jl_apply
 ╎    ╎  201 …iaup/julia-1.11.1+0.x64.linux.gnu/lib/julia/sys.so:?; jfptr__start_72144.1
 ╎    ╎   201 @Base/client.jl:531; _start()
 ╎    ╎    201 @Base/client.jl:296; exec_options(opts::Base.JLOptions)
 ╎    ╎     201 @Base/boot.jl:430; eval
 ╎    ╎    ╎ 201 …demeter6-6/julialang/julia-master/src/toplevel.c:994; ijl_toplevel_eval_in
 ╎    ╎    ╎  201 …emeter6-6/julialang/julia-master/src/toplevel.c:886; jl_toplevel_eval_flex
 ╎    ╎    ╎   201 …emeter6-6/julialang/julia-master/src/toplevel.c:886; jl_toplevel_eval_flex
 ╎    ╎    ╎    201 …emeter6-6/julialang/julia-master/src/toplevel.c:943; jl_toplevel_eval_flex
 ╎    ╎    ╎     201 …er6-6/julialang/julia-master/src/interpreter.c:821; jl_interpret_toplevel_thunk
 ╎    ╎    ╎    ╎ 197 …er6-6/julialang/julia-master/src/interpreter.c:539; eval_body
 ╎    ╎    ╎    ╎  197 …r6-6/julialang/julia-master/src/interpreter.c:629; eval_body
 ╎    ╎    ╎    ╎   197 …eter6-6/julialang/julia-master/src/toplevel.c:759; jl_toplevel_eval_flex
 ╎    ╎    ╎    ╎    197 …eter6-6/julialang/julia-master/src/toplevel.c:523; eval_import_path
 ╎    ╎    ╎    ╎     197 …ter6-6/julialang/julia-master/src/toplevel.c:486; call_require
 ╎    ╎    ╎    ╎    ╎ 197 …meter6-6/julialang/julia-master/src/julia.h:2157; jl_apply
 ╎    ╎    ╎    ╎    ╎  197 …ulia-1.11.1+0.x64.linux.gnu/lib/julia/sys.so:?; jfptr_require_69803.1
 ╎    ╎    ╎    ╎    ╎   197 @Base/loading.jl:2191; require(into::Module, mod::Symbol)
 ╎    ╎    ╎    ╎    ╎    197 @Base/essentials.jl:1086; invoke_in_world
 ╎    ╎    ╎    ╎    ╎     197 @Base/essentials.jl:1089; #invoke_in_world#3
 ╎    ╎    ╎    ╎    ╎    ╎ 197 …r6-6/julialang/julia-master/src/builtins.c:894; jl_f__call_in_world
 ╎    ╎    ╎    ╎    ╎    ╎  197 …ter6-6/julialang/julia-master/src/julia.h:2157; jl_apply
 ╎    ╎    ╎    ╎    ╎    ╎   197 …ia-1.11.1+0.x64.linux.gnu/lib/julia/sys.so:?; jfptr___require_69814.1
 ╎    ╎    ╎    ╎    ╎    ╎    197 @Base/loading.jl:2198; __require(into::Module, mod::Symbol)
 ╎    ╎    ╎    ╎    ╎    ╎     197 @Base/lock.jl:273; macro expansion
 ╎    ╎    ╎    ╎    ╎    ╎    ╎ 197 @Base/loading.jl:2241; macro expansion
 ╎    ╎    ╎    ╎    ╎    ╎    ╎  197 @Base/loading.jl:2302; _require_prelocked(uuidkey::Base.PkgId, env::String)
 ╎    ╎    ╎    ╎    ╎    ╎    ╎   197 @Base/essentials.jl:1086; invoke_in_world
 ╎    ╎    ╎    ╎    ╎    ╎    ╎    197 @Base/essentials.jl:1089; #invoke_in_world#3
 ╎    ╎    ╎    ╎    ╎    ╎    ╎     197 …/julialang/julia-master/src/builtins.c:894; jl_f__call_in_world
 ╎    ╎    ╎    ╎    ╎    ╎    ╎    ╎ 197 …-6/julialang/julia-master/src/julia.h:2157; jl_apply
 ╎    ╎    ╎    ╎    ╎    ╎    ╎    ╎  197 ….11.1+0.x64.linux.gnu/lib/julia/sys.so:?; jfptr___require_prelocked_69877.1
 ╎    ╎    ╎    ╎    ╎    ╎    ╎    ╎   197 @Base/loading.jl:2315; __require_prelocked(uuidkey::Base.PkgId, env::String)
 ╎    ╎    ╎    ╎    ╎    ╎    ╎    ╎    197 @Base/loading.jl:2450; _require(pkg::Base.PkgId, env::String)
 ╎    ╎    ╎    ╎    ╎    ╎    ╎    ╎     197 …11.1+0.x64.linux.gnu/lib/julia/sys.so:?; jfptr__require_search_from_serialized_44533.1
 ╎    ╎    ╎    ╎    ╎    ╎    ╎    ╎    ╎ 197 @Base/loading.jl:1908; kwcall(::@NamedTuple{reasons::Dict{String, Int64}}, ::typeof(Base._require_search_…
 ╎    ╎    ╎    ╎    ╎    ╎    ╎    ╎    ╎  44  @Base/loading.jl:1969; _require_search_from_serialized(pkg::Base.PkgId, sourcepath::String, build_id::UI…
 ╎    ╎    ╎    ╎    ╎    ╎    ╎    ╎    ╎   44  @Base/loading.jl:1169; _include_from_serialized
 ╎    ╎    ╎    ╎    ╎    ╎    ╎    ╎    ╎    44  @Base/loading.jl:1169; _include_from_serialized
 ╎    ╎    ╎    ╎    ╎    ╎    ╎    ╎    ╎     44  @Base/loading.jl:1204; _include_from_serialized(pkg::Base.PkgId, path::String, ocachepath::String, de…
 ╎    ╎    ╎    ╎    ╎    ╎    ╎    ╎    ╎    ╎ 44  …ang/julia-master/src/staticdata.c:3792; ijl_restore_package_image_from_file
 ╎    ╎    ╎    ╎    ╎    ╎    ╎    ╎    ╎    ╎  44  …ang/julia-master/src/staticdata.c:3701; jl_restore_incremental_from_buf
 ╎    ╎    ╎    ╎    ╎    ╎    ╎    ╎    ╎    ╎   35  …ang/julia-master/src/staticdata.c:3661; jl_restore_package_image_from_stream
 ╎    ╎    ╎    ╎    ╎    ╎    ╎    ╎    ╎    ╎    30  …ia-master/src/staticdata_utils.c:1197; jl_insert_backedges
2╎    ╎    ╎    ╎    ╎    ╎    ╎    ╎    ╎    ╎     20  …ia-master/src/staticdata_utils.c:1005; jl_verify_edges
 ╎    ╎    ╎    ╎    ╎    ╎    ╎    ╎    ╎    ╎    ╎ 18  …julialang/julia-master/src/gf.c:2336; ijl_matching_methods
 ╎    ╎    ╎    ╎    ╎    ╎    ╎    ╎    ╎    ╎    ╎  18  …julialang/julia-master/src/gf.c:3639; ml_matches
 ╎    ╎    ╎    ╎    ╎    ╎    ╎    ╎    ╎    ╎    ╎   11  …julialang/julia-master/src/gf.c:3724; ml_matches
1╎    ╎    ╎    ╎    ╎    ╎    ╎    ╎    ╎    ╎    ╎    11  …ulialang/julia-master/src/gf.c:3347; ml_mtable_visitor
 ╎    ╎    ╎    ╎    ╎    ╎    ╎    ╎    ╎  145 @Base/loading.jl:1985; _require_search_from_serialized(pkg::Base.PkgId, sourcepath::String, build_id::UI…
 ╎    ╎    ╎    ╎    ╎    ╎    ╎    ╎    ╎   145 @Base/loading.jl:1169; _include_from_serialized
 ╎    ╎    ╎    ╎    ╎    ╎    ╎    ╎    ╎    145 @Base/loading.jl:1169; _include_from_serialized
 ╎    ╎    ╎    ╎    ╎    ╎    ╎    ╎    ╎     145 @Base/loading.jl:1204; _include_from_serialized(pkg::Base.PkgId, path::String, ocachepath::String, de…
 ╎    ╎    ╎    ╎    ╎    ╎    ╎    ╎    ╎    ╎ 145 …ang/julia-master/src/staticdata.c:3792; ijl_restore_package_image_from_file
 ╎    ╎    ╎    ╎    ╎    ╎    ╎    ╎    ╎    ╎  145 …ang/julia-master/src/staticdata.c:3701; jl_restore_incremental_from_buf
 ╎    ╎    ╎    ╎    ╎    ╎    ╎    ╎    ╎    ╎   125 …ang/julia-master/src/staticdata.c:3661; jl_restore_package_image_from_stream
 ╎    ╎    ╎    ╎    ╎    ╎    ╎    ╎    ╎    ╎    99  …ia-master/src/staticdata_utils.c:1197; jl_insert_backedges
1╎    ╎    ╎    ╎    ╎    ╎    ╎    ╎    ╎    ╎     32  …ia-master/src/staticdata_utils.c:996; jl_verify_edges
1╎    ╎    ╎    ╎    ╎    ╎    ╎    ╎    ╎    ╎    ╎ 23  …lang/julia-master/src/subtype.c:4348; jl_type_intersection_env_s
 ╎    ╎    ╎    ╎    ╎    ╎    ╎    ╎    ╎    ╎    ╎  14  …lang/julia-master/src/subtype.c:2148; ijl_subtype_env
 ╎    ╎    ╎    ╎    ╎    ╎    ╎    ╎    ╎    ╎    ╎   14  …lang/julia-master/src/subtype.c:1698; forall_exists_subtype
 ╎    ╎    ╎    ╎    ╎    ╎    ╎    ╎    ╎    ╎    ╎    14  …ang/julia-master/src/subtype.c:1684; _forall_exists_subtype
 ╎    ╎    ╎    ╎    ╎    ╎    ╎    ╎    ╎    ╎    ╎     14  …ang/julia-master/src/subtype.c:1653; exists_subtype
 ╎    ╎    ╎    ╎    ╎    ╎    ╎    ╎    ╎    ╎    ╎    ╎ 7   …ng/julia-master/src/subtype.c:1425; subtype
 ╎    ╎    ╎    ╎    ╎    ╎    ╎    ╎    ╎    ╎    ╎    ╎  7   …ng/julia-master/src/subtype.c:913; subtype_unionall
 ╎    ╎    ╎    ╎    ╎    ╎    ╎    ╎    ╎    ╎    ╎    ╎   6   …ng/julia-master/src/subtype.c:1462; subtype
 ╎    ╎    ╎    ╎    ╎    ╎    ╎    ╎    ╎    ╎    ╎    ╎    6   …g/julia-master/src/subtype.c:1302; subtype_tuple
 ╎    ╎    ╎    ╎    ╎    ╎    ╎    ╎    ╎    ╎    ╎    ╎     4   …g/julia-master/src/subtype.c:1223; subtype_tuple_tail
 ╎    ╎    ╎    ╎    ╎    ╎    ╎    ╎    ╎    ╎    ╎    ╎ 7   …ng/julia-master/src/subtype.c:1462; subtype
 ╎    ╎    ╎    ╎    ╎    ╎    ╎    ╎    ╎    ╎    ╎    ╎  7   …ng/julia-master/src/subtype.c:1302; subtype_tuple
 ╎    ╎    ╎    ╎    ╎    ╎    ╎    ╎    ╎    ╎    ╎    ╎   5   …ng/julia-master/src/subtype.c:1220; subtype_tuple_tail
 ╎    ╎    ╎    ╎    ╎    ╎    ╎    ╎    ╎    ╎    ╎    ╎    4   …g/julia-master/src/subtype.c:2148; ijl_subtype_env
 ╎    ╎    ╎    ╎    ╎    ╎    ╎    ╎    ╎    ╎    ╎    ╎     4   …g/julia-master/src/subtype.c:1698; forall_exists_subtype
 ╎    ╎    ╎    ╎    ╎    ╎    ╎    ╎    ╎    ╎    ╎    ╎    ╎ 4   …/julia-master/src/subtype.c:1684; _forall_exists_subtype
 ╎    ╎    ╎    ╎    ╎    ╎    ╎    ╎    ╎    ╎    ╎    ╎    ╎  4   …/julia-master/src/subtype.c:1653; exists_subtype
1╎    ╎    ╎    ╎    ╎    ╎    ╎    ╎    ╎    ╎     65  …ia-master/src/staticdata_utils.c:1005; jl_verify_edges
 ╎    ╎    ╎    ╎    ╎    ╎    ╎    ╎    ╎    ╎    ╎ 64  …julialang/julia-master/src/gf.c:2336; ijl_matching_methods
 ╎    ╎    ╎    ╎    ╎    ╎    ╎    ╎    ╎    ╎    ╎  64  …julialang/julia-master/src/gf.c:3639; ml_matches
 ╎    ╎    ╎    ╎    ╎    ╎    ╎    ╎    ╎    ╎    ╎   35  …julialang/julia-master/src/gf.c:3724; ml_matches
 ╎    ╎    ╎    ╎    ╎    ╎    ╎    ╎    ╎    ╎    ╎    35  …ulialang/julia-master/src/gf.c:3347; ml_mtable_visitor
 ╎    ╎    ╎    ╎    ╎    ╎    ╎    ╎    ╎    ╎    ╎     13  …ang/julia-master/src/typemap.c:770; jl_typemap_intersection_visitor
 ╎3   [unknown stackframe]
Total snapshots: 204. Utilization: 100% across all threads and tasks. Use the `groupby` kwarg to break down by thread and/or task.

That's 117 counts in jl_verify_edges out of 201 (so 58%).

Something I don't understand: how come we don't all see the compilation happening when running julia +1.11.1 --startup-file=no -e "using InteractiveUtils; @time @time_imports using REPL" ? Why would it depend on a particular installation if we use the same Julia version?

KristofferC commented 2 days ago

It is strange to me how in

❯ julia +1.11.1  --startup-file=no -e "using InteractiveUtils; @time_imports @time using REPL" 
     30.6 ms  StyledStrings
      0.2 ms  Unicode
               ┌ 0.0 ms REPL.REPLCompletions.__init__() 
               ├ 0.0 ms REPL.TerminalMenus.__init__() 
               ├ 0.0 ms REPL.__init__() 
    143.7 ms  REPL 41.01% compilation time
  0.197839 seconds (499.18 k allocations: 27.652 MiB, 38.06% compilation time)

❯ julia +1.11.1  --startup-file=no -e "using InteractiveUtils; @time using REPL"
  0.121936 seconds (220.31 k allocations: 13.086 MiB)

the @time macro reports different compilation time (and different load time) depending on if @time_imports is used or not

IanButterworth commented 2 days ago

I suspect some TTY print precompiles aren't sticking for some people. I'd investigate but I'm not one of those people..

% julia +1.11  --startup-file=no -e "using InteractiveUtils; @time_imports @time using REPL"
     35.3 ms  StyledStrings
      0.5 ms  Unicode
               ┌ 0.1 ms REPL.REPLCompletions.__init__()
               ├ 0.0 ms REPL.TerminalMenus.__init__()
               ├ 0.0 ms REPL.__init__()
     99.4 ms  REPL
  0.165016 seconds (219.49 k allocations: 13.188 MiB)
% julia +1.11  --startup-file=no -e "using InteractiveUtils; @time_imports @time using REPL"
     30.9 ms  StyledStrings
      0.4 ms  Unicode
               ┌ 0.0 ms REPL.REPLCompletions.__init__()
               ├ 0.0 ms REPL.TerminalMenus.__init__()
               ├ 0.0 ms REPL.__init__()
     82.3 ms  REPL
  0.126568 seconds (219.49 k allocations: 13.190 MiB)
% julia +1.11  --startup-file=no -e "using InteractiveUtils; @time using REPL"
  0.147592 seconds (218.79 k allocations: 13.145 MiB)
IanButterworth commented 2 days ago

Also just noting that on nightly you can now time all stdlib loads.

% julia +nightly  --startup-file=no -e "Base.@time_imports @time using REPL"
     35.1 ms  StyledStrings
               ┌ 0.0 ms JuliaSyntaxHighlighting.__init__()
     10.4 ms  JuliaSyntaxHighlighting
      0.6 ms  Base64
               ┌ 0.0 ms Markdown.__init__()
      7.8 ms  Markdown
      2.2 ms  InteractiveUtils
      0.3 ms  Unicode
               ┌ 0.0 ms REPL.REPLCompletions.__init__()
               ├ 0.0 ms REPL.__init__()
     86.6 ms  REPL
  0.154517 seconds (286.40 k allocations: 16.344 MiB)

And all the trace stuff

% julia +nightly  --startup-file=no -e "Base.@time_imports Base.@trace_compile @time using REPL"
KristofferC commented 2 days ago

I'm ok on Mac, on linux I do get the weird behavior:

Linux

$ julia +nightly --trace-compile=linux --startup-file=no -e "Base.@time_imports @time using REPL"
     59.8 ms  StyledStrings
               ┌ 0.1 ms JuliaSyntaxHighlighting.__init__() 
    151.9 ms  JuliaSyntaxHighlighting 88.02% compilation time
      1.0 ms  Base64
               ┌ 0.0 ms Markdown.__init__() 
     13.3 ms  Markdown
      2.6 ms  InteractiveUtils
      0.2 ms  Unicode
               ┌ 0.0 ms REPL.REPLCompletions.__init__() 
               ├ 0.1 ms REPL.__init__() 
    128.1 ms  REPL
  0.417726 seconds (639.01 k allocations: 35.355 MiB, 40.42% compilation time: 2% of which was recompilation)

$ cat linux
precompile(Tuple{typeof(Base.print), Base.TTY, String, String})
precompile(Tuple{typeof(Base.print), Base.TTY, String}) # recompile
precompile(Tuple{typeof(Base.println), Base.TTY})
precompile(Tuple{typeof(Core.kwcall), NamedTuple{(:bold, :italic, :underline, :blink, :reverse, :hidden, :color), Tuple{Bool, Bool, Bool, Bool, Bool, Bool, Symbol}}, typeof(Base.printstyled), Base.TTY, String})
precompile(Tuple{Type{Base.GC_Diff}, Base.GC_Num, Base.GC_Num})
precompile(Tuple{typeof(Base.getproperty), Base.GC_Diff, Symbol})
precompile(Tuple{Type{NamedTuple{(:value, :time, :bytes, :gctime, :gcstats, :lock_conflicts, :compile_time, :recompile_time), T} where T<:Tuple}, Tuple{Nothing, Float64, Int64, Float64, Base.GC_Diff, Int64, Float64, Float64}})
precompile(Tuple{typeof(Base.getproperty), NamedTuple{(:value, :time, :bytes, :gctime, :gcstats, :lock_conflicts, :compile_time, :recompile_time), Tuple{Nothing, Float64, Int64, Float64, Base.GC_Diff, Int64, Float64, Float64}}, Symbol})
precompile(Tuple{typeof(Base.:(*)), Float64, Float64}) # recompile
precompile(Tuple{typeof(Base.gc_alloc_count), Base.GC_Diff})
precompile(Tuple{typeof(Core.kwcall), NamedTuple{(:msg,), Tuple{Nothing}}, typeof(Base.time_print), Base.TTY, Float64, Int64, Int64, Int64, Int64, Float64, Float64, Bool})
precompile(Tuple{Base.var"#888#889"{Nothing, Float64, Int64, Int64, Int64, Float64, Float64, Bool, String}, Base.GenericIOBuffer{Memory{UInt8}}}) # recompile
precompile(Tuple{typeof(Base.prettyprint_getunits), Int64, Int64, Int64}) # recompile
precompile(Tuple{typeof(Base.Ryu.writefixed), Float64, Int64}) # recompile

Mac

❯ julia +nightly --trace-compile=mac --startup-file=no -e "Base.@time_imports @time using REPL"
     41.4 ms  StyledStrings
               ┌ 0.0 ms JuliaSyntaxHighlighting.__init__() 
     13.7 ms  JuliaSyntaxHighlighting
      2.4 ms  Base64
               ┌ 0.0 ms Markdown.__init__() 
     17.4 ms  Markdown
      7.0 ms  InteractiveUtils
      1.2 ms  Unicode
               ┌ 0.0 ms REPL.REPLCompletions.__init__() 
               ├ 0.1 ms REPL.__init__() 
    130.0 ms  REPL
  0.233130 seconds (280.76 k allocations: 16.155 MiB)

~
❯ cat mac          
precompile(Tuple{Type{Base.GC_Diff}, Base.GC_Num, Base.GC_Num})
precompile(Tuple{typeof(Base.getproperty), Base.GC_Diff, Symbol})
precompile(Tuple{Type{NamedTuple{(:value, :time, :bytes, :gctime, :gcstats, :lock_conflicts, :compile_time, :recompile_time), T} where T<:Tuple}, Tuple{Nothing, Float64, Int64, Float64, Base.GC_Diff, Int64, Float64, Float64}})
precompile(Tuple{typeof(Base.getproperty), NamedTuple{(:value, :time, :bytes, :gctime, :gcstats, :lock_conflicts, :compile_time, :recompile_time), Tuple{Nothing, Float64, Int64, Float64, Base.GC_Diff, Int64, Float64, Float64}}, Symbol})
precompile(Tuple{typeof(Base.:(*)), Float64, Float64}) # recompile
precompile(Tuple{typeof(Base.gc_alloc_count), Base.GC_Diff})
precompile(Tuple{typeof(Core.kwcall), NamedTuple{(:msg,), Tuple{Nothing}}, typeof(Base.time_print), Base.TTY, Float64, Int64, Int64, Int64, Int64, Float64, Float64, Bool})
precompile(Tuple{Base.var"#888#889"{Nothing, Float64, Int64, Int64, Int64, Float64, Float64, Bool, String}, Base.GenericIOBuffer{Memory{UInt8}}}) # recompile
precompile(Tuple{typeof(Base.prettyprint_getunits), Int64, Int64, Int64}) # recompile
precompile(Tuple{typeof(Base.Ryu.writefixed), Float64, Int64}) # recompile
IanButterworth commented 2 days ago

If you use the @trace_compile macro inside the @time it should remove all those precompiles in your mac example, as I believe they come from @time.

% julia +nightly --startup-file=no -e "Base.@time_imports @time Base.@trace_compile using REPL"
     58.6 ms  StyledStrings
               ┌ 0.1 ms JuliaSyntaxHighlighting.__init__()
     28.2 ms  JuliaSyntaxHighlighting
     13.2 ms  Base64
               ┌ 0.0 ms Markdown.__init__()
     22.4 ms  Markdown
     14.7 ms  InteractiveUtils
     15.4 ms  Unicode
               ┌ 0.0 ms REPL.REPLCompletions.__init__()
               ├ 0.1 ms REPL.__init__()
    117.6 ms  REPL
  0.325058 seconds (286.40 k allocations: 16.343 MiB)
%
KristofferC commented 2 days ago

On mac nothing, on linux there is some interleaved results. But the print + println and printstyled are there.

$ julia +nightly --startup-file=no -e "Base.@time_imports @time Base.@trace_compile using REPL"
#=   11.6 ms =# precompile(Tuple{typeof(Base.print), Base.TTY, String, String})
     59.3 ms  #=    2.6 ms =# precompile(Tuple{typeof(Base.print), Base.TTY, String}) # recompile
StyledStrings#=   15.7 ms =# precompile(Tuple{typeof(Base.println), Base.TTY})
#=  101.0 ms =# precompile(Tuple{typeof(Core.kwcall), NamedTuple{(:bold, :italic, :underline, :blink, :reverse, :hidden, :color), Tuple{Bool, Bool, Bool, Bool, Bool, Bool, Symbol}}, typeof(Base.printstyled), Base.TTY, String})
               ┌ 0.1 ms JuliaSyntaxHighlighting.__init__() 
    151.1 ms  JuliaSyntaxHighlighting 87.88% compilation time
      1.0 ms  Base64
               ┌ 0.0 ms Markdown.__init__() 
     13.4 ms  Markdown
      2.6 ms  InteractiveUtils
      0.2 ms  Unicode
               ┌ 0.0 ms REPL.REPLCompletions.__init__() 
               ├ 0.1 ms REPL.__init__() 
    127.2 ms  REPL
  0.412694 seconds (639.01 k allocations: 35.351 MiB, 40.41% compilation time: 2% of which was recompilation)