JuliaLang / julia

The Julia Programming Language
https://julialang.org/
MIT License

Inference regression in 1.11 #55230

Closed Sbozzolo closed 1 month ago

Sbozzolo commented 1 month ago

We are seeing massive latency increases in ClimaAtmos.jl with Julia 1.11, starting from the alphas and continuing in 1.11-rc1 (https://github.com/CliMA/ClimaAtmos.jl/issues/3186).

ClimaAtmos.jl no longer compiles in any reasonable time with Julia 1.11-rc1. I identified the offending function as get_atmos, a simple function that returns a keyword-defined type. For the most part, this function simply converts string keywords read from YAML files to types, then returns an AtmosModel (defined below).

I used SnoopCompile (master) to try to get a little more insight. The get_atmos function, as it is, takes too long to compile (on Julia 1.10, this is instantaneous), so I had to remove part of it (as mentioned in the referenced issue). Once I do so, I see with @snoop_inference that:

InferenceTimingNode: 1.171199/288.711156 on Core.Compiler.Timings.ROOT() with 21 direct children

Plotting the inference: [image]

All the time is spent in types.jl line 329. This line simply defines our struct:

Base.@kwdef struct AtmosModel{
    MC,
    MM,
    PM,
    CM,
    CCDPS,
    F,
    S,
    RM,
    LA,
    EXTFORCING,
    EC,
    AT,
    TM,
    EEM,
    EDM,
    ESMF,
    ESDF,
    ENP,
    EVR,
    TCM,
    NOGW,
    OGW,
    HD,
    VD,
    DM,
    SAM,
    VS,
    RS,
    ST,
    IN,
    SM,
    SA,
    NUM,
}
    model_config::MC = nothing
    moisture_model::MM = nothing
    precip_model::PM = nothing
    cloud_model::CM = nothing
    call_cloud_diagnostics_per_stage::CCDPS = nothing
    forcing_type::F = nothing
    subsidence::S = nothing
    radiation_mode::RM = nothing
    ls_adv::LA = nothing
    external_forcing::EXTFORCING = nothing
    edmf_coriolis::EC = nothing
    advection_test::AT = nothing
    tendency_model::TM = nothing
    edmfx_entr_model::EEM = nothing
    edmfx_detr_model::EDM = nothing
    edmfx_sgs_mass_flux::ESMF = nothing
    edmfx_sgs_diffusive_flux::ESDF = nothing
    edmfx_nh_pressure::ENP = nothing
    edmfx_filter::EVR = nothing
    turbconv_model::TCM = nothing
    non_orographic_gravity_wave::NOGW = nothing
    orographic_gravity_wave::OGW = nothing
    hyperdiff::HD = nothing
    vert_diff::VD = nothing
    diff_mode::DM = nothing
    sgs_adv_mode::SAM = nothing
    viscous_sponge::VS = nothing
    rayleigh_sponge::RS = nothing
    sfc_temperature::ST = nothing
    insolation::IN = nothing
    surface_model::SM = nothing
    surface_albedo::SA = nothing
    numerics::NUM = nothing
end

I have not tried without keyword arguments.
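For readers who want to try the pattern without ClimaAtmos, here is a minimal sketch of the same construct with three fields instead of 33. `SmallModel` and the `:dry` value are hypothetical names chosen for illustration; only the `Base.@kwdef` pattern itself comes from the struct above. It is not expected to reproduce the slowdown, only to show that the keyword and positional constructors produce the same parametric type.

```julia
# Hypothetical three-field analogue of AtmosModel (names are illustrative only).
Base.@kwdef struct SmallModel{MC, MM, PM}
    model_config::MC = nothing
    moisture_model::MM = nothing
    precip_model::PM = nothing
end

# Keyword construction: fields that are not given default to `nothing`.
m_kw = SmallModel(moisture_model = :dry)

# Positional construction: the same object without going through keywords.
m_pos = SmallModel(nothing, :dry, nothing)

# Each type parameter is inferred from the corresponding field value.
println(typeof(m_kw))
println(typeof(m_kw) == typeof(m_pos))
```

Calling the positional form directly is one way to check whether keyword arguments play any role in a given latency measurement.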

I don't think that what we are doing up to this point is particularly complex or unorthodox, and most of the types involved are very simple (mostly singletons and bools, all immutable).

Originally posted by @Sbozzolo in https://github.com/JuliaLang/julia/issues/55171#issuecomment-2247094348

KristofferC commented 1 month ago

Doing the very scientific thing of pressing Ctrl-C after some time, it seems to be stuck in subtyping:


Internal error: during type inference of
get_atmos(ClimaAtmos.AtmosConfig{Float32, ClimaParams.ParamDict{Float32}, Base.Dict{String, Any}, ClimaComms.SingletonCommsContext{ClimaComms.CPUSingleThreaded}, Tuple{String}}, ClimaAtmos.Parameters.ClimaAtmosParameters{Float32, Thermodynamics.Parameters.ThermodynamicsParameters{Float32}, RRTMGP.Parameters.RRTMGPParameters{Float32}, Insolation.Parameters.InsolationParameters{Float32}, Nothing, Nothing, CloudMicrophysics.Parameters.WaterProperties{Float32}, SurfaceFluxes.Parameters.SurfaceFluxesParameters{Float32, SurfaceFluxes.UniversalFunctions.BusingerParams{Float32}, Thermodynamics.Parameters.ThermodynamicsParameters{Float32}}, ClimaAtmos.Parameters.TurbulenceConvectionParameters{Float32}, ClimaAtmos.Parameters.SurfaceTemperatureParameters{Float32}})
Encountered unexpected error in runtime:
InterruptException()
subtype at julia/src/subtype.c:1407
subtype_unionall at julia/src/subtype.c:925
subtype_unionall at julia/src/subtype.c:925
subtype_unionall at julia/src/subtype.c:925
subtype_unionall at julia/src/subtype.c:925
subtype_unionall at julia/src/subtype.c:925
subtype_unionall at julia/src/subtype.c:925
subtype_unionall at julia/src/subtype.c:925
subtype_unionall at julia/src/subtype.c:925
subtype_unionall at julia/src/subtype.c:925
subtype_unionall at julia/src/subtype.c:925
subtype_unionall at julia/src/subtype.c:925
subtype_unionall at julia/src/subtype.c:925
subtype_unionall at julia/src/subtype.c:925
subtype_unionall at julia/src/subtype.c:925
subtype_unionall at julia/src/subtype.c:925
subtype_unionall at julia/src/subtype.c:925
exists_subtype at julia/src/subtype.c:1651 [inlined]
_forall_exists_subtype at julia/src/subtype.c:1682
forall_exists_subtype at julia/src/subtype.c:1696 [inlined]
ijl_subtype_env at julia/src/subtype.c:2146
jl_type_intersection_env_s at julia/src/subtype.c:4405
jl_typemap_intersection_node_visitor at julia/src/typemap.c:543 [inlined]
jl_typemap_intersection_visitor at julia/src/typemap.c:812
jl_typemap_intersection_visitor at julia/src/typemap.c:770

KristofferC commented 1 month ago

I think

using ClimaAtmos
using AtmosphericProfilesLibrary 
using Interpolations
T1 = Tuple{Type{ClimaAtmos.AtmosModel{MC, MM, PM, CM, CCDPS, F, S, RM, LA, EXTFORCING, EC, AT, TM, EEM, EDM, ESMF, ESDF, ENP, EVR, TCM, NOGW, OGW, HD, VD, DM, SAM, VS, RS, ST, IN, SM, SA, NUM} where NUM where SA where SM where IN where ST where RS where VS where SAM where DM where VD where HD where OGW where NOGW where TCM where EVR where ENP where ESDF where ESMF where EDM where EEM where TM where AT where EC where EXTFORCING where LA where RM where S where F where CCDPS where CM where PM where MM where MC}, Union{ClimaAtmos.BoxModel, ClimaAtmos.PlaneModel, ClimaAtmos.SingleColumnModel, ClimaAtmos.SphericalModel, Nothing}, Union{ClimaAtmos.DryModel, ClimaAtmos.EquilMoistModel, ClimaAtmos.NonEquilMoistModel, Nothing}, Union{ClimaAtmos.Microphysics0Moment, ClimaAtmos.Microphysics1Moment, ClimaAtmos.NoPrecipitation}, Union{ClimaAtmos.DiagnosticEDMFCloud, ClimaAtmos.GridScaleCloud, ClimaAtmos.QuadratureCloud}, Union{ClimaAtmos.CallCloudDiagnosticsPerStage, Nothing}, Union{ClimaAtmos.HeldSuarezForcing, Nothing}, Union{Nothing, ClimaAtmos.Subsidence{T} where T}, Union{Nothing, ClimaAtmos.RadiationDYCOMS{Float32}, ClimaAtmos.RRTMGPInterface.AllSkyRadiation, ClimaAtmos.RRTMGPInterface.AllSkyRadiationWithClearSkyDiagnostics, ClimaAtmos.RRTMGPInterface.ClearSkyRadiation, ClimaAtmos.RRTMGPInterface.GrayRadiation, ClimaAtmos.RadiationTRMM_LBA{AtmosphericProfilesLibrary.TimeZProfile{Interpolations.Extrapolation{Float32, 2, Interpolations.GriddedInterpolation{Float32, 2, Array{Float32, 2}, Tuple{Interpolations.Gridded{Interpolations.Linear{Interpolations.Throw{Interpolations.OnGrid}}}, Interpolations.Gridded{Interpolations.Linear{Interpolations.Throw{Interpolations.OnGrid}}}}, Tuple{Base.StepRangeLen{Float32, Float64, Float64, Int64}, Array{Float32, 1}}}, Tuple{Interpolations.Gridded{Interpolations.Linear{Interpolations.Throw{Interpolations.OnGrid}}}, Interpolations.Gridded{Interpolations.Linear{Interpolations.Throw{Interpolations.OnGrid}}}}, Interpolations.Flat{Nothing}}}}}, 
Union{Nothing, ClimaAtmos.LargeScaleAdvection{_A, _B} where _B where _A}, Union{Nothing, ClimaAtmos.GCMForcing{Float64}}, Union{Nothing, ClimaAtmos.EDMFCoriolis{_A, _B, _C} where _C where _B where _A}, Any, Union{ClimaAtmos.NoGridScaleTendency, ClimaAtmos.NoSubgridScaleTendency, ClimaAtmos.UseAllTendency, Nothing}, Union{ClimaAtmos.GeneralizedEntrainment, ClimaAtmos.GeneralizedHarmonicsEntrainment, ClimaAtmos.NoEntrainment, ClimaAtmos.PiGroupsEntrainment}, Union{ClimaAtmos.ConstantAreaDetrainment, ClimaAtmos.GeneralizedDetrainment, ClimaAtmos.GeneralizedHarmonicsDetrainment, ClimaAtmos.NoDetrainment, ClimaAtmos.PiGroupsDetrainment}, Any, Any, Any, Any, Union{Nothing, ClimaAtmos.DiagnosticEDMFX{_A, _B, Float32} where _B where _A, ClimaAtmos.PrognosticEDMFX{_A, _B, Float32} where _B where _A}, Union{Nothing, ClimaAtmos.NonOrographyGravityWave{Float32}}, Union{Nothing, ClimaAtmos.OrographicGravityWave{Float32, String}}, Union{Nothing, ClimaAtmos.ClimaHyperdiffusion{_A} where _A}, Union{Nothing, ClimaAtmos.FriersonDiffusion{_A, Float32} where _A, ClimaAtmos.VerticalDiffusion{_A, Float32} where _A}, Union{ClimaAtmos.Explicit, ClimaAtmos.Implicit}, Union{ClimaAtmos.Explicit, ClimaAtmos.Implicit}, Union{Nothing, ClimaAtmos.ViscousSponge{Float32}}, Union{Nothing, ClimaAtmos.RayleighSponge{Float32}}, Union{ClimaAtmos.RCEMIPIISST, ClimaAtmos.ZonallyAsymmetricSST, ClimaAtmos.ZonallySymmetricSST, Nothing}, Union{ClimaAtmos.IdealizedInsolation, ClimaAtmos.RCEMIPIIInsolation, ClimaAtmos.TimeVaryingInsolation, Nothing}, Union{ClimaAtmos.PrescribedSurfaceTemperature, ClimaAtmos.PrognosticSurfaceTemperature{Int64}}, Union{ClimaAtmos.CouplerAlbedo, ClimaAtmos.ConstantAlbedo{Float32}, ClimaAtmos.RegressionFunctionAlbedo{Float32, ClimaAtmos.var"#134#136"{Float32}}}, ClimaAtmos.AtmosNumerics{EN_UP, TR_UP, ED_UP, ED_SG_UP, DYCORE, LIM} where LIM where DYCORE where ED_SG_UP where ED_UP where TR_UP where EN_UP}

T2 = Tuple{Type{ClimaAtmos.AtmosModel{MC, MM, PM, CM, CCDPS, F, S, RM, LA, EXTFORCING, EC, AT, TM, EEM, EDM, ESMF, ESDF, ENP, EVR, TCM, NOGW, OGW, HD, VD, DM, SAM, VS, RS, ST, IN, SM, SA, NUM} where NUM where SA where SM where IN where ST where RS where VS where SAM where DM where VD where HD where OGW where NOGW where TCM where EVR where ENP where ESDF where ESMF where EDM where EEM where TM where AT where EC where EXTFORCING where LA where RM where S where F where CCDPS where CM where PM where MM where MC}, MC, MM, PM, CM, CCDPS, F, S, RM, LA, EXTFORCING, EC, AT, TM, EEM, EDM, ESMF, ESDF, ENP, EVR, TCM, NOGW, OGW, HD, VD, DM, SAM, VS, RS, ST, IN, SM, SA, NUM} where NUM where SA where SM where IN where ST where RS where VS where SAM where DM where VD where HD where OGW where NOGW where TCM where EVR where ENP where ESDF where ESMF where EDM where EEM where TM where AT where EC where EXTFORCING where LA where RM where S where F where CCDPS where CM where PM where MM where MC

@time T1 <: T2

is a reproducer

KristofferC commented 1 month ago

For me, ClimaAtmos precompiles for a very long time on 1.10 though so I cannot really check if this subtype check is fast there...

Sbozzolo commented 1 month ago

> For me, ClimaAtmos precompiles for a very long time on 1.10 though so I cannot really check if this subtype check is fast there...

Ah yeah, interesting! I just tried your reproducer on 1.10.4 and found that indeed the subtype check is slow there too.

However, if you look at our CI, you'll see that we can run without problems on 1.10.4 but not on 1.11-rc1.

This still reproduces the problem, so there's probably something else going on.

import ClimaAtmos

config = ClimaAtmos.AtmosConfig()
params = ClimaAtmos.create_parameter_set(config)
@time ClimaAtmos.get_atmos(config, params)

For 1.10.4: 3.481628 seconds (12.20 M allocations: 533.572 MiB, 2.93% gc time, 99.90% compilation time)

KristofferC commented 1 month ago

> However, if you look at our CI, you'll see that we can run without problems on 1.10.4 but not on 1.11-rc1.

Not sure; locally on my M1 it was precompiling forever, but it seems to work OK when SSHing to a different system.

> Ah yeah, interesting! I just tried your reproducer on 1.10.4 and found that indeed the subtype check is slow there too.

Okay, might be some other change that causes this subtyping query to take place on 1.11 but not on 1.10 then.

KristofferC commented 1 month ago

I started a bisect on the 1.11 branch. It's a bit annoying to automate, so I'm just doing it manually. I will finish it tomorrow or tonight unless someone beats me to it:

git bisect start
# good: [0ba6ec2d2282937a084d7e5e5a0b026dc953bb31] Restore link to list of packages in Base docs (#50353)
git bisect good 0ba6ec2d2282937a084d7e5e5a0b026dc953bb31
# bad: [46e4740815da6cf99455ae80a12619f299fafeeb] [1.11 backport] trace-compile: don't generate `precompile` statements for OpaqueClosure methods (#55072) (#55225)
git bisect bad 46e4740815da6cf99455ae80a12619f299fafeeb
# bad: [959b474d0516df77a268d9f23ccda5d2ad32acdf] docs: update latest stable version (#52215)
git bisect bad 959b474d0516df77a268d9f23ccda5d2ad32acdf
# good: [a4b5ad3d4c6c56056ac1fd55d012ca2a6d234d35] Update default libgit2 version to 1.7.1
git bisect good a4b5ad3d4c6c56056ac1fd55d012ca2a6d234d35
# bad: [994719386dcd516685561443fe6e7f4d3ce60cd0] Converge TaggedString terminology: use annotations
git bisect bad 994719386dcd516685561443fe6e7f4d3ce60cd0
# bad: [64fc7db055604a8858e38e18bd81ff14f8f45101] Add Libc.mkfifo (#34587)
git bisect bad 64fc7db055604a8858e38e18bd81ff14f8f45101

KristofferC commented 1 month ago

Bisected to a61d1b47a68297704814188a3509c011ec2a8fa1 (#50927)

commit a61d1b47a68297704814188a3509c011ec2a8fa1
Author: Jameson Nash <vtjnash@gmail.com>
Date:   Fri Sep 15 18:32:16 2023 -0400

    inference: apply tmerge limit elementwise to the Union (#50927)

    This allows forming larger unions, as long as each element in the Union
    is both relatively distinct and relatively simple. For example:

        tmerge(Base.BitSigned, Nothing) == Union{Nothing, Int128, Int16, Int32, Int64, Int8}
        tmerge(Tuple{Base.BitSigned, Int}, Nothing) == Union{Nothing, Tuple{Any, Int64}}
        tmerge(AbstractVector{Int}, Vector) == AbstractVector

    Disables a test from dc8d885, which does not seem possible to handle currently.

    This makes somewhat drastic changes to make this algorithm more
    commutative and simpler, since we dropped the final widening to `Any`.

    Co-authored-by: pchintalapudi <34727397+pchintalapudi@users.noreply.github.com>
    Co-authored-by: Oscar Smith <oscardssmith@gmail.com>

cc @vtjnash
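The three `tmerge` examples in the commit message can be checked on a local build with a sketch like the one below. `Core.Compiler.tmerge` is an internal, unexported function, so both its availability and its results are assumptions that depend on the Julia version; no expected output is shown for that reason. Comparing the printed unions between 1.10 and 1.11 is one way to see how much larger the inferred unions can become after this change.

```julia
# Internal API: not part of Julia's public interface, may change between versions.
const CC = Core.Compiler

# The three input pairs quoted in the commit message for #50927.
examples = [
    (Base.BitSigned, Nothing),
    (Tuple{Base.BitSigned, Int}, Nothing),
    (AbstractVector{Int}, Vector),
]

# tmerge returns a type that covers both inputs, possibly widened.
merged = [CC.tmerge(a, b) for (a, b) in examples]

for ((a, b), m) in zip(examples, merged)
    println("tmerge(", a, ", ", b, ") = ", m)
end
```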

oscardssmith commented 1 month ago

That's unfortunate. Hopefully the subtyping precision improvements are salvageable.

KristofferC commented 1 month ago

The subtyping query in https://github.com/JuliaLang/julia/issues/55230#issuecomment-2247721420 could perhaps be made faster?

gbaraldi commented 1 month ago

With no packages:

julia> T1 = Tuple{Union{Val{1}, Val{2}, Val{3}, Val{4}, Nothing},
       Union{Val{1}, Val{2}, Val{3}, Nothing},
       Union{Val{1}, Val{2}, Val{3}, Nothing},
       Union{Val{1}, Val{2}, Val{3}, Nothing},
       Union{Val{1}, Nothing},
       Union{Val{1}, Nothing},
       Union{Nothing, Val{1}},
       Union{Val{1}, Val{2}, Val{3}, Val{4}, Val{5}, Val{6}, Val{7}, Nothing},
       Union{Val{1}, Val{2}, Nothing},
       Union{Val{1}, Val{2}, Nothing},
       Union{Val{1}, Val{2}, Nothing},
       Any,
       Union{Val{1}, Val{2}, Val{3}, Nothing},
       Union{Val{1}, Val{2}, Val{3}, Val{4}},
       Union{Val{1}, Val{2}, Val{3}, Val{4}, Val{5}},
       Any,
       Any,
       Any,
       Any,
       Union{Val{1}, Val{2}, Nothing},
       Union{Val{1}, Val{2}, Nothing},
       Union{Val{1}, Val{2}, Nothing},
       Union{Val{1}, Val{2}, Nothing},
       Union{Val{1}, Val{2}, Nothing},
       Union{Val{1}, Val{2}},
       Union{Val{1}, Val{2}},
       Union{Val{1}, Val{2}, Nothing},
       Union{Val{1}, Val{2}, Nothing},
       Union{Val{1}, Val{2}, Val{3}, Nothing},
       Union{Val{1}, Val{2}, Val{3}, Nothing},
       Union{Val{1}, Val{2}},
       Union{Val{1}, Val{2}, Val{3}},
       Val{1}}

julia> T2 = Tuple{MC, MM, PM, CM, CCDPS, F, S, RM, LA, EXTFORCING, EC, AT, TM, EEM, EDM, ESMF, ESDF, ENP, EVR, TCM, NOGW, OGW, HD, VD, DM, SAM, VS, RS, ST, IN, SM, SA, NUM} where NUM where SA where SM where IN where ST where RS where VS where SAM where DM where VD where HD where OGW where NOGW where TCM where EVR where ENP where ESDF where ESMF where EDM where EEM where TM where AT where EC where EXTFORCING where LA where RM where S where F where CCDPS where CM where PM where MM where MC

@time T1 <: T2

vtjnash commented 1 month ago

somewhat more simplified:

julia> T1 = Tuple{Union{Val{1}, Val{2}, Val{3}, Val{4}, Nothing},
       Union{Val{1}, Val{2}, Val{3}, Nothing},
       Union{Val{1}, Val{2}, Val{3}, Nothing},
       Union{Val{1}, Val{2}, Val{3}, Nothing},
       Union{Val{1}, Nothing},
       Union{Val{1}, Nothing},
       Union{Nothing, Val{1}},
       Union{Val{1}, Val{2}, Val{3}, Val{4}, Val{5}, Val{6}, Val{7}, Nothing},
       Union{Val{1}, Val{2}, Nothing},
       Union{Val{1}, Val{2}, Nothing},
       Union{Val{1}, Val{2}, Nothing},
       Union{Val{1}, Val{2}, Val{3}}}
Tuple{Union{Val{1}, Val{2}, Val{3}, Val{4}, Nothing}, Union{Val{1}, Val{2}, Val{3}, Nothing}, Union{Val{1}, Val{2}, Val{3}, Nothing}, Union{Val{1}, Val{2}, Val{3}, Nothing}, Union{Val{1}, Nothing}, Union{Val{1}, Nothing}, Union{Val{1}, Nothing}, Union{Val{1}, Val{2}, Val{3}, Val{4}, Val{5}, Val{6}, Val{7}, Nothing}, Union{Val{1}, Val{2}, Nothing}, Union{Val{1}, Val{2}, Nothing}, Union{Val{1}, Val{2}, Nothing}, Union{Val{1}, Val{2}, Val{3}}}

julia> T2 = Tuple{<:Any,<:Any,<:Any,<:Any,<:Any,<:Any,<:Any,<:Any,<:Any,<:Any,<:Any,<:Any};

julia> @time T1 <: T2
  2.016903 seconds
true

oscardssmith commented 1 month ago

Much simpler:

julia> T1 = NTuple{12, Union{Val{1}, Val{2}, Val{3}, Val{4}}}
julia> T2 = Tuple{<:Any,<:Any,<:Any,<:Any,<:Any,<:Any,<:Any,<:Any,<:Any,<:Any,<:Any,<:Any}
julia> @time T1 <: T2
 19.878233 seconds
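To see how the cost of this check grows with tuple length, the reproducer above can be parameterized, as in the sketch below. The helper `free_tuple` and the alias `Elem` are hypothetical names introduced here. The lengths are kept small so the loop finishes quickly. If the check visits every combination of union members (an assumption about the cause, not something stated in this thread), 12 slots with 4 choices each would give 4^12 = 16,777,216 combinations, which would be consistent with the timing reported above.

```julia
# Build Tuple{<:Any, ..., <:Any} with n independent type variables.
function free_tuple(n::Integer)
    vars = [TypeVar(Symbol(:T, i)) for i in 1:n]
    body = Tuple{vars...}
    # Wrap from the innermost variable outwards, so T1 ends up outermost.
    for v in reverse(vars)
        body = UnionAll(v, body)
    end
    return body
end

# The four-member union used in the reproducer above.
const Elem = Union{Val{1}, Val{2}, Val{3}, Val{4}}

# Time the subtype query for a few small tuple lengths.
for n in 2:2:8
    T1 = NTuple{n, Elem}
    T2 = free_tuple(n)
    t = @elapsed T1 <: T2
    println("n = ", n, ": ", t, " s")
end
```

On an affected build, the reported times would be expected to grow quickly with `n`; on a build where the subtyping query has been optimized, they may stay roughly flat.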