SpeedyWeather / SpeedyWeather.jl

Play atmospheric modelling like it's LEGO.
https://speedyweather.github.io/SpeedyWeather.jl/dev
MIT License

PrimitiveDry and WetModel generation on Julia v1.9 hangs #508

Closed FredericWantiez closed 2 months ago

FredericWantiez commented 2 months ago

Hello,

I've been trying to run some toy models, but Julia gets stuck at the model creation step. For something like the intro model:

using SpeedyWeather

spectral_grid = SpectralGrid(trunc=16, Grid=OctahedralGaussianGrid, nlev=8)
model = PrimitiveWetModel(; spectral_grid)
simulation = initialize!(model)
run!(simulation, period=Day(1), output=true)

the process never gets past model = PrimitiveWetModel(...). When I kill the process manually, pages of the following error are displayed:

intersect at /Users/frederic/.julia/juliaup/julia-1.9.4+0.aarch64.apple.darwin14/lib/julia/libjulia-internal.1.9.dylib (unknown line)
intersect_unionall_ at /Users/frederic/.julia/juliaup/julia-1.9.4+0.aarch64.apple.darwin14/lib/julia/libjulia-internal.1.9.dylib (unknown line)
intersect_unionall at /Users/frederic/.julia/juliaup/julia-1.9.4+0.aarch64.apple.darwin14/lib/julia/libjulia-internal.1.9.dylib (unknown line)
intersect at /Users/frederic/.julia/juliaup/julia-1.9.4+0.aarch64.apple.darwin14/lib/julia/libjulia-internal.1.9.dylib (unknown line)
intersect_unionall_ at /Users/frederic/.julia/juliaup/julia-1.9.4+0.aarch64.apple.darwin14/lib/julia/libjulia-internal.1.9.dylib (unknown line)
intersect_unionall at /Users/frederic/.julia/juliaup/julia-1.9.4+0.aarch64.apple.darwin14/lib/julia/libjulia-internal.1.9.dylib (unknown line)
intersect at /Users/frederic/.julia/juliaup/julia-1.9.4+0.aarch64.apple.darwin14/lib/julia/libjulia-internal.1.9.dylib (unknown line)
intersect_unionall_ at /Users/frederic/.julia/juliaup/julia-1.9.4+0.aarch64.apple.darwin14/lib/julia/libjulia-internal.1.9.dylib (unknown line)
intersect_unionall at /Users/frederic/.julia/juliaup/julia-1.9.4+0.aarch64.apple.darwin14/lib/julia/libjulia-internal.1.9.dylib (unknown line)
intersect at /Users/frederic/.julia/juliaup/julia-1.9.4+0.aarch64.apple.darwin14/lib/julia/libjulia-internal.1.9.dylib (unknown line)
intersect_unionall_ at /Users/frederic/.julia/juliaup/julia-1.9.4+0.aarch64.apple.darwin14/lib/julia/libjulia-internal.1.9.dylib (unknown line)
intersect_unionall at /Users/frederic/.julia/juliaup/julia-1.9.4+0.aarch64.apple.darwin14/lib/julia/libjulia-internal.1.9.dylib (unknown line)
intersect at /Users/frederic/.julia/juliaup/julia-1.9.4+0.aarch64.apple.darwin14/lib/julia/libjulia-internal.1.9.dylib (unknown line)
intersect_unionall_ at /Users/frederic/.julia/juliaup/julia-1.9.4+0.aarch64.apple.darwin14/lib/julia/libjulia-internal.1.9.dylib (unknown line)
intersect_unionall at /Users/frederic/.julia/juliaup/julia-1.9.4+0.aarch64.apple.darwin14/lib/julia/libjulia-internal.1.9.dylib (unknown line)
intersect at /Users/frederic/.julia/juliaup/julia-1.9.4+0.aarch64.apple.darwin14/lib/julia/libjulia-internal.1.9.dylib (unknown line)
intersect_unionall_ at /Users/frederic/.julia/juliaup/julia-1.9.4+0.aarch64.apple.darwin14/lib/julia/libjulia-internal.1.9.dylib (unknown line)
intersect_unionall at /Users/frederic/.julia/juliaup/julia-1.9.4+0.aarch64.apple.darwin14/lib/julia/libjulia-internal.1.9.dylib (unknown line)
intersect at /Users/frederic/.julia/juliaup/julia-1.9.4+0.aarch64.apple.darwin14/lib/julia/libjulia-internal.1.9.dylib (unknown line)
intersect_unionall_ at /Users/frederic/.julia/juliaup/julia-1.9.4+0.aarch64.apple.darwin14/lib/julia/libjulia-internal.1.9.dylib (unknown line)
intersect_unionall at /Users/frederic/.julia/juliaup/julia-1.9.4+0.aarch64.apple.darwin14/lib/julia/libjulia-internal.1.9.dylib (unknown line)
intersect at /Users/frederic/.julia/juliaup/julia-1.9.4+0.aarch64.apple.darwin14/lib/julia/libjulia-internal.1.9.dylib (unknown line)
intersect_unionall_ at /Users/frederic/.julia/juliaup/julia-1.9.4+0.aarch64.apple.darwin14/lib/julia/libjulia-internal.1.9.dylib (unknown line)
intersect_unionall at /Users/frederic/.julia/juliaup/julia-1.9.4+0.aarch64.apple.darwin14/lib/julia/libjulia-internal.1.9.dylib (unknown line)
intersect at /Users/frederic/.julia/juliaup/julia-1.9.4+0.aarch64.apple.darwin14/lib/julia/libjulia-internal.1.9.dylib (unknown line)
intersect_unionall_ at /Users/frederic/.julia/juliaup/julia-1.9.4+0.aarch64.apple.darwin14/lib/julia/libjulia-internal.1.9.dylib (unknown line)
intersect_unionall at /Users/frederic/.julia/juliaup/julia-1.9.4+0.aarch64.apple.darwin14/lib/julia/libjulia-internal.1.9.dylib (unknown line)
intersect at /Users/frederic/.julia/juliaup/julia-1.9.4+0.aarch64.apple.darwin14/lib/julia/libjulia-internal.1.9.dylib (unknown line)

Some details on my setup, running on an M1 Max:

Julia Version 1.9.4
Commit 8e5136fa297 (2023-11-14 08:46 UTC)
Build Info:
  Official https://julialang.org/ release
Platform Info:
  OS: macOS (arm64-apple-darwin22.4.0)
  CPU: 10 × Apple M1 Max
  WORD_SIZE: 64
  LIBM: libopenlibm
  LLVM: libLLVM-14.0.6 (ORCJIT, apple-m1)
  Threads: 1 on 8 virtual cores

Everything runs smoothly on the release version: julia-1.10

milankl commented 2 months ago

Weird, this doesn't look like the error originates from within SpeedyWeather.jl though. Could you check whether

FredericWantiez commented 2 months ago

It's hard to debug, but it seems like Julia gets lost in type inference. Even this doesn't return:

julia +1.9 --project -e "using SpeedyWeather; PrimitiveDryModel()"

milankl commented 2 months ago

When creating a PrimitiveDryModel the following steps are executed, can you copy & paste them in?

spectral_grid = SpectralGrid()
geometry = Geometry(spectral_grid)

# DYNAMICS
dynamics = true
planet = Earth(spectral_grid)
atmosphere = EarthAtmosphere(spectral_grid)
coriolis = Coriolis(spectral_grid)
geopotential = Geopotential(spectral_grid)
adiabatic_conversion = AdiabaticConversion(spectral_grid)
particle_advection = NoParticleAdvection()
initial_conditions = InitialConditions(PrimitiveDry)

# BOUNDARY CONDITIONS
orography = EarthOrography(spectral_grid)
land_sea_mask = LandSeaMask(spectral_grid)
ocean = SeasonalOceanClimatology(spectral_grid)
land = SeasonalLandTemperature(spectral_grid)
solar_zenith = WhichZenith(spectral_grid, planet)
albedo = AlbedoClimatology(spectral_grid)

# PHYSICS/PARAMETERIZATIONS
physics = true
boundary_layer_drag = BulkRichardsonDrag(spectral_grid)
temperature_relaxation = NoTemperatureRelaxation(spectral_grid)
vertical_diffusion = BulkRichardsonDiffusion(spectral_grid)
surface_thermodynamics = SurfaceThermodynamicsConstant(spectral_grid)
surface_wind = SurfaceWind(spectral_grid)
surface_heat_flux = SurfaceSensibleHeat(spectral_grid)
convection = DryBettsMiller(spectral_grid)
shortwave_radiation = TransparentShortwave(spectral_grid)
longwave_radiation = JeevanjeeRadiation(spectral_grid)

# NUMERICS
device_setup = DeviceSetup(CPUDevice())
time_stepping = Leapfrog(spectral_grid)
spectral_transform = SpectralTransform(spectral_grid)
implicit = ImplicitPrimitiveEquation(spectral_grid)
horizontal_diffusion = HyperDiffusion(spectral_grid)
vertical_advection = CenteredVerticalAdvection(spectral_grid)

# OUTPUT
output = OutputWriter(spectral_grid, PrimitiveDry)
callbacks = Dict{Symbol, AbstractCallback}()
feedback = Feedback()

(Some are not exported; if you hit an UndefVarError, just add SpeedyWeather. in front of the name. This is from main, so depending on your version, if you still hit an UndefVarError, just skip that line: some fields were introduced after the latest release.)
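
For anyone following along, here is a minimal sketch of what "add the module name in front" means in practice. It uses a toy module rather than SpeedyWeather itself; `ToyWeather`, `Exported` and `Hidden` are made-up names for illustration only.

```julia
# Toy module standing in for a package; all names here are hypothetical.
module ToyWeather
    export Exported
    struct Exported end
    struct Hidden end       # defined, but deliberately not exported
end

using .ToyWeather

a = Exported()              # exported names are available directly
b = ToyWeather.Hidden()     # non-exported names need the module prefix

# Using the bare, non-exported name raises an UndefVarError
hidden_is_visible = try
    Hidden()
    true
catch err
    err isa UndefVarError ? false : rethrow()
end
```

The same pattern should apply to any component in the list above that is not exported, assuming it exists in the installed version.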

FredericWantiez commented 2 months ago

Tried something similar earlier; it all runs fine if I run each of the steps independently. The hang happens on both the tip of main and the latest release.

milankl commented 2 months ago

Then it must be the wrapping into PrimitiveDryModel, because technically all of these are executed and then PrimitiveDryModel(spectral_grid, geometry, dynamics, ...) is called. A PrimitiveDryModel has a lot of parameters (in the Julia sense, i.e. PrimitiveDryModel{A,B,C,...}) which are inferred at that stage. Maybe that's what the intersect_unionall is referring to. On the other hand, that logic is the same as for BarotropicModel, so I'm wondering whether there's just something in 1.9 that's causing the problem.
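
As a rough illustration of the parameter inference described above, here is a scaled-down sketch with two constrained type parameters instead of dozens. The names `ToyModel`, `ToyEarth` and `ToyOcean` are invented for this example and are not part of SpeedyWeather.

```julia
# Hypothetical abstract types and components, for illustration only
abstract type AbstractPlanet end
abstract type AbstractOcean end

struct ToyEarth <: AbstractPlanet end
struct ToyOcean <: AbstractOcean end

# A parametric struct with constrained type parameters and keyword defaults
Base.@kwdef struct ToyModel{PL<:AbstractPlanet, OC<:AbstractOcean}
    planet::PL = ToyEarth()
    ocean::OC = ToyOcean()
end

# No parameters are given explicitly: Julia infers PL and OC from the
# field values and checks them against the <: constraints
model = ToyModel()
model_type = typeof(model)
```

Here `model_type` is `ToyModel{ToyEarth, ToyOcean}`. With only two parameters this resolves instantly; the sketch only shows the mechanism, not the hang itself.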

Thanks for flagging this. I'm also happy to hear it's not an issue on 1.10. Is it a problem for you to use that version?

FredericWantiez commented 2 months ago

That's also my guess, something gets lost when inferring the Primitive type. I'll use 1.10 for my experiments; I mostly wanted to document the weird behavior somewhere. Thanks for looking into it.

milankl commented 2 months ago

@giordano the creation of large parametric structs seems to hang on 1.9 and Apple M1/M2 👆 but not on 1.10. Do you think this could be related to your aarch64-related fixes for 1.10? Both M1/M2 are armv8, right?

giordano commented 2 months ago

you think this could be related to your aarch64-related fixes for 1.10?

If you're referring to https://julialang.org/blog/2023/12/julia-1.10-highlights/#linux_aarch64_stability_improvements, that was specific to aarch64-linux: the JITLink linker was already used on aarch64-darwin, and the PR mentioned there only switched it on for Linux as well. So I don't think that was it.

Both M1/2 are armv8 right?

Yes, but aarch64 is a slightly more accurate name for the architecture

milankl commented 2 months ago

@FredericWantiez I'm closing this as I see no reason not to use 1.10 (?) and there doesn't seem to be anything we can do here. Anyone, feel free to reopen if that shouldn't be the case.

milankl commented 2 months ago

I think I found the culprit on v1.9. If I remove the type parameter constraints, i.e.

Base.@kwdef mutable struct PrimitiveWetModel{
    DS,
    GE,
    PL,
    AT,
    CO,
    GO,
    OR,
    AC,
    PA,
    IC,
    LS,
    OC,
    LA,
    ZE,
    AL,
    SO,
    VG,
    CC,
    BL,
    TR,
    VD,
    SUT,
    SUW,
    SH,
    EV,
    LSC,
    CV,
    SW,
    LW,
    TS,
    ST,
    IM,
    HD,
    VA,
    HF,
    OW,
    FB,
} <: PrimitiveWet

instead of

Base.@kwdef mutable struct PrimitiveWetModel{
    NF<:AbstractFloat,
    DS<:DeviceSetup,
    PL<:AbstractPlanet,
    AT<:AbstractAtmosphere,
    CO<:AbstractCoriolis,
    GO<:AbstractGeopotential,
    OR<:AbstractOrography,
    AC<:AbstractAdiabaticConversion,
    PA<:AbstractParticleAdvection,
    IC<:AbstractInitialConditions,
    LS<:AbstractLandSeaMask,
    OC<:AbstractOcean,
    LA<:AbstractLand,
    ZE<:AbstractZenith,
    AL<:AbstractAlbedo,
    SO<:AbstractSoil,
    VG<:AbstractVegetation,
    CC<:AbstractClausiusClapeyron,
    BL<:AbstractBoundaryLayer,
    TR<:AbstractTemperatureRelaxation,
    VD<:AbstractVerticalDiffusion,
    SUT<:AbstractSurfaceThermodynamics,
    SUW<:AbstractSurfaceWind,
    SH<:AbstractSurfaceHeatFlux,
    EV<:AbstractSurfaceEvaporation,
    LSC<:AbstractCondensation,
    CV<:AbstractConvection,
    SW<:AbstractShortwave,
    LW<:AbstractLongwave,
    TS<:AbstractTimeStepper,
    ST<:SpectralTransform{NF},
    IM<:AbstractImplicit,
    HD<:AbstractHorizontalDiffusion,
    VA<:AbstractVerticalAdvection,
    HF<:AbstractHoleFilling,
    GE<:AbstractGeometry,
    OW<:AbstractOutputWriter,
    FB<:AbstractFeedback,
} <: PrimitiveWet

then it works. While this would allow you to create a model with, say, orography = SpectralTransform(...), which I wanted to avoid with the <: constraints, model initialization would probably already fail in that case.
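
To make the trade-off concrete, here is a small sketch comparing a constrained and an unconstrained type parameter. All names (`LooseModel`, `StrictModel`, `ToyOrography`, `ToyTransform`, `check_components`) are hypothetical; the explicit check at the end is just one possible way to recover the safety of the `<:` constraints and is not what SpeedyWeather does.

```julia
abstract type AbstractOrography end
struct ToyOrography <: AbstractOrography end
struct ToyTransform end      # deliberately NOT an orography

# Unconstrained parameter: any type is accepted at construction
struct LooseModel{OR}
    orography::OR
end

# Constrained parameter: the wrong type is rejected at construction
struct StrictModel{OR<:AbstractOrography}
    orography::OR
end

loose = LooseModel(ToyTransform())    # constructs without complaint

strict_rejected = try
    StrictModel(ToyTransform())
    false
catch err
    err isa MethodError               # no constructor method matches
end

# Hypothetical explicit check that could run later, e.g. at initialization
function check_components(model::LooseModel)
    model.orography isa AbstractOrography ||
        throw(ArgumentError("orography must be an AbstractOrography, got $(typeof(model.orography))"))
    return nothing
end
```

With the unconstrained version the mistake only surfaces when something like `check_components(loose)` runs, or when a later step calls a method that does not exist for the wrong type.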