CliMA / Oceananigans.jl

🌊 Julia software for fast, friendly, flexible, ocean-flavored fluid dynamics on CPUs and GPUs
https://clima.github.io/OceananigansDocumentation/stable
MIT License

Are we actually `@unroll`ing when we think we are? #3374

Closed glwagner closed 9 months ago

glwagner commented 1 year ago

On Julia 1.10, users are met with an avalanche (or rather, a tsunami) of warnings:

warning: /Users/gregorywagner/.julia/packages/KernelAbstractions/WoCk1/src/extras/loopinfo.jl:28:0: loop not unrolled: the optimizer was unable to perform the requested transformation; the transformation might be disabled or specified as part of an unsupported transformation ordering

For example, we try

https://github.com/CliMA/Oceananigans.jl/blob/7291ada057afc9cfcefb2b6e9351cff8782d9217/src/Solvers/batched_tridiagonal_solver.jl#L133-L148

but this loop probably can't be unrolled because Nx is a runtime value, not a compile-time constant.

I don't know if we ever @unroll properly...

Seems like the easiest thing is just to stop pretending that we @unroll.

@jlk9

navidcy commented 1 year ago

(Indeed "tsunami" is more appropriate here rather than "avalanche".)

navidcy commented 11 months ago

Hm... removing all unrolls from solve_batched_tridiagonal_system_kernel didn't heal the warnings...

glwagner commented 11 months ago

aren't there more than that

glwagner commented 11 months ago

don't remove them all because some could be legit

navidcy commented 11 months ago

Sure, I only removed them just to see if they were the culprit.

glwagner commented 11 months ago

That was just an example. There's a lot of erroneous usage.

This might help:

(base) gregorywagner:src/ (glw/fix-adapt) $ grep -r unroll ./*
./Advection/Advection.jl:using KernelAbstractions.Extras.LoopInfo: @unroll
./Advection/stretched_weno_smoothness.jl:        @unroll for j = 1:3
./Advection/stretched_weno_smoothness.jl:        @unroll for j = 1:3
./BoundaryConditions/fill_halo_regions_open.jl:# and need to unroll a loop over the boundary normal direction.
./BoundaryConditions/fill_halo_regions.jl:using KernelAbstractions.Extras.LoopInfo: @unroll
./BoundaryConditions/fill_halo_regions_periodic.jl:using KernelAbstractions.Extras.LoopInfo: @unroll
./BoundaryConditions/fill_halo_regions_periodic.jl:    @unroll for i = 1:H
./BoundaryConditions/fill_halo_regions_periodic.jl:    @unroll for j = 1:H
./BoundaryConditions/fill_halo_regions_periodic.jl:    @unroll for k = 1:H
./BoundaryConditions/fill_halo_regions_periodic.jl:    @unroll for n = 1:M
./BoundaryConditions/fill_halo_regions_periodic.jl:        @unroll for i = 1:H
./BoundaryConditions/fill_halo_regions_periodic.jl:    @unroll for n = 1:M
./BoundaryConditions/fill_halo_regions_periodic.jl:        @unroll for j = 1:H
./BoundaryConditions/fill_halo_regions_periodic.jl:    @unroll for n = 1:M
./BoundaryConditions/fill_halo_regions_periodic.jl:        @unroll for k = 1:H
./BoundaryConditions/fill_halo_regions_flux.jl:using KernelAbstractions.Extras.LoopInfo: @unroll
./Fields/regridding_fields.jl:using KernelAbstractions.Extras.LoopInfo: @unroll
./Fields/regridding_fields.jl:    @inbounds @unroll for k = 1:target_grid.Nz
./Fields/regridding_fields.jl:            @unroll for k_src = k₋_src:k₊_src-1
./Fields/regridding_fields.jl:    @inbounds @unroll for j = 1:target_grid.Ny
./Fields/regridding_fields.jl:            @unroll for j_src = j₋_src:j₊_src-1
./Fields/regridding_fields.jl:    @inbounds @unroll for i = 1:target_grid.Nx
./Fields/regridding_fields.jl:            @unroll for i_src = i₋_src:i₊_src-1
./Fields/field_boundary_buffers.jl:using KernelAbstractions.Extras.LoopInfo: @unroll
./Models/NonhydrostaticModels/update_hydrostatic_pressure.jl:    @unroll for k in grid.Nz-1 : -1 : 1
./Models/NonhydrostaticModels/NonhydrostaticModels.jl:using KernelAbstractions.Extras.LoopInfo: @unroll
./Models/ShallowWaterModels/store_shallow_water_tendencies.jl:    @unroll for t in 1:3
./Models/ShallowWaterModels/ShallowWaterModels.jl:using KernelAbstractions.Extras.LoopInfo: @unroll
./Models/HydrostaticFreeSurfaceModels/HydrostaticFreeSurfaceModels.jl:using KernelAbstractions.Extras.LoopInfo: @unroll
./Models/HydrostaticFreeSurfaceModels/compute_w_from_continuity.jl:    @unroll for k in 2:grid.Nz+1
./Models/HydrostaticFreeSurfaceModels/split_explicit_free_surface_kernels.jl:using KernelAbstractions.Extras.LoopInfo: @unroll
./Models/HydrostaticFreeSurfaceModels/split_explicit_free_surface_kernels.jl:    # hand unroll first loop
./Models/HydrostaticFreeSurfaceModels/split_explicit_free_surface_kernels.jl:    @unroll for k in 2:grid.Nz
./Models/HydrostaticFreeSurfaceModels/split_explicit_free_surface_kernels.jl:    # hand unroll first loop
./Models/HydrostaticFreeSurfaceModels/split_explicit_free_surface_kernels.jl:    @unroll for k in 2:grid.Nz
./Solvers/batched_tridiagonal_solver.jl:        @unroll for i = 2:Nx
./Solvers/batched_tridiagonal_solver.jl:        @unroll for i = Nx-1:-1:1
./Solvers/batched_tridiagonal_solver.jl:        @unroll for j = 2:Ny
./Solvers/batched_tridiagonal_solver.jl:        @unroll for j = Ny-1:-1:1
./Solvers/batched_tridiagonal_solver.jl:        @unroll for k = 2:Nz
./Solvers/batched_tridiagonal_solver.jl:        @unroll for k = Nz-1:-1:1
./Solvers/Solvers.jl:using KernelAbstractions.Extras.LoopInfo: @unroll
./Solvers/fourier_tridiagonal_poisson_solver.jl:    @unroll for i in 2:Nx-1
./Solvers/fourier_tridiagonal_poisson_solver.jl:    @unroll for j in 2:Ny-1
./Solvers/fourier_tridiagonal_poisson_solver.jl:    @unroll for k in 2:Nz-1

glwagner commented 11 months ago

Anywhere the loop limits are not encoded in types, it's wrong. @unroll only works if the limits are known via types (so they are known at compile time rather than at runtime). Typically this requires using Val{N} or Val{H}, but even then it can fail sometimes.

navidcy commented 11 months ago

I'm not sure I understand what you mean here. Can you give an example that's OK and one that's not? Also, @unroll comes from KernelAbstractions.Extras.LoopInfo.@unroll, right? The docstring is not really helping me on this:

help?> KernelAbstractions.Extras.LoopInfo.@unroll
  @unroll expr

  Takes a for loop as expr and informs the LLVM unroller to fully unroll it, if it is safe to do so and the loop count is known.

  ──────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────

  @unroll N expr

  Takes a for loop as expr and informs the LLVM unroller to unroll it N times, if it is safe to do so.

In particular, I don't know what "if it is safe to do so" refers to.

navidcy commented 11 months ago

Do you simply mean that

@unroll for j in 1:4; do_this(); end

is OK but

N=4
@unroll for j in 1:N; do_this(); end

is not?

glwagner commented 11 months ago

Both are fine as written, because even in the second case the compiler is able to infer that N is always 4. But @unroll for i = 1:grid.Nx is not fine, because grid.Nx is not known at compile time: it is passed into the function as a property of the grid. At compile time, only the type of the grid is known, not the values it contains.

If one is careful, the loop limits can be passed into a function as compile-time information. Typically this is done with objects like Val(N), which have type Val{N}. Since N is then type information, it is known to the compiler.
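To make the value-versus-type distinction concrete, here is a minimal sketch using only Base Julia. `ToyGrid`, `ToyStaticGrid`, and `static_size` are hypothetical names made up for illustration; they are not Oceananigans types or functions.

```julia
# Hypothetical toy types for illustration; not Oceananigans' actual grid.
struct ToyGrid
    Nx::Int                    # stored in a field: a runtime value
end

struct ToyStaticGrid{Nx} end   # stored in a type parameter: compile-time information

# Two grids of different sizes share one type, so code compiled for
# `ToyGrid` cannot assume any particular value of `Nx`.
same_type = typeof(ToyGrid(8)) === typeof(ToyGrid(16))

# With a type parameter, each size is a distinct type.
distinct_types = typeof(ToyStaticGrid{8}()) !== typeof(ToyStaticGrid{16}())

# `Val(N)` works the same way: the value lives in the type.
val_type = typeof(Val(4))

# The size can be recovered from the type alone, without reading any field.
static_size(::ToyStaticGrid{Nx}) where {Nx} = Nx
```

A method specialized on `ToyStaticGrid{8}` or `Val{4}` sees the loop limit as a constant, which is the precondition for unrolling discussed above. It does not guarantee that the optimizer will actually unroll.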

glwagner commented 11 months ago

Can you give an example that's OK and one that's not?

The example that is OK is when the loop limit N is passed in via an argument of type Val{N}. Then N is known to the compiler. This is what I tried to indicate; sorry for not being clear.

navidcy commented 10 months ago

Gotcha!

glwagner commented 10 months ago

Here's the basic structure

Not ok because loop limits are runtime values:

function loop(N)
    @unroll for i = 1:N
        # etc.
    end
end

Maybe ok because, in principle, the loop limit is encoded in type information:

function loop(::Val{N}) where N
    @unroll for i = 1:N
        # etc.
    end
end

The second case is called with loop(Val(N)).
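The two cases can be fleshed out into something runnable. This is only a sketch: the function names are hypothetical, and `@unroll` is left out so that the code needs nothing beyond Base Julia. In a kernel, the macro would sit directly in front of each `for`.

```julia
# Hypothetical function names; `@unroll` is omitted so only Base is needed.

# Runtime limit: one compiled method serves every N,
# so the trip count is unknown when the method is compiled.
function sum_first_runtime(x, N::Int)
    s = zero(eltype(x))
    for i = 1:N
        s += x[i]
    end
    return s
end

# Type-level limit: a separate method instance is compiled for each N,
# and within each instance the trip count is a constant.
function sum_first_static(x, ::Val{N}) where N
    s = zero(eltype(x))
    for i = 1:N
        s += x[i]
    end
    return s
end

x = [1.0, 2.0, 3.0, 4.0]
r = sum_first_runtime(x, 3)       # limit passed as a value
s = sum_first_static(x, Val(3))   # limit passed as a type
```

Both calls return the same sum; the difference is only in what the compiler knows. One caveat: building `Val(n)` from an `n` that is itself a runtime value moves the cost into dynamic dispatch, so the `Val` is best constructed where the limit is already a constant.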

navidcy commented 9 months ago

See https://github.com/CliMA/Oceananigans.jl/pull/3403#issuecomment-1927360091 regarding https://github.com/cstjean/Unrolled.jl