The-ReSolver / Fields.jl


Use explicit loops #14

Closed gasagna closed 2 years ago

gasagna commented 2 years ago

https://github.com/The-ReSolver/Fields.jl/blob/70e78c98be6fb942e7cefb6b75d85d47981708e9/src/derivatives.jl#L61

Use three nested loops and add @inbounds to the outermost loop. Same for the time derivatives.

Might also consider using @turbo from LoopVectorization.jl.

tb6g16 commented 2 years ago

Replaced the views with explicit loops wrapped in @inbounds, as follows:

@inbounds begin
    for nt in 1:Nt, nz in 1:((Nz >> 1) + 1), ny in 1:Ny
        dudz[ny, nz, nt] = (1im*(nz - 1)*β)*u[ny, nz, nt]
    end
end

Timings improved from

206.303 μs (0 allocations: 0 bytes)
140.275 μs (0 allocations: 0 bytes)
190.458 μs (0 allocations: 0 bytes)

to

149.189 μs (0 allocations: 0 bytes)
122.353 μs (0 allocations: 0 bytes)
149.763 μs (0 allocations: 0 bytes)

I'll keep this issue open for now to deal with @turbo at a later date.
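For anyone wanting to try the loop in isolation, here is a minimal self-contained sketch. The function name `ddz!`, the array sizes, and the value of β are assumptions made for illustration and are not taken from the package. The array is assumed to be laid out as (Ny, (Nz >> 1) + 1, Nt), so `ny` is the innermost loop index, matching Julia's column-major storage.

```julia
# Hypothetical stand-alone version of the spanwise derivative loop.
# The name `ddz!` and the argument list are assumptions, not the package API.
function ddz!(dudz::AbstractArray{ComplexF64,3}, u::AbstractArray{ComplexF64,3}, β::Real)
    # Check sizes up front so that skipping bounds checks below is safe
    size(dudz) == size(u) || throw(DimensionMismatch("dudz and u must have the same size"))
    Ny, Nzh, Nt = size(u)   # Nzh plays the role of (Nz >> 1) + 1

    # nt is the outermost loop and ny the innermost (fastest-varying) one
    @inbounds for nt in 1:Nt, nz in 1:Nzh, ny in 1:Ny
        dudz[ny, nz, nt] = (1im*(nz - 1)*β)*u[ny, nz, nt]
    end
    return dudz
end

# Small made-up problem size, for checking only
Ny, Nz, Nt = 4, 6, 3
β = 2.0
u = rand(ComplexF64, Ny, (Nz >> 1) + 1, Nt)
dudz = ddz!(similar(u), u, β)

# Reference result computed with broadcasting, for comparison only
ks = reshape(0:(Nz >> 1), 1, :, 1)
reference = (1im .* ks .* β) .* u
```

The explicit size check is there because @inbounds removes bounds checking inside the loop; without it, mismatched arrays could be read or written out of bounds silently.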

gasagna commented 2 years ago

I have tried LoopVectorization.@turbo, but it cannot parse the for loop. I suppose this issue can be closed.