CliMA / Oceananigans.jl

🌊 Julia software for fast, friendly, flexible, ocean-flavored fluid dynamics on CPUs and GPUs
https://clima.github.io/OceananigansDocumentation/stable
MIT License
963 stars 190 forks source link

Various `FieldTimeSeries` bugs #3144

Open jagoosw opened 1 year ago

jagoosw commented 1 year ago

I've been using FieldTimeSeries a lot over the last few weeks and have noticed a couple of issues calculating means and constructing operations on them.

First, mean with dims only calculates it on the first time index (I think). For example, if I do mean(f, dims = (1, 2, 3)) it returns a size [1, 1, 1] array vs calculating mean(interior(f), dims = (1, 2, 3)) which returns a [1, 1, 1, Nt] array. I suspect it is also calculating means with no dimensions specified incorrectly.

Second, constructing operators on field time series fails and returns operators with dimensions [Nx, Ny, Nz] which if you then try and index into fails as it tries to index into the underlying fields which have a time dimension. For example, speed = √(u^2 + v^2 + w^2) gives speed with size [Nx, Ny, Nz], if I then try and index at [1, 1, 1] it throws a bounds error trying to access Nx x Ny x Nz x Nt array at 1, 1, 1.

Neither of these is particularly important but thought I would document in case anyone else has issues, and because I will try and fix them at some point.

For reference, I am currently getting around this by calculating means on the interior (which I think would produce the wrong results with immersed boundaries because it wouldn't have the masking step), and by just calculating arrays like speed = √(interior(u)^2 + interior(v)^2 + interior(w)^2) which doesn't lose me too much performance since I end up indexing into the whole array anyway.

### Tasks
glwagner commented 1 year ago

Thanks for bringing these up, I've also noticed a number of issues. Can you reformat your issue into a bulleted list? Maybe we can add more items.

glwagner commented 1 year ago

The implementation of mean is naive. Even if it does work for dims=: it is slow and should be improved (EDIT: there's a bit more wrong with it, see below)

https://github.com/CliMA/Oceananigans.jl/blob/19dac0b8f6b621057e250ecf0510b3c4c91915e9/src/OutputReaders/field_time_series.jl#L276-L295

I don't think this is only using the first time index as you suggest. But it is not right because it's always assuming you want to compute the mean across time. Heh.

The broader issue is that we don't support calculus and reductions in time and for example there is no ∂t operator.

What do you think we should do? Should we interpret mean(fts, dims=(1, 2, 3) as asking for a volume mean at every time, and thus return a [1, 1, 1, Nt] array?

We need to compute the proper mean in complex domains / immersed boundaries. interior(fts) doesn't do the right thing in that case.

ali-ramadhan commented 2 weeks ago

I know it's documented in the code but also: https://github.com/CliMA/Oceananigans.jl/blob/bf04295910ef02ca3d4105bee801070deef99175/src/OutputReaders/field_time_series_reductions.jl#L17

Would something simple like this work?

function reduce_4d(spatial_reduction::Function, temporal_reduction::Function, f::Function, fts::FTS4D; dims, kw...)
    reduced_snapshots = [
        spatial_reduction(f, fts[n]; dims=filter(d -> d != 4, dims), kw...)
        for n in 1:length(fts.times)
    ]
    return temporal_reduction(reduced_snapshots)
end

I guess we might want a similar interface to the 3D reductions but this should work for all the reductions in field_time_series.jl: (:sum, :maximum, :minimum, :all, :any, :prod).

It wouldn't work for things like median or quantile which need to act on the entire 4D array at once, but maybe these aren't actually reductions?

glwagner commented 2 weeks ago

Perhaps I should know this but is there a simple summary of why supporting reductions across the 4th dimension are hard?

ali-ramadhan commented 2 weeks ago

I don't know either lol, but you wrote that comment.

I think if you treat all the dimensions the same then 4D reductions aren't any different from 3D reductions. But I guess reductions on a FieldTimeSeries call reductions on Field which may assume 3D data?

In our case since the 4D data is 3 spatial dimensions and 1 temporal dimension we may like to apply a different spatial reduction and a different temporal reduction, but I don't think this is answering your question.

glwagner commented 2 weeks ago

Hmm... I guess reductions in the time-dimension should actually return a Field rather than a FieldTimeSeries. That's different than how reductions for Field work, which also return Field but with a "reduced location", eg called here

https://github.com/CliMA/Oceananigans.jl/blob/3c95e7ee6772a2597e206ae35202500be1ef5b32/src/Fields/field.jl#L710

and defined simply as

https://github.com/CliMA/Oceananigans.jl/blob/3c95e7ee6772a2597e206ae35202500be1ef5b32/src/Fields/field.jl#L629-L635

So maybe there is no special challenge except writing code similar to what we have for Field that's specialized for the time-dimension.

Reductions over time + other dimensions might work then, if the further reduction is done over the additional dimensions after the special case is handled that converts from FieldTimeSeries to Field.

Fields also support "conditioning" the operation (which mostly would probably be used to mask immersed areas). But we won't have a "time mask", so I think possibly that doesn't pose any particular problems.

Ok, then if we want to support reductions of FieldTimeSeries that dont act on the time dimension, that's where we basically have to replicate everything we have for Field I guess. For example if you average a FieldTimeSeries in x then you get a FieldTimeSeries back. That's a bit more annoying but still possible...