`FieldTimeSeries` support for `NetCDF`

CliMA / Oceananigans.jl

`FieldTimeSeries` support for `NetCDF` #3935

Open tomchor opened 1 week ago

tomchor commented 1 week ago

@ali-ramadhan and I are interested in expanding FieldTimeSeries to support NetCDF.

Neither of us is super familiar with FTS, but it seems like, for the most part, all we would need is to figure out a way to reconstruct grids based on NetCDF output and figure out how to deal with memory access if the data is PartlyInMemory. Is that correct, or are there other items that we should pay attention to?

For the former, we can output the relevant grid parameters to file and reconstruct the grid the way it's already done for JLD2 whenever the grid can't be restored directly: https://github.com/CliMA/Oceananigans.jl/blob/82ad840f2b7e03d9b65cf004a216da76a9ec173d/src/OutputReaders/field_time_series.jl#L489-L529

We were also wondering whether this is worth doing in one PR or a few small ones.

cc @glwagner

glwagner commented 1 week ago

For a very first PR, it is only necessary to figure out how to reconstruct the grid and location. We need the grid and location in order to do any meaningful post-processing, such as computing derivatives, integrals, etc., with the saved data. If indices are not supported, we can still build 3D data. Making it useful in the long term will probably require supporting indices though. Boundary conditions are less important and can be put on a longer-term TODO list.

I would focus completely on one backend, probably InMemory. We may want to refactor the backends a little bit. For example, partly in memory should be supported automatically if we do this right.

glwagner commented 1 week ago

Small PRs are always better because they lead to better design and easier review.

ali-ramadhan commented 6 days ago

Maybe a good starting point would be to add enough functionality so that a RectilinearGrid can be reconstructed? And tests for the reconstruction of a simple FieldTimeSeries. This could be checked pretty thoroughly with some unit tests.

> Small PRs are always better because they lead to better design and easier review.

Definitely agree! I might instead say that a PR should have one clear purpose. Perhaps the narrower the better. I suppose even narrowly-scoped PRs can grow quite large though.

glwagner commented 6 days ago

> Maybe a good starting point would be to add enough functionality so that a RectilinearGrid can be reconstructed?

That works! But don't reinvent the wheel or implement something too specific. I would start by looking at the existing functionality for reconstructing grids, e.g.:

https://github.com/CliMA/Oceananigans.jl/blob/82ad840f2b7e03d9b65cf004a216da76a9ec173d/src/Grids/rectilinear_grid.jl#L374-L394

I think you have to save the topology and location as strings. Does NetCDF support tuples?
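One way to read the suggestion above is to store the topology as a single string attribute and map it back to types on load. The following is a minimal sketch under stated assumptions: `Periodic`, `Bounded`, and `Flat` are defined here as stand-ins so the snippet runs without Oceananigans, and `TOPOLOGY_TYPES`, `encode_topology`, and `decode_topology` are hypothetical names, not existing API. The comma-separated format is likewise just one possible choice.

```julia
# Stand-in topology types for illustration only; the real ones live in Oceananigans.
struct Periodic end
struct Bounded end
struct Flat end

# Hypothetical lookup table from saved names back to types.
const TOPOLOGY_TYPES = Dict("Periodic" => Periodic,
                            "Bounded"  => Bounded,
                            "Flat"     => Flat)

# Encode a tuple of topology types as one comma-separated string,
# which is a form that can be written as a plain text attribute.
encode_topology(topology) = join(string.(nameof.(topology)), ",")

# Decode the string back into a tuple of types.
decode_topology(str) = Tuple(TOPOLOGY_TYPES[String(name)] for name in split(str, ","))

topology = (Periodic, Bounded, Flat)
saved    = encode_topology(topology)   # "Periodic,Bounded,Flat"
restored = decode_topology(saved)      # (Periodic, Bounded, Flat)
```

The same approach would presumably work for a field's location, with a second lookup table for the location types.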

glwagner commented 6 days ago

You could have `constructor_arguments(filename::String)` and then something like

```julia
function reconstruct_grid(filename)
    args, kwargs = constructor_arguments(filename)
    GridType = get_grid_type(filename)
    arch = args[:architecture]
    FT = args[:number_type]
    return GridType(arch, FT; kwargs...)
end
```

I think that will work for lat-lon and rectilinear grids?

For OrthogonalSphericalShellGrid you probably just have to save all the metrics; I'm not sure there's a way around that.
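To make the `reconstruct_grid` pattern concrete, here is a self-contained toy version. Everything in it is an assumption made for illustration: `ToyGrid` is a stand-in rather than an Oceananigans grid, a `Dict` of attributes stands in for the NetCDF file, and `GRID_TYPES` and `NUMBER_TYPES` are hypothetical lookup tables. It only shows how saved constructor arguments could be turned back into a constructor call.

```julia
# Toy stand-in for a grid type; NOT the Oceananigans RectilinearGrid.
struct ToyGrid{FT}
    architecture :: Symbol
    size :: NTuple{3, Int}
    extent :: NTuple{3, FT}
end

# Constructor with the same shape as the call in the thread: GridType(arch, FT; kwargs...)
ToyGrid(arch, FT; size, extent) = ToyGrid{FT}(arch, size, FT.(extent))

# Hypothetical lookup tables from saved strings back to types.
const GRID_TYPES   = Dict("ToyGrid" => ToyGrid)
const NUMBER_TYPES = Dict("Float32" => Float32, "Float64" => Float64)

# Stand-in for attributes read from a file: only strings and numeric arrays.
attributes = Dict{String, Any}("grid_type"    => "ToyGrid",
                               "architecture" => "CPU",
                               "number_type"  => "Float64",
                               "size"         => [16, 16, 8],
                               "extent"       => [1.0, 1.0, 0.5])

# Split the saved attributes into positional arguments and keyword arguments.
function constructor_arguments(attrs)
    args = (architecture = Symbol(attrs["architecture"]),
            number_type  = NUMBER_TYPES[attrs["number_type"]])
    kwargs = (size = Tuple(attrs["size"]), extent = Tuple(attrs["extent"]))
    return args, kwargs
end

get_grid_type(attrs) = GRID_TYPES[attrs["grid_type"]]

function reconstruct_grid(attrs)
    args, kwargs = constructor_arguments(attrs)
    GridType = get_grid_type(attrs)
    return GridType(args[:architecture], args[:number_type]; kwargs...)
end

grid = reconstruct_grid(attributes)
```

Because the grid is rebuilt through its constructor, any derived quantities are recomputed rather than read back from disk.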

glwagner commented 6 days ago

This is cleaner / nicer than saving all the metrics and loading them back, because you might end up with loaded grids that don't match the original (e.g., ranges get converted to arrays?)
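The range-versus-array concern can be seen in plain Julia, independent of any grid code. In this small sketch the variable names are illustrative, and it assumes the saved coordinates would come back as an ordinary `Vector`:

```julia
x_range = range(0, 1, length=5)   # regularly spaced coordinates as a range
x_saved = collect(x_range)        # the same values as a plain Vector

same_values   = x_range == x_saved            # true: the numbers agree
range_kept    = x_saved isa AbstractRange     # false: the range structure is gone
original_kind = x_range isa AbstractRange     # true
```

The values survive the round trip, but code that dispatches on the coordinate type could treat the two differently, which is the kind of mismatch the comment above warns about.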