CliMA / Oceananigans.jl

🌊 Julia software for fast, friendly, flexible, ocean-flavored fluid dynamics on CPUs and GPUs
https://clima.github.io/OceananigansDocumentation/stable
MIT License

`outputinfo(filename)` for inspecting output without loading data #3859

Open glwagner opened 1 week ago

glwagner commented 1 week ago

Another utility that I believe is needed is a function that displays the information in an output file. For example, something like

julia> outputinfo(filename)

which displays things like

anything else?
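A minimal sketch of what such a utility might look like, for discussion only. `outputinfo_sketch` is a hypothetical name, not an existing Oceananigans function, and the sketch assumes the file stores its data under `timeseries/<name>/<iteration>` with output times under `timeseries/t/<iteration>`; that layout is an assumption here and should be checked against the actual output writer. It lists group and field names from the file's keys and reads only the scalar times, so no field arrays are loaded.

```julia
using JLD2

# Hypothetical helper (not part of Oceananigans): summarize a JLD2 output file
# without loading any field data.
# Assumed layout: "timeseries/<name>/<iteration>" and "timeseries/t/<iteration>".
function outputinfo_sketch(filename)
    jldopen(filename, "r") do file
        println("Output file: ", filename)
        println("Top-level groups: ", join(keys(file), ", "))

        # Nothing more to report if the assumed "timeseries" group is absent
        haskey(file, "timeseries") || return (fields = String[], times = [])
        timeseries = file["timeseries"]

        # Every entry other than "t" is treated as an output field
        fields = sort([name for name in keys(timeseries) if name != "t"])
        for name in fields
            # Count saved iterations from the keys alone; skip a "serialized" group if present
            saved = [k for k in keys(timeseries[name]) if k != "serialized"]
            println("  ", name, ": ", length(saved), " saved iterations")
        end

        # Times are scalars, so reading them is cheap
        times = []
        if haskey(timeseries, "t")
            iterations = sort(parse.(Int, keys(timeseries["t"])))
            times = [file["timeseries/t/$i"] for i in iterations]
            isempty(times) || println("  times: ", first(times), " to ", last(times))
        end

        return (fields = fields, times = times)
    end
end
```

Returning a named tuple as well as printing would let other tools reuse the same information; grid details could be added the same way if the file stores them.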

Originally posted by @glwagner in https://github.com/CliMA/Oceananigans.jl/issues/3793#issuecomment-2395088313

glwagner commented 1 week ago

@ali-ramadhan you mentioned that the show method for FieldDataset has something similar. What does this look like?

navidcy commented 1 week ago

Suggestion: if the loaded file is, e.g., a _part1.jld2 file, it would be useful to report that, along with whether this is part 1 out of X?

glwagner commented 1 week ago

Yes, though I don't completely understand: if the user writes "part 1", would they not know it's part 1?

Maybe this should work if "part1" is not provided? It would also be nice, separately, if FieldTimeSeries could construct a continuous time series.

navidcy commented 1 week ago

Hm... true. Perhaps this is a different discussion, but when output is split across a series of files, we probably want the output to be addressable by a name without any _partX ending? (It's definitely a different discussion, and when that's settled we can revisit outputinfo(filename)! What you suggested above sounds good!)

ali-ramadhan commented 1 week ago

@glwagner It's this show method: https://github.com/CliMA/Oceananigans.jl/blob/03b8acf4f378eeefdb5e79ceeafcf29fa711e94c/src/OutputReaders/field_dataset.jl#L61-L72

and looks something like this:

FieldDataset with 9 fields and 0 metadata entries:
├── v: 865×421×1×5761 FieldTimeSeries{OnDisk} located at (Center, Face, Center) of v at /home/alir/test/simulation_surface_slices.jld2
├── S: 865×420×1×5761 FieldTimeSeries{OnDisk} located at (Center, Center, Center) of S at /home/alir/test/simulation_surface_slices.jld2
├── w: 865×420×1×5761 FieldTimeSeries{OnDisk} located at (Center, Center, Face) of w at /home/alir/test/simulation_surface_slices.jld2
├── T: 865×420×1×5761 FieldTimeSeries{OnDisk} located at (Center, Center, Center) of T at /home/alir/test/simulation_surface_slices.jld2
├── Alk: 865×420×1×5761 FieldTimeSeries{OnDisk} located at (Center, Center, Center) of Alk at /home/alir/test/simulation_surface_slices.jld2
├── DIC: 865×420×1×5761 FieldTimeSeries{OnDisk} located at (Center, Center, Center) of DIC at /home/alir/test/simulation_surface_slices.jld2
├── u: 866×420×1×5761 FieldTimeSeries{OnDisk} located at (Face, Center, Center) of u at /home/alir/test/simulation_surface_slices.jld2
├── pCO₂: 865×420×1×5761 FieldTimeSeries{OnDisk} located at (Center, Center, Center) of pCO₂ at /home/alir/test/simulation_surface_slices.jld2
└── CO₂_surface_flux: 865×420×1×5761 FieldTimeSeries{OnDisk} located at (Center, Center, ⋅) of CO₂_surface_flux at /home/alir/test/simulation_surface_slices.jld2

so it's missing the output times and perhaps more grid information. But in general it could be improved. I just ended up calling summary(fts) for each FieldTimeSeries.

glwagner commented 1 week ago

Is FieldDataset designed so that every field should have the same times, grid, etc? (I guess we can ask the same question for a JLD2 file.) It'd be nice to be able to assume this.

ali-ramadhan commented 1 week ago

It was indeed designed just to provide convenient access to all FieldTimeSeries from one JLD2 file.

This assumption could of course be relaxed as FieldDataset doesn't rely on grid or times, but this is how I've been using it. Same with JLD2 files for me. Surface fields get one file, 3D fields another, zonal slices another, etc.