Closed Balinus closed 6 months ago
You can use the Not
selector with a vector of the leap days either with an inner At or Near selector.
julia> winter = r1[Ti=Not(At([DateTime(1995, 4,16),DateTime(1995,10,16,12)]))]
julia> typeof(winter.data)
DiskArrays.SubDiskArray{Union{Missing, Float32}, 3}
You have to be careful to match the Date or DateTime type with the type of your time axis. This subsetting will give you a lazy SubDiskArray which handles the subsetting via the metadata without having to load the data to memory.
Thanks! I was about to write a potentiel solution that seems to work no matter the type of Date/DateTime
ref_tasmax[Time=Dates.monthday.(lookup(ref_tasmax, :Ti)) .!= [(2,29)]]
Seems to be lazy and worked with dataset with regular Datetime and DateTimeNoLeap calendars. Then, I will need to replace the axis with a new type (e.g. from DateTime
to DateTimeNoLeap
). I guess using the rebuild
function?
Does this code make sense?
EDIT
Here's what I got, seems to work, not sure about lazyness, but it seems right when I look at @time
ouputs.
using YAXArrays
using CFTime
function drop29thfeb(ds)
ds_subset = ds[Time=Dates.monthday.(lookup(ds, :Ti)) .!= [(2,29)]]
date_vec = lookup(ds_subset, :Ti)
# New time vector
datevec_noleap = CFTime.reinterpret.(DateTimeNoLeap, date_vec)
axlist = (
Dim{:lon}(lookup(ds_subset,:lon)),
Dim{:lat}(lookup(ds_subset,:lat)),
Dim{:Time}(datevec_noleap),
)
return YAXArray(axlist, ds_subset.data)
end
That looks good. You could replace your subsetting with a Where selector like this:
ds[Ti=Where(x-> Dates.monthday(x) != (2,29))]
but I am not sure whether there is much of a difference. I think there might also a way to only rebuild the time axis and leave the other axes as is. But this is some DimensionalData function that I would also need to lookup. You might want to not rebuild the lon and lat axis like that but use `dims(ds_subset, :lon) instead this way you get the whole axis with all of the correct reversings and so on.
Could you add this example to the docs. That would be really nice.
That looks good. You could replace your subsetting with a Where selector like this:
ds[Ti=Where(x-> Dates.monthday(x) != (2,29))]
but I am not sure whether there is much of a difference. I think there might also a way to only rebuild the time axis and leave the other axes as is. But this is some DimensionalData function that I would also need to lookup. You might want to not rebuild the lon and lat axis like that but use `dims(ds_subset, :lon) instead this way you get the whole axis with all of the correct reversings and so on.
Could you add this example to the docs. That would be really nice.
Thanks! Yes, I will include it. Somewhere specific in the documentation?
Also, I'm unsure how the "Time" dimension (Ti
?) is handled. It seems to be a "special" dimensions and when I rebuild the time dimension now behaves as a regular dimension. Sometimes, the time dimension is accessible with Ti
, sometimes not.
For example:
using Zarr, YAXArrays, Dates, DimensionalData
store = "gs://cmip6/CMIP6/ScenarioMIP/DKRZ/MPI-ESM1-2-HR/ssp585/r1i1p1f1/3hr/tas/gn/v20190710/"
g = open_dataset(zopen(store, consolidated=true))
YAXArray Dataset
Shared Axes:
()
Variables:
height,
tas
with dimensions:
Dim{:lon} Sampled{Float64} 0.0:0.9375:359.0625 ForwardOrdered Regular Points,
Dim{:lat} Sampled{Float64} Float64[-89.28422753251364, -88.35700351866494, …, 88.35700351866494, 89.28422753251364] ForwardOrdered Irregular Points,
Ti Sampled{DateTime} DateTime[2015-01-01T03:00:00, …, 2101-01-01T00:00:00] ForwardOrdered Irregular Points
Properties: Dict{String, Any}("initialization_index" => 1, "realm" => "atmos", "variable_id" => "tas", "external_variables" => "areacella", "branch_time_in_child" => 60265.0, "data_specs_version" => "01.00.30", "history" => "2019-07-21T06:26:13Z ; CMOR rewrote data to be consistent with CMIP6, CF-1.7 CMIP-6.2 and CF standards.", "forcing_index" => 1, "parent_variant_label" => "r1i1p1f1", "table_id" => "3hr"…)
The printed dataset suggest using Ti
. But:
g.Ti
ArgumentError: Ti not found in Dataset
Stacktrace:
[1] getindex(x::Dataset, i::Symbol)
@ YAXArrays.Datasets ~/.julia/packages/YAXArrays/no0he/src/DatasetAPI/Datasets.jl:144
[2] getproperty(x::Dataset, k::Symbol)
@ YAXArrays.Datasets ~/.julia/packages/YAXArrays/no0he/src/DatasetAPI/Datasets.jl:141
[3] top-level scope
@ In[6]:1
while using time
works:
g.time
Ti DateTime[2015-01-01T03:00:00, …, 2101-01-01T00:00:00]
However, using the lookup
method, the inverse is found:
lookup(Cube(g), :Ti)
Sampled{DateTime} ForwardOrdered Irregular DimensionalData.Dimensions.LookupArrays.Points
wrapping: 251288-element Vector{DateTime}:
2015-01-01T03:00:00
2015-01-01T06:00:00
2015-01-01T09:00:00
2015-01-01T12:00:00
⋮
2100-12-31T15:00:00
2100-12-31T18:00:00
2100-12-31T21:00:00
2101-01-01T00:00:00
lookup(Cube(g), :time)
MethodError: no method matching lookup(::Nothing)
Closest candidates are:
lookup(::Union{Val{<:DimensionalData.Dimensions.Dimension}, Type{<:DimensionalData.Dimensions.Dimension}})
@ DimensionalData ~/.julia/packages/DimensionalData/R7veM/src/Dimensions/dimension.jl:176
lookup(::DimensionalData.Dimensions.AnonDim)
@ DimensionalData ~/.julia/packages/DimensionalData/R7veM/src/Dimensions/dimension.jl:383
lookup(::DimensionalData.Dimensions.Dimension{<:AbstractArray})
@ DimensionalData ~/.julia/packages/DimensionalData/R7veM/src/Dimensions/dimension.jl:175
...
Stacktrace:
[1] lookup(ds::Tuple{Dim{:lon, DimensionalData.Dimensions.LookupArrays.Sampled{Float64, StepRangeLen{Float64, Base.TwicePrecision{Float64}, Base.TwicePrecision{Float64}, Int64}, DimensionalData.Dimensions.LookupArrays.ForwardOrdered, DimensionalData.Dimensions.LookupArrays.Regular{Float64}, DimensionalData.Dimensions.LookupArrays.Points, DimensionalData.Dimensions.LookupArrays.NoMetadata}}, Dim{:lat, DimensionalData.Dimensions.LookupArrays.Sampled{Float64, Vector{Float64}, DimensionalData.Dimensions.LookupArrays.ForwardOrdered, DimensionalData.Dimensions.LookupArrays.Irregular{Tuple{Nothing, Nothing}}, DimensionalData.Dimensions.LookupArrays.Points, DimensionalData.Dimensions.LookupArrays.NoMetadata}}, Ti{DimensionalData.Dimensions.LookupArrays.Sampled{DateTime, Vector{DateTime}, DimensionalData.Dimensions.LookupArrays.ForwardOrdered, DimensionalData.Dimensions.LookupArrays.Irregular{Tuple{Nothing, Nothing}}, DimensionalData.Dimensions.LookupArrays.Points, DimensionalData.Dimensions.LookupArrays.NoMetadata}}}, I::Symbol)
@ DimensionalData.Dimensions ~/.julia/packages/DimensionalData/R7veM/src/Dimensions/dimension.jl:233
[2] lookup(A::YAXArray{Float32, 3, ZArray{Float32, 3, Zarr.BloscCompressor, GCStore}, Tuple{Dim{:lon, DimensionalData.Dimensions.LookupArrays.Sampled{Float64, StepRangeLen{Float64, Base.TwicePrecision{Float64}, Base.TwicePrecision{Float64}, Int64}, DimensionalData.Dimensions.LookupArrays.ForwardOrdered, DimensionalData.Dimensions.LookupArrays.Regular{Float64}, DimensionalData.Dimensions.LookupArrays.Points, DimensionalData.Dimensions.LookupArrays.NoMetadata}}, Dim{:lat, DimensionalData.Dimensions.LookupArrays.Sampled{Float64, Vector{Float64}, DimensionalData.Dimensions.LookupArrays.ForwardOrdered, DimensionalData.Dimensions.LookupArrays.Irregular{Tuple{Nothing, Nothing}}, DimensionalData.Dimensions.LookupArrays.Points, DimensionalData.Dimensions.LookupArrays.NoMetadata}}, Ti{DimensionalData.Dimensions.LookupArrays.Sampled{DateTime, Vector{DateTime}, DimensionalData.Dimensions.LookupArrays.ForwardOrdered, DimensionalData.Dimensions.LookupArrays.Irregular{Tuple{Nothing, Nothing}}, DimensionalData.Dimensions.LookupArrays.Points, DimensionalData.Dimensions.LookupArrays.NoMetadata}}}}, args::Symbol)
@ DimensionalData ~/.julia/packages/DimensionalData/R7veM/src/array/array.jl:62
[3] top-level scope
@ In[16]:1
Hello!
I'd like to convert a given dataset from a "regular" calendar to a DateTimeNoLeap calendar (and dropping all 29th Feb data along the way).
Is there a way to do it? I tried to find a solution, but they all involved loading the data in-memory, checking datetime vector, dropping the indexes associated with Feb 29th and then rebuilding a Dataset. I'd like it to be lazy because most of the datasets are in the 20GB range.
Thanks for any pointer!