Closed gdkrmr closed 1 year ago
Yes, this is also something that I observe from time to time. Related issue: https://github.com/JuliaDataCubes/YAXArrays.jl/issues/47
In your example, Cube only keeps the variables that share the same dimensions, which makes sense (@meggart?). The others are discarded. The way to open this file is via open_dataset, as in
g = open_dataset(zopen(s3path, consolidated=true, fill_as_missing=false))
and this one contains all the information.
The issue is a change in eltype: some of the datasets have an offset and scale factor and get wrapped into a DiskArrayTools.CFDiskArray, which changes the eltype from Float32 to Float64. Details in meggart/DiskArrayTools.jl#15 and meggart/DiskArrayTools.jl#16.
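The eltype change described above can be reproduced in plain Julia, without any of the packages involved. This is only a minimal sketch of the promotion effect, not of how CFDiskArray is implemented; the array values and the names raw, scale_factor and add_offset are made up for illustration, and it assumes the packing attributes are stored as Float64.

```julia
# Hypothetical packed data and CF-style attributes (not taken from the ESDC).
raw = Float32[1.0, 2.0, 3.0]
scale_factor = 0.5    # assumed to be a Float64 attribute
add_offset = 10.0     # assumed to be a Float64 attribute

# Applying a Float64 scale and offset to Float32 data promotes the result.
unpacked = raw .* scale_factor .+ add_offset

println(eltype(raw))
println(eltype(unpacked))
println(promote_type(Float32, Float64))
```

Variables that go through this unpacking end up with a different eltype than the untouched ones, which is enough for them to no longer fit into a single array with one common eltype.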
I have just checked and Cube is not fixed yet.
fixed now ;-)
For your cube I still get a difference of 9 (lon, lat, and time are axes, so those should not count):
using DiskArrayTools, YAXArrays, Zarr
s3path = "http://data.rsc4earth.de:9000/earthsystemdatacube/v3.0.1/esdc-8d-0.25deg-256x128x128-3.0.1.zarr"
c3 = Cube(s3path);
z3 = Zarr.zopen(s3path, consolidated=true, fill_as_missing=false);
symdiff(c3.axes[4].values, string.(keys(z3.arrays)))
9-element Vector{String}:
"sensible_heat"
"latent_energy"
"time"
"terrestrial_ecosystem_respiration"
"lon"
"net_radiation"
"lat"
"burnt_area"
"net_ecosystem_exchange"
with these versions:
(tmp) pkg> st
Status `~/Documents/tmp/Project.toml`
[fcd2136c] DiskArrayTools v0.1.6 `https://github.com/gdkrmr/DiskArrayTools.jl.git#offsetpromotion`
[c21b50f5] YAXArrays v0.4.3 `https://github.com/JuliaDataCubes/YAXArrays.jl.git#master`
[0a941bbe] Zarr v0.8.0
(tmp) pkg>
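Since lon, lat and time are axes rather than variables, the comparison can exclude them directly. A minimal sketch with placeholder names (var_a, var_b and the short lists are hypothetical stand-ins for c3.axes[4].values and string.(keys(z3.arrays))):

```julia
# Placeholder lists standing in for the cube variables and the Zarr array names.
cube_vars = ["var_a", "var_b"]
zarr_arrays = ["var_a", "var_b", "burnt_area", "lon", "lat", "time"]
axis_names = ["lon", "lat", "time"]

# Names present on only one side, with the axis arrays filtered out.
dropped = setdiff(symdiff(cube_vars, zarr_arrays), axis_names)
println(dropped)
```

Whatever remains in dropped is a variable that exists in the store but did not make it into the cube.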
You are right, it seems I still need to fix that. It works when using fill_as_missing = true.
I figured out the issue: a "_FillValue" becomes the default missing value for CFDiskArray and adds Missing to its eltype. I have added a commit but still need to test it.
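The effect of a fill value on the eltype can be illustrated in plain Julia. This is a sketch of the general mechanism only, not of the CFDiskArray code; the data and the fill value -9999 are hypothetical.

```julia
# Hypothetical data with a sentinel value marking missing entries.
raw = Float32[1.0, -9999.0, 3.0]
fill_value = -9999.0f0

# Mapping the sentinel to `missing` requires an eltype that can hold Missing.
masked = Union{Missing, Float32}[x == fill_value ? missing : x for x in raw]

println(eltype(raw))
println(eltype(masked))
println(eltype(masked) == eltype(raw))
```

An array with eltype Union{Missing, Float32} no longer matches one with eltype Float32, so a variable that carries a "_FillValue" can be treated as a different type from one that does not.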
Indeed. For your use case burnt_area is still missing.
4-element Vector{String}:
"time"
"lon"
"lat"
"burnt_area"
Thanks for testing. burnt_area is Float64; this is a bug in the DataCube.
Is this fixed by your PR in DiskArrayTools https://github.com/meggart/DiskArrayTools.jl/issues/16?
Yes, it should be
I have come across this issue several times now: Cube drops some variables.
Moved over from here: https://github.com/JuliaDataCubes/EarthDataLab.jl/issues/292