Open dominicroye opened 1 month ago
It seems that applying a filter with only a single pressure level works, but when it is combined with filtering by date, we get this error. When I swapped the order of the filters, I got this error:
Error in make_intervals(x$start[i], x$end[i]) :
length(start) > 0 is not TRUE
If you are looking for an alternative approach, filtering with square brackets can be applied this way:
plev = st_get_dimension_values(hu, "plev")
time = st_get_dimension_values(hu, "time")
plev_idx = which(plev == units::set_units(850*100, Pa))
time_idx = which(time <= as.POSIXct("1950-06-01 12:00:00", tz = "UTC"))
hu[,,, plev_idx, time_idx]
# stars object with 4 dimensions and 1 attribute
# attribute(s), summary of first 1e+05 cells:
# Min. 1st Qu. Median Mean 3rd Qu. Max.
# Relative Humidity [%] 0.03111257 48.86616 70.07285 63.57896 81.61889 124.086
# dimension(s):
# from to offset delta refsys values x/y
# x 1 512 -0.3516 0.7031 NA NULL [x]
# y 1 256 89.81 -0.7017 NA NULL [y]
# plev 2 2 NA NA udunits [85000,70000) [Pa]
# time 1 152 1950-01-01 12:00:00 UTC 1 days POSIXct NULL
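To make the index lookup above easier to follow, here is a minimal base-R sketch of the same which() logic on plain vectors. The pressure levels and the daily noon time axis are illustrative assumptions rather than values read from the file, and plain numerics stand in for the units objects used above.

```r
# Illustrative dimension values (assumed, not read from the NetCDF file)
plev <- c(100000, 85000, 70000, 50000, 25000, 10000, 5000, 1000)  # Pa
time <- seq(as.POSIXct("1950-01-01 12:00:00", tz = "UTC"),
            by = "1 day", length.out = 365)

# Position of the 850 hPa level (850 * 100 Pa) along the plev dimension
plev_idx <- which(plev == 850 * 100)

# Positions of all time steps up to and including 1950-06-01 12:00 UTC
time_idx <- which(time <= as.POSIXct("1950-06-01 12:00:00", tz = "UTC"))

plev_idx          # 2
length(time_idx)  # 152
```

These integer positions are what is passed inside the square brackets, one per dimension, in dimension order.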
Yes, this solution is working. Thanks!!
I just noticed that it works only without using stars_proxy.
hu <- read_stars("hur_day_EC-Earth3_historical_r1i1p1f1_gr_19500101-19501231.nc", proxy = T)
plev = st_get_dimension_values(hu, "plev")
time = st_get_dimension_values(hu, "time")
plev_idx = which(plev == units::set_units(850*100, Pa))
time_idx = which(time <= as.POSIXct("1950-06-01 12:00:00", tz = "UTC"))
hu <- hu[,,, plev_idx, time_idx, drop = T]
> write_mdim(hu, "test.nc")
Error in names(dim(x[[i]])) <- names(d) :
attempt to set an attribute on NULL
This seems to now work:
library(stars)
hu <- read_mdim("hur_day_EC-Earth3_historical_r1i1p1f1_gr_19500101-19501231.nc", proxy = T)
plev = st_get_dimension_values(hu, "plev")
time = st_get_dimension_values(hu, "time")
plev_idx = which(plev == units::set_units(850*100, Pa))
time_idx = which(time <= as.Date("1950-06-01"))
hu <- hu[,,, plev_idx, time_idx, drop = T]
write_mdim(hu, "/tmp/test.nc")
Yes, it worked. Thanks. But as far as I see, it doesn't work with read_stars()? The thing is that read_mdim() doesn't read multiple files as time series.
The thing is that read_mdim() doesn't read multiple files as time series.
Can you give an example where that doesn't work?
The example you pasted here first (I saw it only in my email) is what I searched for, but as you say, I would need it as a proxy. I tried out y = do.call(c, lapply(file_list, read_mdim, proxy = T)) and it seems to work.
library(stars)
x = c(
"avhrr-only-v2.19810901.nc",
"avhrr-only-v2.19810902.nc",
"avhrr-only-v2.19810903.nc",
"avhrr-only-v2.19810904.nc",
"avhrr-only-v2.19810905.nc",
"avhrr-only-v2.19810906.nc",
"avhrr-only-v2.19810907.nc",
"avhrr-only-v2.19810908.nc",
"avhrr-only-v2.19810909.nc"
)
# see the second vignette:
# install.packages("starsdata", repos = "http://pebesma.staff.ifgi.de/", type = "source")
file_list = system.file(paste0("netcdf/", x), package = "starsdata")
# (y = read_stars(file_list, quiet = TRUE))
(y = do.call(c, lapply(file_list, read_mdim)))
Another issue I found now is in st_crop(). As far as I see, it is related to colrow_from_xy() and the step obj[[xy[1]]]$values, which in turn has null values, although lon values are available when I use st_get_dimension_values(hu, "lon").
library(stars)
hu <- read_mdim("hur_day_EC-Earth3_historical_r1i1p1f1_gr_19500101-19501231.nc", proxy = T)
plev = st_get_dimension_values(hu, "plev")
time = st_get_dimension_values(hu, "time")
plev_idx = which(plev == units::set_units(850*100, Pa))
time_idx = which(time <= as.Date("1950-06-01"))
hu <- hu[,,, plev_idx, time_idx, drop = T] # it seems not to drop the plev dimension?!
hu <- st_set_dimensions(hu, "lon", values = st_get_dimension_values(hu, "lon")-180)
# stars_proxy object with 1 attribute in 1 file(s):
# $hur
# [1] "[...]/hur_day_EC-Earth3_historical_r1i1p1f1_gr_19500101-19501231.nc"
#
# dimension(s):
# from to offset delta refsys values x/y
# lon 1 512 -180 0.7031 NA NULL [x]
# lat 1 256 NA NA NA [-90,-89.11489),...,[89.11489,90) [y]
# plev 2 2 NA NA udunits [85000,70000) [Pa]
# time 1 151 1950-01-01 1 days Date NULL
st_crop(hu, st_bbox(c(xmin = -20, ymin = 25, xmax = 20, ymax = 50)))
# Error in as_intervals(ix, add_last = length(ix) == dim(obj)[xy[1]]) :
# is.atomic(x) is not TRUE
Thank you for your help.
Yes, but read_mdim(file_list) also works, even with proxy=TRUE. I'll look into your other issue.
hu <- st_set_dimensions(hu, "lon", values = st_get_dimension_values(hu, "lon")-180)
now generates an error, as dimensions are being reread and overwritten in st_as_stars(). The crop box would then need to be moved to be within 0...360. It breaks because the lat variable is of type intervals (i.e., a rectilinear grid, where latitude values are irregular). If you read this file using read_stars (i.e., using the "classic" GDAL interface), it would reorder the latitude values to be equally spaced, with some warnings:
> hu <- read_stars("hur_day_EC-Earth3_historical_r1i1p1f1_gr_19500101-19501231.nc", proxy = T)
Warning messages:
1: In CPL_get_metadata(file, NA_character_, options) :
GDAL Message 1: Latitude grid not spaced evenly. Setting projection for grid spacing is within 0.1 degrees threshold.
2: In CPL_read_gdal(as.character(x), as.character(options), as.character(driver), :
GDAL Message 1: Latitude grid not spaced evenly. Setting projection for grid spacing is within 0.1 degrees threshold.
> st_crop(hu, st_bbox(c(xmin = 20, ymin = 25, xmax = 40, ymax = 50))) |> st_as_stars()
stars object with 4 dimensions and 1 attribute
attribute(s), summary of first 1e+05 cells:
Min. 1st Qu. Median Mean 3rd Qu. Max.
Relative Humidity [%] 0.002519234 0.7560223 19.43109 31.13065 58.62669 114.6512
dimension(s):
from to offset delta refsys
x 29 58 -0.3516 0.7031 NA
y 57 93 89.81 -0.7017 NA
plev 1 8 NA NA udunits
time 1 365 1950-01-01 12:00:00 UTC 1 days POSIXct
values x/y
x NULL [x]
y NULL [y]
plev [1e+05,85000) [Pa],...,[1000,-3000) [Pa]
time NULL
Warning messages:
1: In CPL_get_metadata(file, NA_character_, options) :
GDAL Message 1: Latitude grid not spaced evenly. Setting projection for grid spacing is within 0.1 degrees threshold.
2: In CPL_read_gdal(as.character(x), as.character(options), as.character(driver), :
GDAL Message 1: Latitude grid not spaced evenly. Setting projection for grid spacing is within 0.1 degrees threshold.
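On the point above that the crop box needs to be moved to within 0...360: a box given in -180...180 terms can be wrapped with the modulo operator, and a box that straddles the prime meridian then splits into two pieces. The helper below is a hypothetical base-R sketch, not part of stars, and it assumes the requested range is narrower than 360 degrees.

```r
# Hypothetical helper (not a stars function): map a longitude range given in
# -180..180 terms onto a 0..360 axis. Assumes xmax - xmin < 360.
wrap_lon_range <- function(xmin, xmax) {
  lo <- xmin %% 360   # in R, %% returns a value in [0, 360) for a positive divisor
  hi <- xmax %% 360
  if (lo <= hi) {
    # The range stays in one piece on the 0..360 axis
    list(c(xmin = lo, xmax = hi))
  } else {
    # The range crosses 0/360 and has to be cropped as two boxes
    list(c(xmin = lo, xmax = 360), c(xmin = 0, xmax = hi))
  }
}

wrap_lon_range(20, 40)    # one piece: 20..40
wrap_lon_range(-20, 20)   # two pieces: 340..360 and 0..20
```

Each piece could then be combined with the ymin/ymax values in st_bbox() and cropped separately; stitching the two results back together is not shown here.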
In this case, which is the best way to change from longitude 360 to 180?
I still get the following error when I include the rest of the steps even with read_stars(),
but I also have issues with read_mdim().
With read_stars(): shouldn't a dimension with only one level be dropped?
hu <- read_stars("hur_day_EC-Earth3_historical_r1i1p1f1_gr_19500101-19501231.nc", proxy = T)
plev = st_get_dimension_values(hu, "plev")
time = st_get_dimension_values(hu, "time")
plev_idx = which(plev == units::set_units(850*100, Pa))
time_idx = which(time <= as.Date("1950-06-01"))
hu_sub <- hu[,,, plev_idx, time_idx, drop = T]
st_crop(hu_sub, st_bbox(c(xmin = -20+180, ymin = 25, xmax = 20+180, ymax = 50))) |> st_as_stars()
# Error in dim(ret[[i]]) <- dim(new_dim) :
# dims [product 318459] do not match the length of object [6158280]
st_crop(hu, st_bbox(c(xmin = -20+180, ymin = 25, xmax = 20+180, ymax = 50))) |> st_as_stars()
# stars object with 4 dimensions and 1 attribute
# attribute(s), summary of first 1e+05 cells:
# Min. 1st Qu. Median Mean 3rd Qu. Max.
# Relative Humidity [%] -0.03696942 0.3956111 22.86977 33.64571 65.41913 112.589
# dimension(s):
# from to offset delta refsys values x/y
# x 229 285 -0.3516 0.7031 NA NULL [x]
# y 57 93 89.81 -0.7017 NA NULL [y]
# plev 1 8 NA NA udunits [1e+05,85000) [Pa],...,[1000,-3000) [Pa]
# time 1 365 1950-01-01 12:00:00 UTC 1 days POSIXct NULL
# Warning messages:
# 1: In CPL_get_metadata(file, NA_character_, options) :
# GDAL Message 1: Latitude grid not spaced evenly. Setting projection for grid spacing is within 0.1 degrees threshold.
# 2: In CPL_read_gdal(as.character(x), as.character(options), as.character(driver), :
# GDAL Message 1: Latitude grid not spaced evenly. Setting projection for grid spacing is within 0.1 degrees threshold.
With read_mdim(), using 360° longitudes, both errors are the same with or without the subsetting steps:
hu <- read_mdim("hur_day_EC-Earth3_historical_r1i1p1f1_gr_19500101-19501231.nc", proxy = T)
plev = st_get_dimension_values(hu, "plev")
time = st_get_dimension_values(hu, "time")
plev_idx = which(plev == units::set_units(850*100, Pa))
time_idx = which(time <= as.Date("1950-06-01"))
hu_sub <- hu[,,, plev_idx, time_idx, drop = T]
hu_sub
# stars_proxy object with 1 attribute in 1 file(s):
# $hur
# [1] "[...]/hur_day_EC-Earth3_historical_r1i1p1f1_gr_19500101-19501231.nc"
#
# dimension(s):
# from to offset delta refsys values x/y
# lon 1 512 -0.3516 0.7031 NA NULL [x]
# lat 1 256 NA NA NA [-90,-89.11489),...,[89.11489,90) [y]
# plev 2 2 NA NA udunits [85000,70000) [Pa]
# time 1 151 1950-01-01 1 days Date NULL
st_crop(hu_sub, st_bbox(c(xmin = -20+180, ymin = 25, xmax = 20+180, ymax = 50))) |> st_as_stars()
# Error in as_intervals(ix, add_last = length(ix) == dim(obj)[xy[1]]) :
# is.atomic(x) is not TRUE
st_crop(hu, st_bbox(c(xmin = -20+180, ymin = 25, xmax = 20+180, ymax = 50))) |> st_as_stars()
# Error in as_intervals(ix, add_last = length(ix) == dim(obj)[xy[1]]) :
# is.atomic(x) is not TRUE
With read_stars(): shouldn't a dimension with only one level be dropped?
Arguably no, because you lose information (the value of that single level). You can put an adrop() in the pipeline to have it dropped (probably after reading in memory).
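As a minimal sketch of what adrop() does, the example below builds a small in-memory stars object from a synthetic array (the attribute name hur and the dimension sizes are assumptions for illustration, not the real data) with a plev dimension of length one, and then drops it.

```r
library(stars)

# Synthetic 4-D array with a single pressure level (illustrative sizes only)
a <- array(runif(4 * 3 * 1 * 2), dim = c(x = 4, y = 3, plev = 1, time = 2))
s <- st_as_stars(list(hur = a))

dim(s)          # x = 4, y = 3, plev = 1, time = 2

# adrop() removes dimensions of length one; the value of that single level
# is no longer stored in the result
s_dropped <- adrop(s)
dim(s_dropped)  # x = 4, y = 3, time = 2
```

Because the value of the dropped level disappears from the object, it may be worth recording it somewhere (for example in the attribute name) before calling adrop().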
With read_mdim(), using 360° longitudes, both errors are the same with or without the subsetting steps
Yes, I'll look into either fixing this or making the error message more helpful.
In this case, which is the best way to change from longitude 360 to 180?
Tell the data producers to do so?
More seriously,
In this case, which is the best way to change from longitude 360 to 180?
I managed to do so outside R, with cdo,
cdo -sellonlatbox,-180,180,-90,90 hur_day_EC-Earth3_historical_r1i1p1f1_gr_19500101-19501231.nc out.nc
then
r = read_mdim("out.nc")
plot(adrop(r[,,,1,1]),axes = TRUE)
gives a plot (image not reproduced here).
But to be honest, it took me quite some time to get cdo installed with NetCDF-4 support; I used
./configure --enable-netcdf4 --enable-zlib --with-netcdf=/usr/ --with-hdf5=/usr/hdf5/
make
sudo make install
Ok. Thanks. Terra has the rotate() function for this kind of operation. Regarding telling the data producers to do so, climate projections are run with 360º by default as standard.
climate projections are run with 360º by default as standard.
Yes, I discussed this with @Nowosad over lunch, and given that these are all discrete global grids, this whole shifting or rotating should not be needed; the reason it's needed is that much of our software stacks still tend to treat geographic coordinates as Cartesian coordinates. I think that needs to change. (Terra will also not preserve the irregular latitude values.)
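For reference, the shifting or rotating discussed here boils down to wrapping the longitude values and reordering the columns to match. The following is a base-R sketch on a plain vector and matrix, with a coarse made-up longitude axis; it is not how stars, terra or cdo implement it.

```r
# Coarse illustrative longitude axis on the 0..360 convention
lon  <- seq(0, 315, by = 45)                      # 0, 45, ..., 315
# One row of data per longitude, two made-up "time steps" as columns
vals <- matrix(seq_along(lon), nrow = length(lon), ncol = 2)

# Wrap longitudes into [-180, 180)
lon180 <- ((lon + 180) %% 360) - 180

# Reorder so that longitudes increase again, and move the data rows with them
ord         <- order(lon180)
lon_sorted  <- lon180[ord]
vals_sorted <- vals[ord, , drop = FALSE]

lon_sorted        # -180 -135 -90 -45 0 45 90 135
vals_sorted[, 1]  # 5 6 7 8 1 2 3 4
```

The data values themselves are untouched; only the labels and the order along the longitude axis change, which is why the operation is sometimes described as a rotation.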
I found another possibility, but it doesn't work with a stars_proxy object. To me, the grid looks regular; the only difference is the 360° longitude convention.
hu <- read_stars("hur_day_EC-Earth3_historical_r1i1p1f1_gr_19500101-19501231.nc", proxy = T)
plev = st_get_dimension_values(hu, "plev")
time = st_get_dimension_values(hu, "time")
plev_idx = which(plev == units::set_units(850*100, Pa))
time_idx = which(time <= as.Date("1950-06-01"))
hu_sub <- hu[,,, plev_idx, time_idx, drop = T]
st_crs(hu_sub) <- "+proj=longlat +datum=WGS84 +pm=360dw"
# working
st_crs(hu_sub) <- "+proj=longlat +datum=WGS84 +pm=360dw"
st_as_stars(hu_sub) %>% st_transform(4326)
# not working
st_transform(hu_sub, 4326)
# Error: Not compatible with requested type: [type=list; target=double].
I thought that I could then use st_crop() as usual, but ...
st_as_stars(hu_sub) %>% st_transform(4326) %>%
st_crop(st_bbox(c(xmin = -20, ymin = 25, xmax = 20, ymax = 50), crs = 4326))
# stars object with 3 dimensions and 1 attribute
# attribute(s), summary of first 1e+05 cells:
# Min. 1st Qu. Median Mean 3rd Qu. Max.
# hur [%] 0.03111257 55.60625 74.12261 67.01967 83.88524 111.4188
# dimension(s):
# from to offset delta refsys values x/y
# lon 1 512 NA NA WGS 84 [512x256] -180,...,179.3 [x]
# lat 1 256 NA NA WGS 84 [512x256] -89.56,...,89.56 [y]
# time 1 151 1950-01-01 1 days Date NULL
# curvilinear grid
# Warning messages:
# 1: In st_crop.stars(., st_bbox(c(xmin = -20, ymin = 25, xmax = 20, :
# st_crop: bounding boxes of x and y do not overlap
# 2: In st_crop.stars(., st_bbox(c(xmin = -20, ymin = 25, xmax = 20, :
# crop only crops regular grids: maybe use st_warp() first?
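Since the warning above says that st_crop() only crops regular grids, it can help to check whether a coordinate vector is evenly spaced before cropping. The helper below is a hypothetical base-R sketch (not a stars function), and the latitude values are made up for illustration rather than taken from the file.

```r
# Hypothetical helper: TRUE if all steps between consecutive values are equal
# to the first step, within a relative tolerance
is_regular <- function(v, tol = 1e-6) {
  d <- diff(v)
  all(abs(d - d[1]) <= tol * abs(d[1]))
}

# A regular longitude axis with 512 cells spanning 360 degrees
lon <- seq(0, by = 360 / 512, length.out = 512)

# Made-up latitude values with slightly uneven spacing
lat_irregular <- c(-89.46, -88.77, -88.07, -87.37)

is_regular(lon)            # TRUE
is_regular(lat_irregular)  # FALSE
```

Applied to the output of st_get_dimension_values() for each spatial dimension, a check like this would show which axis is the irregular one.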
I get a dimension error when I want to filter a single pressure level from my ncdf. I can filter the time dimension without any issues. Data example can be downloaded here.