Closed dblodgett-usgs closed 5 years ago
Thanks for treading this relatively underexplored part of stars! I noted that giving one more dimension value in dimensions than the number of values in the matrix, as in your example, leads to lots of problems when converting e.g. to sf. Below I removed one, meaning the values are now interpreted as cell center (more or less!). Doing the thing right by specifying raster boundaries as in your example clearly needs some work.
Below some cheap fixes of things that I believe work now (with stars from github):
library(stars)
library(raster)
m <- matrix(1:20, nrow = 5, ncol = 4)
dim(m) <- c(x = 5, y = 4) # named dim
#x = c(0,0.5,1,2,4,5)
x = c(0,0.5,1,2,4)
#y = c(0.3,0.5,1,2,2.2)
y = c(0.3,0.5,1,2)
(r = st_as_stars(list(m = m),
dimensions = st_dimensions(x = x, y = y, .raster = c("x", "y"))))
st_xy2sfc(r, as_points = FALSE)
plot(st_as_sfc(r, as_points = FALSE))
plot(st_as_sf(r, as_points = FALSE), axes = TRUE)
image(r)
image(x, y, m, col = sf.colors(21))
ras <- raster(m)
# works
(ras_stars <- st_as_stars(ras))
(ras_stars_ireg <- st_as_stars(ras))
# dimensions ignored for class raster, as raster is always regular
(ras2 = structure(ras_stars_ireg,
dimensions = st_dimensions(x = x, y = y, .raster = c("x", "y"))))
# list method expects a matrix?
try(ras_stars_ireg2 <- st_as_stars(list(ras = ras),
dimensions = st_dimensions(x = x, y = y, .raster = c("x", "y"))))
# yes, this doesn't work, but
(ras_stars_ireg2 <- st_as_stars(list(ras = as.matrix(ras)),
dimensions = st_dimensions(x = x, y = y, .raster = c("x", "y"))))
# does
st_xy2sfc(ras_stars, as_points = FALSE)
st_as_sfc(ras_stars, as_points = FALSE)
st_as_sf(ras_stars, as_points = FALSE)
This makes the following code work (note the patch is to both sf
and stars
):
library(stars)
# Loading required package: abind
# Loading required package: sf
# Linking to GEOS 3.7.0, GDAL 2.3.2, PROJ 5.2.0
m <- matrix(1:20, nrow = 5, ncol = 4)
dim(m) <- c(x = 5, y = 4) # named dim
x = c(0,0.5,1,2,4,5)
y = c(0.3,0.5,1,2,2.2)
(r = st_as_stars(list(m = m),
dimensions = st_dimensions(x = x, y = y, .raster = c("x", "y"))))
# stars object with 2 dimensions and 1 attribute
# attribute(s):
# m
# Min. : 1.00
# 1st Qu.: 5.75
# Median :10.50
# Mean :10.50
# 3rd Qu.:15.25
# Max. :20.00
# dimension(s):
# from to offset delta refsys point values
# x 1 6 NA NA NA NA 0, ..., 5 [x]
# y 1 5 NA NA NA NA 0.3, ..., 2.2 [y]
dim(r)
# x y
# 5 4
dim(st_dimensions(r))
# x y
# 6 5
plot(st_as_sf(r, as_points = FALSE), axes = TRUE)
plot(st_as_sf(r, as_points = TRUE), axes = TRUE)
@mdsumner : note the differences in output from dim
-- we don't have a flag in the dimensions object that tells whether values
are raster cell centers or raster cell boundaries (of which we have dim(r)+1
). This leads to a somewhat confusing to
field, when arguing from the dimensions of m
. Do you have a suggestion how to make this simpler to users? Decrease to
with one, so that we can argue from there? Maybe: decrease to
with one so that to
- from
+ 1 smaller than length(values)
indicates values
must be cell boundaries?
Awesome, thanks @edzer. Will look at replacing my use of Raster here: https://github.com/dblodgett-usgs/intersectr/blob/master/R/create_cell_geometry.R#L106
Is there anything you need re: testing or documentation I could contribute? Just point me to where I should look.
Using image()
structure would make for some good examples. I can't do anything on this right now, but reading with interest! @dblodgett-usgs any examples you have, from files or otherwise will be helpful. I hadn't thought of users constructing directly from matrices, that is a good way to get diverse examples from various NetCDF files and elsewhere. It's definitely not a simple landscape
I'll have a slew of examples if we can make this work! I have to admit I'm a bit over my head with grid-topologies and the matrix
row/col x/y semantics being thrown around here. On that, @edzer -- in your example you treat rows like x and cols like y. In my work in NetCDF-land, rows correspond to the y coordinate variable and columns correspond to x. Feel like I'm constantly screwing that up though so happy to be corrected here.
The example you give above works but if I try to apply my own data to this pattern I'm getting errors.
My little test is not an irregularly spaced grid, but this should still work right? To make the test totally parallel to your sample, I pulled out the x and y cell count into the values 78 and 54 the vectors are 79 and 55 long. If I reduce the vectors to the same size as the matrix, I can get it to work.
So... ?
p.s. I was also seeing:
st_dimensions(x = x, y = y, .raster = c("x", "y"))
# Error in st_dimensions.array(x = x, y = y, .raster = c("x", "y")) :
# argument ".x" is missing, with no default
realizing the issue was with the array method, did this and it works:
st_dimensions(x = as.numeric(x), y = as.numeric(y), .raster = c("x", "y"))
🙏
One other nit...
sf <- st_as_sf(r, as_points = FALSE)
sf::st_crs(sf) <- st_crs("+proj=lcc +lat_1=25 +lat_2=60 +x_0=0 +y_0=0 +units=m +lat_0=42.5 +lon_0=-100 +a=6378137 +f=0.00335281066474748 +pm=0 +no_defs")
# OGR: Corrupt data
# Error in CPL_crs_equivalent(e1$proj4string, e2$proj4string) : OGR error
the crs
argument to st_as_stars
isn't taking and I can't seem to set the crs of my result. It appears to be something I'm doing wrong though because on the little 5X4 example, I can set the crs with st_crs() fine.
The above works, now, pending the last plot which should be
plot(st_as_sf(r, as_points = TRUE, merge = FALSE), axes = TRUE)
(or requires GDAL 2.4.0, to return a contour/polygons map) The error you saw with
# Error in st_dimensions.array(x = x, y = y, .raster = c("x", "y")) :
was caused by x
being an array, rather than a vector. The st_crs
should now also work.
There is still one issue when st_dimensions is fed with cell centers: the half cell shift of origin is not applied, currently.
With an option for the half cell size correction we now see this:
which I do not consider optimal, but at least documented.
@dblodgett-usgs : with respect to the rows=x and cols=y magic, yes, this takes a short while to get your head around, but once you do it's quite liberating: index and dimension order simply correspond to x and y. If you have your matrix with rows=y and cols=x, maybe define dimensions as y, x rather than x, y, and then use aperm
to reorder (if needed).
So great, @edzer! Nice use of HTML5 <details>
too, I didn't know about that.
I think I got this all figured out over here: https://github.com/dblodgett-usgs/intersectr/pull/1
The half cell issue has caused more problems over the years... it should almost be a required input that's categorical "center", "leading", or "trailing". Otherwise, someone will screw it up...
The whole x/y thing drives me kind of nuts. Over in intersectr
I'm calling inputs "col_coords"
and "row_coords"
instead of x and y now but am leaning toward something more like "nc_col_coords"
and "nc_row_coords"
to be super explicit about it. Thanks for the tip on aperm
, I'll keep that in mind.
I'll contribute some testing for this tomorrow and close this issue with a PR.
I like the idea of not having a default for cell_midpoints
, and by that forcing users to be explicit. I never came across "trailing" as an option in real life data, did you? (or are you thinking north-up / row-down y axes?)
No, I've not seen "trailing" either, just mentioned it because it felt complete. I think we could safely leave it off.
Side note on this... I found a case where one dimension passes as "regularly spaced" and another does not. I think this is due to noise in the lat/lon coordinate variable from a NetCDF file (?). If I redistribute the coordinates to be evenly spaced things work.
Is there a tolerance for considering something "regularly spaced"? Would that be something to put into an overridable input?
Doing a little experimentation with
stars
, I'm running into trouble getting a matrix I created coerced into a stars object that then can go tosf
without first usingraster()
. Using raster is fine for regularly spaced rasters, but what am I missing if I want to do irregularly spaced rectilinear or curvilinear grids?Using some demo code from one of the blog posts, I'm trying to get a non regularly spaced grid to work and failing... Is this just not implemented yet?
Code rolled up for clarity
```r library(stars) library(raster) m <- matrix(1:20, nrow = 5, ncol = 4) dim(m) <- c(x = 5, y = 4) # named dim x = c(0,0.5,1,2,4,5) y = c(0.3,0.5,1,2,2.2) (r = st_as_stars(list(m = m), dimensions = st_dimensions(x = x, y = y))) st_xy2sfc(r, as_points = FALSE) st_as_sfc(r, as_points = FALSE) st_as_sf(r, as_points = FALSE) image(r) image(x, y, m, col = sf.colors(21)) ras <- raster(m) # works (ras_stars <- st_as_stars(ras)) # dimensions ignored for class raster (ras_stars_ireg <- st_as_stars(ras, dimensions = st_dimensions(x = x, y = y))) # list method expects a matrix? (ras_stars_ireg2 <- st_as_stars(list(ras = ras), dimensions = st_dimensions(x = x, y = y))) st_xy2sfc(ras_stars, as_points = FALSE) st_as_sfc(ras_stars, as_points = FALSE) st_as_sf(ras_stars, as_points = FALSE) ```