Closed Robinlovelace closed 7 years ago
Update - works (with warning) for sp objects:
y = as(x, "Spatial")
plot(y)
SpatialPoints(y)
class : SpatialPoints
features : 112
extent : 0, 5565.975, -7.081155e-10, 110823.7 (xmin, xmax, ymin, ymax)
coord. ref. : NA
Warning message:
In validityMethod(object) :
duplicate rownames are interpreted by rgeos as MultiPoints; use SpatialMultiPoints to define these; in future sp versions this warning will become an error
could sfc_MULTIPOINT
-> sfc_POINT
be part of st_cast? We now see
> st_cast(st_sfc(st_multipoint(matrix(1:6,,2))), "POINT")
Geometry set for 1 feature
geometry type: POINT
dimension: XY
bbox: xmin: 1 ymin: 4 xmax: 3 ymax: 6
epsg (SRID): NA
proj4string: NA
POINT(1 4)
Warning message:
In chk(x, "MULTIPOINT") :
object does not have length one: casting causes information loss
but maybe one could return all points? Or an sf
object with an integer attribute denoting the original ID
(nr) in the sfc_MULTIPOINT
?
I think returning all points would be good - I imagine the saved ID
could be useful in some instances. Not sure if that reply is useful...
I found a way to stop the warning in sp btw - just remove the repeated matrix row names:
Note also: I think I found a bug in sp: spTransform()
seems to fail on multipoint objects...
I think this is not a job for st_cast but a new function, because of the implied 1-to-many aspect (and more generally perhaps the many-to-1 possibilties within aggregate operations).
I expect st_cast to work within a mutate() call for example, which would be precluded if st_cast was exploding multis to single rather than "dropping".
I would go about it something like this, though there's probably more concise ways. This is all standard R data.frame/list idioms, I'm exploring similar techniques for the general cast case (without changing row numbers at the top level).
The following assumes there's no mixing, the input is all MULTIPOINT.
## mk multipoint worker
mkmpt <- function(n, mean = 0) st_multipoint(matrix(rnorm(n * 2, mean), ncol = 2))
## mk sf worker
mksf <- function(d, gc) {
stopifnot(nrow(d) == length(gc))
d[["geometry"]] <- gc
st_as_sf(d)
}
## drop geom
drop_geom <- function(x) {
as.data.frame(x)[-match(attr(x, "sf_column"), names(x))]
}
## create a data set
mp <- mksf(data.frame(a = c("a", "b"), b = 1:2), st_sfc(list(mkmpt(10), mkmpt(20, 5))))
## convert multipoint to point
## simple case (it's all multipoint)
mp_to_pt <- function(mp) {
## build map of grouping in geometry
idx <- unlist(lapply(st_geometry(mp), nrow))
## initialize replacement object
## expand out (by duplicating rows for every individual coordinate)
d <- drop_geom(mp)[rep(seq_len(nrow(mp)), idx), ]
rownames(d) <- NULL
## split/explode the multipoints and rebuild
d[["geometry"]] <- st_sfc(unlist(lapply(st_geometry(mp), function(a) lapply(split(a, seq_len(nrow(a))), st_point)), recursive = FALSE, use.names = FALSE),
crs = st_crs(mp))
st_as_sf(d)
}
## do the conversion, MULTIPOINT sf to POINT sf with auto-one-to-many expansion
mp_to_pt(mp)
I think there's a nice case for a general decomposition to rows-per part, but it's got complications - like holes being elevated to islands, so I'm not sure how much of this will belong in sf directly.
How about an st_un_multi
for all MULTI*
objects? It could also unfold GEOMETRYCOLLECTIONS
one level down. Or st_unwrap
, to take a verb?
Both sound good, but I like the one with multi
in it better as it makes the link to MULTI* features more clearly. I think st_unmulti
is another contender but less clear than your st_un_multi
suggestion.
Here's some draft code that un-multipoints:
st_un_multipoint = function(x) {
g = st_geometry(x)
i = rep(seq_len(nrow(x)), sapply(g, nrow))
x = x[i,]
st_geometry(x) = st_sfc(do.call(c,
lapply(g, function(geom) lapply(1:nrow(geom), function(i) st_point(geom[i,])))))
x$original_geom_id = i
x
}
# example:
library(sf)
g = st_sfc(st_multipoint(matrix(1:6,,2)),
st_multipoint(matrix(1:4,,2)),
st_multipoint(matrix(11:18,,2)))
df = data.frame(a = 1:3, b = 5:7)
st_geometry(df) = g
df
st_un_multipoint(df)
I think this is equivalent to st_points
in postgis
We now have:
> st_cast(x, "POINT")
Geometry set for 112 features
geometry type: POINT
dimension: XY
bbox: xmin: 0 ymin: -7.081155e-10 xmax: 5565.975 ymax: 110823.7
epsg (SRID): 3857
proj4string: +proj=merc +a=6378137 +b=6378137 +lat_ts=0.0 +lon_0=0.0 +x_0=0.0 +y_0=0 +k=1.0 +units=m +nadgrids=@null +wktext +no_defs
First 5 geometries:
POINT(0 501.464607505533)
POINT(0 1504.39382251802)
POINT(0 2507.3230375305)
POINT(0 3510.25225254298)
POINT(0 4513.18146755547)
> ?st_cast
> st_cast(x, "POINT", group_or_split = FALSE)
Geometry set for 2 features
geometry type: POINT
dimension: XY
bbox: xmin: 0 ymin: -7.081155e-10 xmax: 5565.975 ymax: 501.4646
epsg (SRID): 3857
proj4string: +proj=merc +a=6378137 +b=6378137 +lat_ts=0.0 +lon_0=0.0 +x_0=0.0 +y_0=0 +k=1.0 +units=m +nadgrids=@null +wktext +no_defs
POINT(0 501.464607505533)
POINT(5565.97453966368 -7.08115455161362e-10)
Warning messages:
1: In st_cast.MULTIPOINT(X[[i]], ...) : point from first coordinate only
2: In st_cast.MULTIPOINT(X[[i]], ...) : point from first coordinate only
As it is now, st_cast
by default does the one-to-many, and even (if you specify ids
) many-to-one with an aggregate
on the attributes. I will move the latter to a proper aggregate
method, but don't see how st_point
catches all the one-to-many casts -- I suggest to leave that in.
Update: this is still not working:
# test multipoint to point conversion
x = st_multipoint(matrix(1:10, ncol = 2))
st_cast(x = x, to = "POINT")
POINT(1 6)
Warning message:
In st_cast.MULTIPOINT(x = x, to = "POINT") :
point from first coordinate only
I was expecting 5 points, that can get ids associated. Equivalent sp code:
p = sp::SpatialPoints(matrix(1:10, ncol = 2))
How does one do that in sf?
From scratch:
d <- as.data.frame(matrix(1:10, ncol = 2))
d$id <- 1:5
st_as_sf(d, coords = c("V1", "V2"))
But the casting method also works if it's sfc
:
st_cast(st_sfc(x), "POINT")
Compare when the group_or_split
is FALSE (this argument is not available to all st_cast
methods):
st_cast(st_sfc(x), "POINT", group_or_split = FALSE)
Thanks for the clarification - I missed out the st_sfc
part - gradually making sense - seems like a logical system. I'd say this issue is closed then, although not sure about other feature types such as POLYGON
. For my use-case this issue is closed.
Question on this @edzer: how to create a POINT
sf
with 5 features based on this:
library(sf)
set.seed(1985)
m3 = matrix(runif(10), ncol = 2)
mp = st_multipoint(x = m3)
p = st_cast(x = mp, to = "POINT")
At the moment I'm getting this useful warning but no indication of how to make it include all points:
#> Warning in st_cast.MULTIPOINT(x = mp, to = "POINT"): point from first
You can't create a POINT
that includes more than one point. You can create a list of points, though:
p = st_cast(x = st_sfc(mp), to = "POINT")
Thanks - makes sense. This if for some documentation of the new aggregate
method you've created, which looks awesome. PR incoming!
Is there a PR, or one in the works, for st_un_multipoint?
Is st_cast(mp, "POINT")
not enough?
My bad - I thought from first reading the discussion that there was a problem with using casting for this case.
Reproducible example:
Is there a function, e.g.
as.st_points
that could convert this into just points? If not any ideas of the best way to go about creating such a function?