It's the use of dplyr::distinct, which doesn't pick up differences in numbers like
"2188902.156298386" and "2188902.156298387"
dplyr::distinct() says those are duplicates, but factor(as.character(.)) sees them as different (presumably they are only artificially different after formatting).
library(dplyr)
d <- tibble(x = c(2188902.156298386, 2188902.156298387))
d %>% distinct()  ## returns a single row: the two values are treated as duplicates
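For comparison, a quick base-R check (a minimal sketch reusing the two values above) shows the doubles really are distinct, so it is the deduplication step, not the data:
d <- c(2188902.156298386, 2188902.156298387)
diff(d)                                 ## small but non-zero
length(unique(d))                       ## 2: base R sees two distinct doubles
length(unique(format(d, digits = 16)))  ## 2: enough digits keeps them apart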
Those two values are from map_table(shapefile("cst10_polygon.shp"))$v[c(2302, 10986), ]
Need to use the gris brute-force method (or something better) in spbabel::map_table:
normVerts <- function(v, nam) {
  ## brute-force key: paste the coordinate columns into one string per vertex
  v$.vx0 <- as.integer(factor(do.call("paste", c(v[, nam], sep = "\r"))))
  ## branch-to-vertex link table
  bXv <- v %>% dplyr::select(.vx0, .br0, .br_order)
  ## one row per vertex key, keeping the coordinate columns (.keep_all)
  v <- v %>% dplyr::distinct(.vx0, .keep_all = TRUE)
  list(v = v, bXv = bXv)
}
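A usage sketch with a made-up toy vertex table (the column names just follow the function's expectations):
library(dplyr)
v <- tibble::tibble(x = c(0, 1, 1), y = c(0, 0, 0),
                    .br0 = c(1L, 1L, 2L), .br_order = c(1L, 2L, 1L))
out <- normVerts(v, c("x", "y"))
out$v    ## two unique vertices, keyed by .vx0
out$bXv  ## three branch-to-vertex rows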
This is for spbabel
Just dumping thoughts in old issues here for ya, but I'm churning through the silicate code (which is soooo awesome, btw!), and have encountered one issue that is the precise obverse of this.
> library (silicate)
> library (dodgr)
> weight_streetnet (hampi) %>% dodgr_vertices () %>% nrow () # hampi = dodgr test data
# [1] 2861
> SC(hampi)$vertex %>% nrow ()
# [1] 2860
That's because the coordinates are identical, so SC de-duplicates them, but they are not OSM-identical. This often arises in vertical cases such as pedestrian ways crossing over or under ground-borne ways, but it is an entirely general condition. Sometimes points where lines cross should not be considered as junctions, which silicate can't currently handle. My easy solution will be to implement an option to allow all silicate capital functions (SC, ARC, PATH, ...) to optionally specify a user-defined ID column to allow (otherwise apparent) duplication of vertices, edges, objects, whatever, as long as they have separate IDs.
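A toy sketch of what ID-aware de-duplication would mean (the osm_id column here is an assumption for illustration, not silicate's API):
v <- data.frame(x = c(0, 0), y = c(1, 1), osm_id = c("n1", "n2"))
nrow(dplyr::distinct(v, x, y))          ## 1: geometric uniqueness merges the nodes
nrow(dplyr::distinct(v, x, y, osm_id))  ## 2: the user-supplied ID keeps them apart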
That should go some way towards resolving your concerns expressed in unjoin.
I had to do this in anglr to support copy_down properly, and denormalizing (from unique x/y) takes a different approach for SEQuential models (ARC, PATH) than for PRIMitive models (SC, TRI, DEL):
https://github.com/hypertidy/anglr/blob/master/R/de-normalize.R#L11
The definition of geometric uniqueness should always be optional and available, though in practice I've only had to work with x/y or x/y/z or x/y/z/t so far. This all needs a lot more attention!
Still a problem, but I have not seen this for a long time.
## data_local/vector_cache/cst10_polygon.Rdata
d <- anglr::DEL0(cst10_polygon)
Internal error in segmentintersection():
Topological inconsistency after splitting a segment.
Please report this bug to jrs@cs.berkeley.edu
Include the message above, your input data set, and the exact
command line you used to run Triangle.
Error in (function (p, a = NULL, q = NULL, Y = FALSE, j = FALSE, D = FALSE, :
Triangle exit, code $i
Buffer = 0 doesn't fix it
Every polygon works fine in DEL0 when done in a loop separately, so this is a "mesh the entire set" problem.
And, as @mpadge says, it's related to overlapping segments that aren't meant to be meshed together :) but here it's for different reasons (crappy digitization of a polygon). I think CGAL's robustness might allow this to work, but really it is crappy polygons.
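The per-feature check is just a loop (a sketch; every iteration succeeds in isolation):
for (i in seq_len(nrow(cst10_polygon))) {
  anglr::DEL0(cst10_polygon[i, ])  ## no single feature errors on its own
}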
This is enough to trigger it:
DEL0(cst10_polygon[c(131, 363), ])
Internal error in segmentintersection():
Topological inconsistency after splitting a segment.
Please report this bug to jrs@cs.berkeley.edu
Include the message above, your input data set, and the exact
command line you used to run Triangle.
Error in (function (p, a = NULL, q = NULL, Y = FALSE, j = FALSE, D = FALSE, :
Triangle exit, code $i
TRI0 is fine with it:
TRI0(cst10_polygon[c(131, 363), ])
class : TRI0
type : Primitive
vertices : 8149 (2-space)
primitives : 8187 (2-space)
crs : +proj=stere +lat_0=-90 +lat_ts=-71 +lon_0=0 +k=1 +x_0=0 +y_0=0 +datum=WGS84 +units=m +no_defs +ellps=WGS84 +towgs84=0,0,0
plot(TRI0(cst10_polygon[c(131, 363), ]))
plot(cst10_polygon[131, ], add = TRUE, col = "firebrick")
Somewhere along this boundary - it might just be this non-noded intersection, maybe.
DEL0(rgeos::gNode(as(cst10_polygon[c(131, 363), ], "SpatialLines")))
class : DEL0
type : Primitive
vertices : 8098 (2-space)
primitives : 5883 (2-space)
crs : +proj=stere +lat_0=-90 +lat_ts=-71 +lon_0=0 +k=1 +x_0=0 +y_0=0 +datum=WGS84 +units=m +no_defs +ellps=WGS84 +towgs84=0,0,0
that's not enough to triangulate it though; it doesn't work well because gNode only gives one feature
and this doesn't work, though the resulting POLYGONs from sf do look ok
DEL0(sf::st_cast(sf::st_polygonize(sf::st_node(sf::st_cast(sf::st_as_sf(cst10_polygon[c(131, 363), ]), "MULTILINESTRING")))))
Internal error in segmentintersection():
Topological inconsistency after splitting a segment.
something to compare with CGAL, but it might be worth catching this error and churning through TRI0 instead?
this works
plot(DEL0(sf::st_as_sfc(sf::st_as_binary(sf::st_set_precision(sf::st_as_sf(cst10_polygon[c(131, 363), ]), 1)$geometry))))
and preserves vertices it seems
DEL0(sf::st_as_sfc(sf::st_as_binary(sf::st_set_precision(sf::st_as_sf(cst10_polygon[c(131, 363), ]),.1)$geometry)))
class : DEL0
type : Primitive
vertices : 8048 (2-space)
primitives : 8191 (2-space)
crs : NA
> SC0(cst10_polygon[c(131, 363), ])
class : SC0
type : Structural
vertices : 8149 (2-space)
primitives : 8159 (2-space)
crs : +proj=stere +lat_0=-90 +lat_ts=-71 +lon_0=0 +k=1 +x_0=0 +y_0=0 +datum=WGS84 +units=m +no_defs +ellps=WGS84 +towgs84=0,0,0
up to 1e9
DEL0(sf::st_as_sfc(sf::st_as_binary(sf::st_set_precision(sf::st_as_sf(cst10_polygon[c(131, 363), ]),1e9)$geometry)))
class : DEL0
type : Primitive
vertices : 8223 (2-space)
primitives : 8563 (2-space)
crs : NA
> DEL0(sf::st_as_sfc(sf::st_as_binary(sf::st_set_precision(sf::st_as_sf(cst10_polygon[c(131, 363), ]),1e10)$geometry)))
Internal error in segmentintersection():
It's time I finally learnt this precision stuff properly, and it seems relevant for @dcooley? 👍
I'll put this data set in anglr while I explore it a bit more (so don't bother with the details of this issue above).
signif is enough (sf's precision is pretty counter-intuitive, I think):
cst10 <- readRDS(system.file("extdata/cst10_polygon.rds", package = "anglr", mustWork = TRUE))
library(silicate)
p <- PATH0(cst10)
# DEL0(p) ## fails
p$vertex$x_ <- signif(p$vertex$x_, 10)
p$vertex$y_ <- signif(p$vertex$y_, 10)
DEL0(p) ## works
10 significant digits is about 1 mm in this metres projection - ultimately that's the issue, I think: coordinates in metres with spurious precision. This is something that should happen at the "find unique" step, or at least be an option to control - I don't know if there are any tools for this in R already, or whether signif(x, 10) will be enough for normal use?
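That rounding step could be wrapped up as an option; a minimal sketch (round_vertices is a hypothetical helper name, not anglr/silicate API):
round_vertices <- function(x, digits = 10) {
  ## round the silicate vertex pool to `digits` significant figures
  x$vertex$x_ <- signif(x$vertex$x_, digits)
  x$vertex$y_ <- signif(x$vertex$y_, digits)
  x
}
## DEL0(round_vertices(PATH0(cst10)))  ## works where DEL0(PATH0(cst10)) fails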
DEL0(p, max_area = 1e10)
class : DEL0
type : Primitive
vertices : 15731 (2-space)
primitives : 23893 (2-space)
crs : +proj=stere +lat_0=-90 +lat_ts=-71 +lon_0=0 +k=1 +x_0=0 +y_0=0 +datum=WGS84 +units=m +no_defs +ellps=WGS84 +towgs84=0,0,0
mesh_plot(DEL0(p, max_area = 1e10), col = viridis::viridis(23893))
Oh @mpadge, the "not OSM identical" thing is in rgl::as.mesh3d with its notEqual arg, though motivated differently. Another similar case is keeping distinct in x/y/z, which quadmesh::dquadmesh does, fwiw.
Ok, I understand better what sf's precision is doing now, and how it's much more than rounding - and also that some thoughts I had about using it for detail reduction were pretty off-rail. At any rate, here's a wrapper for sf's precision setter to round-trip through binary and rebuild:
osf <- function(x, precision = NULL) {
  if (is.null(precision)) {
    return(x)
  }
  ## set precision, then round-trip through WKB so coordinates are rebuilt
  x <- sf::st_set_precision(x, precision)
  sf::st_set_geometry(x, sf::st_as_sfc(sf::st_as_binary(sf::st_geometry(x))))
}
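A usage sketch, assuming the cst10 object loaded above and the 1e9 precision that worked in the earlier experiments:
x <- osf(sf::st_as_sf(cst10), precision = 1e9)
## DEL0(x)  ## triangulates after the precision round-trip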
Here's a wrapper to re-map PATH0 indexes after rounding the vertex-table coordinates (it works in very limited ways, to sidestep the problem of uniqueness vs. spurious precision):
renorm <- function(x, ...) {
  ## only PATH0 for now
  vv <- sc_vertex(x)
  ## rows with identical (rounded) coordinates share a group index
  gi <- dplyr::group_indices(dplyr::group_by(vv, x_, y_))
  if (nrow(vv) > length(unique(gi))) {
    canon <- match(gi, gi)      ## first (canonical) old row of each group
    keep <- !duplicated(canon)  ## the canonical rows themselves
    new_index <- cumsum(keep)   ## new row position of each kept old row
    vv$vertex_u <- new_index[canon]
    ## re-point every path's vertex_ indexes at the canonical vertices
    fun <- function(xx) {
      xx$vertex_ <- vv$vertex_u[xx$vertex_]
      xx
    }
    x$object$path_ <- lapply(x$object$path_, fun)
    ## drop the now-duplicate vertex rows and the helper column
    vv <- vv[keep, ]
    vv$vertex_u <- NULL
    x$vertex <- vv
  }
  x
}
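The intended flow, as a sketch: round the vertex table first, then renorm re-maps the indexes.
p <- PATH0(cst10)
p$vertex$x_ <- signif(p$vertex$x_, 10)
p$vertex$y_ <- signif(p$vertex$y_, 10)
p <- renorm(p)  ## collapse now-duplicate vertices, re-point the paths
## DEL0(p)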
I also explored snap-to-grid, which also works up to a point. What I think might be helpful for DEL/DEL0 is to suggest these tools on this error - rebuild with sf precision, or do a rounding or grid-snap of sc$vertex - and to have a bit of a guide to doing this.
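For reference, the grid-snap itself is a one-liner (res in map units, so 0.001 is 1 mm in this projection):
snap_grid <- function(x, res) round(x / res) * res
## e.g. p$vertex$x_ <- snap_grid(p$vertex$x_, 0.001)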
All very interesting, and also stuff that I need right now as well (for a dodgr issue with coordinate identical-ness). One issue that I'm having is that signif is really slow:
library (inline)
library (rbenchmark)
x <- 100 * rnorm (1e7)
# function to round to integer value of (x * precision)
cround <- cfunction(c(x = "numeric", precision = "integer"), "
const int n = length (x);
double *rx;
rx = REAL (x);
int *rprecision;
rprecision = INTEGER (precision);
SEXP result = PROTECT (allocVector (INTSXP, n));
int *rout;
rout = INTEGER (result);
for (int i = 0; i < n; i++)
rout [i] = rx [i] * rprecision [0];
UNPROTECT (1);
return result;
")
# R equivalent:
rround <- function (x, precision)
  round (x * precision)
precision = as.integer (1e8)
benchmark (
"signif" = s <- signif (x, digits = 10),
"round" = s <- round (x, digits = 10),
"cround as int" = s <- cround (x, precision = precision),
"cround as num" = s <- cround (x, precision = precision) / precision,
"rround" = s <- rround (x, precision),
replications = 1,
order = "relative",
columns = c ("test", "elapsed", "relative"))
#> test elapsed relative
#> 3 cround as int 0.022 1.000
#> 4 cround as num 0.070 3.182
#> 5 rround 0.146 6.636
#> 1 signif 0.474 21.545
#> 2 round 1.000 45.455
Created on 2020-05-15 by the reprex package (v0.3.0)
... but a C function is no solution, because it is bounded by .Machine$integer.max, which is generally O(1e9), so the precision value for this C function is limited to around 1e8, and definitely not 1e10 (and that's actually more like 8 digits in total, so much less precision for other CRSs with larger integer components). Any thoughts on that? I will likely implement a C-type routine for dodgr, because it's just too slow otherwise, but anglr will likely need better precision, right?
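One follow-on thought on that bound: staying in doubles (rround above, plus a rescale) avoids .Machine$integer.max, but only pushes the limit to 2^53; a sketch of the arithmetic:
dround <- function(x, precision) round(x * precision) / precision
## exact only while abs(x) * precision < 2^53 (~9.0e15); for coordinates
## around 2.2e6, like the values above, that caps precision near 4e9:
2^53 / 2188902.156298386  ## ~4.1e9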
... on the other hand, this will likely change a lot with C++20 and extensions of std::to_chars. I've tried wrapping these, but the return value is essentially a std::string_view object, which does not play nicely with std::string, which is the simplest thing to Rcpp::wrap. Solving this would need a custom interface between C++17 and the C-level CHARSXP stuff.
@dcooley Have you tried playing with std::to_chars() perchance?
I just always thought that Rmpfr or CGAL would be required for the hard triangulation-robustness stuff - I never realized that rounding per se was so relevant.
FWIW the stuff I will cross-reference as I follow along:
https://github.com/danshapero/predicates
https://observablehq.com/@mourner/non-robust-arithmetic-as-art
> Have you tried playing with std::to_chars() perchance?
No, I've tried to stay within C++11 requirements (sometimes to C++14), so haven't used this.
I've been going through how data.table does its comparisons, to see if I can adapt their code, but haven't got far with it.
These files choke at pslg:
SSRUs_East_Ant_Polys.shp
cst10_polygon.shp
Is the dedupe failing?