ropensci / geojsonio

Convert many data formats to & from GeoJSON & TopoJSON
https://docs.ropensci.org/geojsonio

Digits with `geojson_list()` #141

Open josiekre opened 6 years ago

josiekre commented 6 years ago

There does not seem to be control over the number of digits output when passing a data frame to geojson_list(). No matter how many significant digits there are in the lon and lat columns, the resulting list has at most five digits to the right of the decimal, e.g. -77.11739, 38.94512.

I see from #96 that digits were thought about at some point, but I'm not seeing how this translates into geojson_list(). What am I missing?

Thanks.

Session Info

```r
> devtools::session_info()
Session info -----------------------------------------------------------------------------------
 setting  value
 version  R version 3.4.2 (2017-09-28)
 system   x86_64, darwin15.6.0
 ui       RStudio (1.1.383)
 language (EN)
 collate  en_US.UTF-8
 tz       Europe/Copenhagen
 date     2018-08-20

Packages ---------------------------------------------------------------------------------------
 package      * version    date       source
 assertthat     0.2.0      2017-04-11 CRAN (R 3.4.0)
 base         * 3.4.2      2017-10-04 local
 bindr          0.1.1      2018-03-13 cran (@0.1.1)
 bindrcpp     * 0.2.2      2018-03-29 cran (@0.2.2)
 citycastgtn  * 0.0.0.9000            local
 cli            1.0.0      2017-11-05 CRAN (R 3.4.2)
 commonmark     1.4        2017-09-01 CRAN (R 3.4.1)
 compiler       3.4.2      2017-10-04 local
 crayon         1.3.4      2017-09-16 CRAN (R 3.4.1)
 curl           3.0        2017-10-06 CRAN (R 3.4.2)
 datasets     * 3.4.2      2017-10-04 local
 devtools       1.13.4     2017-11-09 CRAN (R 3.4.2)
 digest         0.6.15     2018-01-28 cran (@0.6.15)
 dplyr          0.7.6      2018-06-29 cran (@0.7.6)
 foreign        0.8-69     2017-06-22 CRAN (R 3.4.2)
 geojson        0.2.0      2017-11-08 CRAN (R 3.4.2)
 geojsonio      0.5.0.9100 2018-03-22 Github (ropensci/geojsonio@5c39fe7)
 glue           1.3.0      2018-07-17 cran (@1.3.0)
 graphics     * 3.4.2      2017-10-04 local
 grDevices    * 3.4.2      2017-10-04 local
 grid           3.4.2      2017-10-04 local
 hms            0.3        2016-11-22 CRAN (R 3.4.0)
 httr           1.3.1      2017-08-20 CRAN (R 3.4.1)
 jqr            1.0.0      2017-09-28 CRAN (R 3.4.2)
 jsonlite       1.5        2017-06-01 CRAN (R 3.4.0)
 jsonvalidate   1.0.0      2016-06-13 CRAN (R 3.4.0)
 lattice        0.20-35    2017-03-25 CRAN (R 3.4.2)
 lazyeval       0.2.1      2017-10-29 CRAN (R 3.4.2)
 lubridate      1.7.4      2018-04-11 CRAN (R 3.4.4)
 magrittr     * 1.5        2014-11-22 CRAN (R 3.4.0)
 maptools       0.9-2      2017-03-25 cran (@0.9-2)
 memoise        1.1.0      2017-04-21 CRAN (R 3.4.0)
 methods      * 3.4.2      2017-10-04 local
 pillar         1.1.0      2018-01-14 cran (@1.1.0)
 pkgconfig      2.0.1      2017-03-21 CRAN (R 3.4.0)
 purrr          0.2.4      2017-10-18 CRAN (R 3.4.2)
 R6             2.2.2      2017-06-17 CRAN (R 3.4.0)
 Rcpp           0.12.18    2018-07-23 cran (@0.12.18)
 readr          1.1.1      2017-05-16 CRAN (R 3.4.0)
 rgdal          1.2-18     2018-03-17 cran (@1.2-18)
 rgeos          0.3-26     2017-10-31 cran (@0.3-26)
 RJSONIO        1.3-0      2014-07-28 CRAN (R 3.4.0)
 rlang          0.2.1.9000 2018-07-30 Github (r-lib/rlang@d97e73d)
 roxygen2       6.0.1      2017-02-06 CRAN (R 3.4.0)
 rstudioapi     0.7        2017-09-07 CRAN (R 3.4.1)
 sp             1.3-1      2018-06-05 cran (@1.3-1)
 stats        * 3.4.2      2017-10-04 local
 stringi        1.2.2      2018-05-02 cran (@1.2.2)
 stringr        1.3.1      2018-05-10 cran (@1.3.1)
 tibble         1.4.2      2018-01-22 CRAN (R 3.4.3)
 tidyr          0.8.0      2018-01-29 cran (@0.8.0)
 tidyselect     0.2.4      2018-02-26 cran (@0.2.4)
 tools          3.4.2      2017-10-04 local
 utf8           1.1.3      2018-01-03 cran (@1.1.3)
 utils        * 3.4.2      2017-10-04 local
 V8             1.5        2017-04-25 CRAN (R 3.4.0)
 withr          2.1.2      2018-05-02 Github (jimhester/withr@79d7b0d)
 xml2           1.2.0      2018-01-24 cran (@1.2.0)
 yaml           2.1.19     2018-05-01 cran (@2.1.19)
```

josiekre commented 6 years ago

Here is a simple example:

> (s <- dplyr::data_frame(
      id = c("A", "B"),
      lat = c(38.949019, 39.008222),
      lon = c(-77.080369, -76.780363)
  ))
# A tibble: 2 x 3
  id      lat   lon
  <chr> <dbl> <dbl>
1 A      38.9 -77.1
2 B      39.0 -76.8

Let's print out the lon column from the data frame to make sure all the digits we put in are still there. We are looking for 6 digits after the decimal and 6 come out:

> format(s$lon, digits = 10)
[1] "-77.080369" "-76.780363"

Now we'll convert it.

> (g <- geojsonio::geojson_list(s, lat = "lat", lon = "lon"))
$type
[1] "FeatureCollection"

$features
$features[[1]]
$features[[1]]$type
[1] "Feature"

$features[[1]]$geometry
$features[[1]]$geometry$type
[1] "Point"

$features[[1]]$geometry$coordinates
[1] -77.08037  38.94902

$features[[1]]$properties
$features[[1]]$properties$id
[1] "A"

$features[[2]]
$features[[2]]$type
[1] "Feature"

$features[[2]]$geometry
$features[[2]]$geometry$type
[1] "Point"

$features[[2]]$geometry$coordinates
[1] -76.78036  39.00822

$features[[2]]$properties
$features[[2]]$properties$id
[1] "B"

attr(,"class")
[1] "geo_list"
attr(,"from")
[1] "data.frame"

Now when we look again, we have lost a digit (5 instead of 6):

> is(g$features[[1]]$geometry$coordinates)
[1] "numeric" "vector" 
> format(g$features[[1]]$geometry$coordinates, digits = 10)
[1] "-77.08037" " 38.94902"
sckott commented 6 years ago

thanks @josiekre will have a look

sckott commented 6 years ago

one thing is that you're on an older dev version, we're currently on 0.6.0.9100, though I don't think that affects the issue at hand.

have you played with the digits option in R? you can get it by getOption('digits') and set it by options(digits = 8)

options(digits = 8)
s <- dplyr::data_frame(
  id = c("A", "B"),
  lat = c(38.949019, 39.008222),
  lon = c(-77.080369, -76.780363)
)
(g <- geojsonio::geojson_list(input = s, lat = "lat", lon = "lon"))
lapply(g$features, "[[", c("geometry", "coordinates"))
#> [[1]]
#> [1] -77.080369  38.949019
#> 
#> [[2]]
#> [1] -76.780363  39.008222

let me know what you think

josiekre commented 6 years ago

I thought I played with that. In the file output (as opposed to interactive output), I don't think it made a difference. But I will rerun and report back in a couple days. Thx @sckott.

josiekre commented 6 years ago

A call to options(digits = 8) does impact the writing out, but I cannot figure out the relationship between that and digits in the call to geojson_list(). Can you elaborate on that?

sckott commented 6 years ago

Sorry for the delay @josiekre - was on vacation.

The number of digits written to the console depends on the R option digits, so you can set that outside of the geojsonio package yourself and it affects what geojson_list() returns. Does that make sense?
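
One way to see why the global option can matter: if coordinates pass through a text representation formatted with the session's `digits` setting, anything beyond that many significant digits is lost on the way back to numeric. The sketch below uses only base R and assumes (it is not confirmed in this thread) that such a text round trip happens somewhere inside the data frame conversion.

```r
# Assumption: the conversion formats numbers as text using the session-wide
# `digits` option (default 7) and then parses them back to numeric.
old <- options(digits = 7)

lon <- -77.080369
as_text <- format(lon)             # formatted with 7 significant digits
round_trip <- as.numeric(as_text)  # parse the text back into a double

identical(lon, round_trip)         # FALSE: the sixth decimal place is gone
format(round_trip, digits = 10)    # asking for more digits cannot bring it back

options(digits = 10)
identical(lon, as.numeric(format(lon)))  # TRUE: enough digits survive the trip

options(old)                       # restore the previous setting
```

This mirrors the symptom reported above, where `format(..., digits = 10)` on the converted coordinates still shows only five decimals.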

sckott commented 6 years ago

any thoughts @josiekre ?

josiekre commented 6 years ago

I personally think it would be nicer to control the output digits within the function call. It would be better to create a controlled, repeatable function call. I think of options() as things that impact my current R session interactively.

It could be smart to default like this:

geojson_list(..., digits = options('digits')$digits)
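
Until something like that exists, a scoped setting can be approximated from the caller's side. This is a minimal base R sketch: `with_digits()` is a hypothetical helper, not part of geojsonio, and it still changes the global option for the duration of the call before restoring it.

```r
# Hypothetical helper (not part of geojsonio): evaluate `expr` with a
# temporary `digits` option, then restore whatever was set before.
with_digits <- function(digits, expr) {
  old <- options(digits = digits)
  on.exit(options(old), add = TRUE)  # runs even if `expr` throws an error
  force(expr)  # `expr` is lazy, so it is evaluated after the option is set
}

before <- getOption("digits")
txt <- with_digits(10, format(-77.080369))
after <- getOption("digits")

txt                       # formatted under the temporary setting
identical(before, after)  # the session option is unchanged afterwards
```

Assuming the conversion honours the option as discussed in this thread, a call might look like `with_digits(8, geojsonio::geojson_list(s, lat = "lat", lon = "lon"))`. `withr::with_options()` provides the same pattern if withr is already a dependency.
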
sckott commented 6 years ago

thanks @josiekre for the suggestion. i'll think about that.

sckott commented 5 years ago

@josiekre I've experimented with this. There's no way I can see to allow the user to change digits in the output AND not affect the global digits option.

You can format numbers directly with e.g., format, sprintf, etc. but we can't feasibly do that with the complex nested lists, etc. we're dealing with.
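
For plain nested lists, one base R option worth noting is `rapply()`, which can apply a function to every numeric leaf. The sketch below is only an illustration: `round_nested()` is a hypothetical helper, the example list is hand-built rather than produced by geojsonio, and rounding can only reduce precision; it cannot recover digits that were already lost.

```r
# Hypothetical helper: round every numeric (double) leaf of a nested list and
# leave character and integer leaves alone.
round_nested <- function(x, digits) {
  rapply(x, function(v) round(v, digits), classes = "numeric", how = "replace")
}

# Hand-built stand-in for a single GeoJSON feature
feature <- list(
  type = "Feature",
  geometry = list(type = "Point", coordinates = c(-77.080369, 38.949019)),
  properties = list(id = "A")
)

rounded <- round_nested(feature, digits = 4)
rounded$geometry$coordinates
```

Note that this would also round any numeric values stored under `properties`, which may not be wanted.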

I think the best option is to document that if you want to change digits, set them with options(digits = x). Does that sound okay?

josiekre commented 5 years ago

The jsonlite::toJSON() function has a digits option that works in this manner. Have you seen this?
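
For reference, a short sketch of that argument on a plain coordinate vector. It assumes jsonlite is installed; `digits` is the maximum number of decimal places written, and `NA` requests maximum precision.

```r
library(jsonlite)

coords <- c(-77.080369, 38.949019)

toJSON(coords)               # default is digits = 4
toJSON(coords, digits = 6)   # keep up to six decimal places
toJSON(coords, digits = NA)  # maximum precision

# Round trip to check that nothing is lost at six decimal places
back <- fromJSON(toJSON(coords, digits = 6))
all.equal(back, coords)
```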

sckott commented 5 years ago

thanks @josiekre yes, i have seen that, but did forget about it. In geojson_list some functions do go through jsonlite but some do not.

@jeroen is there a way to use the approach you have for controlling digits in jsonlite here? Looks like you drop down to C, as far as I can tell.

sckott commented 5 years ago

@jeroen any thoughts on this ☝️ ?

ChrisJones687 commented 5 years ago

@sckott I might have a simple solution by updating geojson_rw to take precision as a variable since it is used in geojson_write which is called by geojson_rw. I already have the code updated and it works for my use case. I can submit a PR if you think it would be useful for others and not break anything anywhere else. In my testing, it hasn't affected anything else that I have noticed.

sckott commented 5 years ago

thanks @ChrisJones687 ! a PR would be great.

sckott commented 5 years ago

So PR #152 added the ability to manipulate digits for sp class objects, but we don't have a solution for data.frames, vectors, lists, etc.

I tried using sf internally with data.frames just to see if it would be feasible, but it's much slower than what we have now. Oh, and I didn't even check that we can manipulate precision through this route, but I assume we can. Anyway ...

geojson_list_data.frame <- function(input, lat = NULL, lon = NULL, group = NULL,
                                    geometry = "point", type = "FeatureCollection", ...) {

  # guess_latlon() is an internal helper; tmp$lat and tmp$lon are used here
  # as the names of the latitude and longitude columns
  tmp <- guess_latlon(names(input), lat, lon)
  # round trip through sf: write the points to a temporary GeoJSON file ...
  tfile <- tempfile(fileext = ".geojson")
  # use the detected column names instead of hardcoding "lat" and "long"
  tmpsf <- sf::st_as_sf(input, coords = c(tmp$lat, tmp$lon))
  sf::st_write(tmpsf, tfile, quiet = TRUE)
  # ... and read it back in as a nested list
  xx <- as.geo_list(jsonlite::fromJSON(readLines(tfile), TRUE, FALSE, FALSE),
    "data.frame")
  # coords were passed as (lat, lon), so flip each pair to GeoJSON's (lon, lat)
  xx$features <- lapply(xx$features, function(z) {
    z$geometry$coordinates <- rev(as.numeric(z$geometry$coordinates))
    z
  })
  xx$name <- NULL
  return(xx)
}