ropensci / geojsonio

Convert many data formats to & from GeoJSON & TopoJSON
https://docs.ropensci.org/geojsonio

NULL value from geojson_list is converted to `{ }` when using `geojsonio::geojson_sf` #199

Open jdlom opened 1 year ago

jdlom commented 1 year ago
library(dplyr, warn.conflicts = FALSE)
#> Warning: package 'dplyr' was built under R version 4.2.3

data <- tibble::tribble(
  ~x, ~y,  ~latitude, ~longitude,
  "a", NULL,   49.8, 0.48,
  NULL, 1,  48.7, 0.55
)

geojsonio::geojson_list(input = data) %>%
  geojsonio::geojson_sf()
#> Registered S3 method overwritten by 'geojsonsf':
#>   method        from   
#>   print.geojson geojson
#> Assuming 'longitude' and 'latitude' are longitude and latitude, respectively
#> Simple feature collection with 2 features and 2 fields
#> Geometry type: POINT
#> Dimension:     XY
#> Bounding box:  xmin: 0.48 ymin: 48.7 xmax: 0.55 ymax: 49.8
#> Geodetic CRS:  WGS 84
#>     x   y          geometry
#> 1   a { } POINT (0.48 49.8)
#> 2 { }   1 POINT (0.55 48.7)

packageVersion("geojsonio")
#> [1] '0.10.0'
mikemahoney218 commented 1 year ago

Thanks for the report, @jdlom! Highly appreciated.

Can you give me a bit of description about where this came up, and what you'd expect to happen instead?

Without digging deeper, I think the core issue is that geojsonio isn't really built to handle list columns as inputs, and without list columns you'd never have NULL in a data.frame. Strikes me that the behavior when the list is of length > 1 is also not optimal:

data <- tibble::tribble(
  ~x, ~y,  ~latitude, ~longitude,
  c("a, b"), NULL,   49.8, 0.48,
  NULL, 1,  48.7, 0.55
)

class(data$x)
#> [1] "list"

# Weird for x to be concatenated like that:
geojsonio::geojson_list(input = data) |> 
  geojsonio::geojson_sf()
#> Registered S3 method overwritten by 'geojsonsf':
#>   method        from   
#>   print.geojson geojson
#> Assuming 'longitude' and 'latitude' are longitude and latitude, respectively
#> Simple feature collection with 2 features and 2 fields
#> Geometry type: POINT
#> Dimension:     XY
#> Bounding box:  xmin: 0.48 ymin: 48.7 xmax: 0.55 ymax: 49.8
#> Geodetic CRS:  WGS 84
#>      x   y          geometry
#> 1 a, b { } POINT (0.48 49.8)
#> 2  { }   1 POINT (0.55 48.7)

Created on 2023-04-19 with reprex v2.0.2

So my question is: where does this issue come up for you, and are there ways to avoid having list columns in your input? (As a workaround; I'd agree the current behavior isn't optimal).
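As a rough sketch of that workaround, list columns whose cells are all either NULL or a single value could be flattened to ordinary vectors, with NA in place of NULL, before calling geojson_list. The helper name `null_to_na` and the sample input are made up for illustration; nothing here is part of geojsonio.

```r
# Hypothetical helper (not part of geojsonio): turn a list column whose
# elements are all NULL or length-1 values into an atomic vector with NA.
null_to_na <- function(col) {
  if (!is.list(col)) return(col)           # already atomic, nothing to do
  if (any(lengths(col) > 1)) return(col)   # leave genuine list columns alone
  col[lengths(col) == 0] <- NA             # NULL -> NA
  unlist(col)
}

# Made-up columns mirroring the reprex: each one mixes a value and a NULL
cols <- list(x = list("a", NULL), y = list(NULL, 1))
flat <- as.data.frame(lapply(cols, null_to_na), stringsAsFactors = FALSE)
str(flat)
```

For a data frame or tibble, the same helper could be applied in place with `data[] <- lapply(data, null_to_na)`. Columns with cells longer than one element are returned unchanged, so they would still need separate handling.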

jdlom commented 1 year ago

I would prefer to get NA for NULL values, though I don't know whether that's a good idea. I'm not a specialist :)

I fetch GeoJSON from an API (multiple calls). To use geojson_sf, I give the geojson_list class to my list and then I call geojson_sf. Unfortunately, there are NULL values.

I've just checked, and jsonlite::toJSON accepts a null argument. It defaults to "list".
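To illustrate the two settings, here is a small sketch that uses only jsonlite. The `props` list is made up for the example and stands in for the properties of one feature.

```r
library(jsonlite)

# Made-up feature properties with one NULL value
props <- list(x = "a", y = NULL)

# null = "list" (the default): NULL is written as an empty JSON object
as_list <- toJSON(props, auto_unbox = TRUE, null = "list")

# null = "null": NULL is written as JSON null
as_null <- toJSON(props, auto_unbox = TRUE, null = "null")

print(as_list)  # {"x":"a","y":{}}
print(as_null)  # {"x":"a","y":null}
```

The empty object in the first result is presumably what ends up displayed as `{ }` after the round trip through geojson_sf.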

library(dplyr, warn.conflicts = FALSE)
#> Warning: package 'dplyr' was built under R version 4.2.3

data <- tibble::tribble(
  ~x, ~y,  ~latitude, ~longitude,
  "a", NULL,   49.8, 0.48,
  NULL, 1,  48.7, 0.55
)

my_list <- geojsonio::geojson_list(input = data)
#> Registered S3 method overwritten by 'geojsonsf':
#>   method        from   
#>   print.geojson geojson
#> Assuming 'longitude' and 'latitude' are longitude and latitude, respectively

my_list <- geojsonio::geojson_list(input = data)
#> Assuming 'longitude' and 'latitude' are longitude and latitude, respectively
geojsonio:::tosf(geojsonio:::to_json(my_list, null = "null"), stringsAsFactors = FALSE)
#> Simple feature collection with 2 features and 2 fields
#> Geometry type: POINT
#> Dimension:     XY
#> Bounding box:  xmin: 0.48 ymin: 48.7 xmax: 0.55 ymax: 49.8
#> Geodetic CRS:  WGS 84
#>      x  y          geometry
#> 1    a NA POINT (0.48 49.8)
#> 2 <NA>  1 POINT (0.55 48.7)

Maybe you could add this:

#' @export
as.json.geo_list <- function(x, ...) to_json(unclass(x), null = "null", ...)
jdlom commented 1 year ago

I just made a PR. You can close it if you think it's not a good idea.

jdlom commented 1 year ago

For context:

library(dplyr, quietly = TRUE, warn.conflicts = FALSE)
url <- "https://hubeau.eaufrance.fr/api/v1/hydrometrie/referentiel/sites?code_departement=76&code_site=G0170420&format=geojson&size=20"
res <- httr::GET(url)
l <- httr::content(res, "parsed")
class(l) <- "geo_list"
df <- l %>% geojsonio::geojson_sf()
#> Registered S3 method overwritten by 'geojsonsf':
#>   method        from   
#>   print.geojson geojson
df
#> Simple feature collection with 1 feature and 34 fields
#> Geometry type: POINT
#> Dimension:     XY
#> Bounding box:  xmin: 1.447744 ymin: 50.04607 xmax: 1.447744 ymax: 50.04607
#> Geodetic CRS:  WGS 84
#>   latitude_site premier_mois_annee_hydro_site code_cours_eau
#> 1      50.04607                             9       G01-0400
#>   influence_generale_site statut_site libelle_cours_eau surface_bv code_site
#> 1                     { }           1         La Bresle        693  G0170420
#>   code_zone_hydro_site coordonnee_x_site code_systeme_alti_site
#> 1                 G017            588714                      3
#>   code_troncon_hydro_site type_site code_projection code_entite_hydro_site
#> 1                G0170400      REEL              26               G01-0400
#>   coordonnee_y_site                libelle_site date_premiere_donnee_dispo_site
#> 1           6995278 La Bresle à Ponts-et-Marais                             { }
#>   commentaire_influence_generale_site longitude_site commentaire_site
#> 1                                 { }       1.447744              { }
#>   date_maj_site premier_mois_etiage_site grandeur_hydro altitude_site
#> 1    2021-11-26                        1              Q             7
#>                         uri_cours_eau code_region type_contexte_loi_stat_site
#> 1 http://id.eaufrance.fr/CEA/G01-0400          28                           1
#>   libelle_departement code_departement libelle_commune code_commune_site
#> 1      SEINE-MARITIME               76 PONTS-ET-MARAIS             76507
#>   type_loi_site libelle_region                  geometry
#> 1             2      NORMANDIE POINT (1.447744 50.04607)

I know I could read the URL directly with sf because it's GeoJSON, but the API is limiting and I need to use httr::GET.

If you think this is not a good idea, we could find another way:

#' @export
geojson_sf.geo_list <- function(x, stringsAsFactors = FALSE, null = "list", ...) {
  tosf(as.json(x, null = null), stringsAsFactors = stringsAsFactors, ...)
}
library(dplyr, quietly = TRUE, warn.conflicts = FALSE)
url <- "https://hubeau.eaufrance.fr/api/v1/hydrometrie/referentiel/sites?code_departement=76&code_site=G0170420&format=geojson&size=20"
res <- httr::GET(url)
l <- httr::content(res, "parsed")
class(l) <- "geo_list"
df <- l %>% geojsonio::geojson_sf(null = "null")
#> Registered S3 method overwritten by 'geojsonsf':
#>   method        from   
#>   print.geojson geojson
df
#> Simple feature collection with 1 feature and 34 fields
#> Geometry type: POINT
#> Dimension:     XY
#> Bounding box:  xmin: 1.447744 ymin: 50.04607 xmax: 1.447744 ymax: 50.04607
#> Geodetic CRS:  WGS 84
#>   latitude_site premier_mois_annee_hydro_site code_cours_eau
#> 1      50.04607                             9       G01-0400
#>   influence_generale_site statut_site libelle_cours_eau surface_bv code_site
#> 1                    <NA>           1         La Bresle        693  G0170420
#>   code_zone_hydro_site coordonnee_x_site code_systeme_alti_site
#> 1                 G017            588714                      3
#>   code_troncon_hydro_site type_site code_projection code_entite_hydro_site
#> 1                G0170400      REEL              26               G01-0400
#>   coordonnee_y_site                libelle_site date_premiere_donnee_dispo_site
#> 1           6995278 La Bresle à Ponts-et-Marais                            <NA>
#>   commentaire_influence_generale_site longitude_site commentaire_site
#> 1                                <NA>       1.447744             <NA>
#>   date_maj_site premier_mois_etiage_site grandeur_hydro altitude_site
#> 1    2021-11-26                        1              Q             7
#>                         uri_cours_eau code_region type_contexte_loi_stat_site
#> 1 http://id.eaufrance.fr/CEA/G01-0400          28                           1
#>   libelle_departement code_departement libelle_commune code_commune_site
#> 1      SEINE-MARITIME               76 PONTS-ET-MARAIS             76507
#>   type_loi_site libelle_region                  geometry
#> 1             2      NORMANDIE POINT (1.447744 50.04607)