ropensci / geojsonio

Convert many data formats to & from GeoJSON & TopoJSON
https://docs.ropensci.org/geojsonio
Other
150 stars 59 forks source link

replace NULL value (empty list) by NA #200

Open jdlom opened 1 year ago

jdlom commented 1 year ago

issue #199

mikemahoney218 commented 1 year ago

The main challenge with this approach is that this breaks nested lists:

tibble::tibble(x = c(list(NULL), 1, list(c("a", "b"))), latitude = 42, longitude = 76) |> 
      geojsonio::geojson_list() |> 
      geojsonio::geojson_sf()
#> Assuming 'longitude' and 'latitude' are longitude and latitude, respectively
#> Error in `[[<-.data.frame`(`*tmp*`, nm.lc[i], value = list(character(0),  : 
#>   replacement has 3 rows, data has 0

I think in the best case scenario, you wouldn't have NULLs inside list columns in the first place; I'm hoping to find time to look at your context in the issue to see if httr can be used to get out of this situation, so you don't wind up with such a weird tibble (I've never seen a NULL inside a list column in a tibble before! I can only make a standard data.frame do that via list2DF). Failing that, I think we might be able to spot-convert NULL items in lists to NA items instead (at least for list elements which aren't themselves lists).

tl;dr I'm worried about fixing NULLs inside of list columns specifically, rather than thinking about how we handle list columns more holistically.

(edit to add: None of the revdeps appear to test using nested lists here:

── CHECK ────────────────────────────────────────────────────────────────────── 15 packages ──
✔ leaftime 0.2.0                         ── E: 0     | W: 0     | N: 1                        
✔ csdata 2022.11.22                      ── E: 0     | W: 0     | N: 0                        
✔ highcharter 0.9.4                      ── E: 0     | W: 0     | N: 1                        
✔ bRacatus 1.0.11                        ── E: 0     | W: 0     | N: 1                        
✔ antaresViz 0.17.1                      ── E: 0     | W: 0     | N: 0                        
✔ mregions 0.1.8                         ── E: 0     | W: 0     | N: 1                        
✔ MODIStsp 2.0.9                         ── E: 0     | W: 0     | N: 0                        
✔ mapping 1.3                            ── E: 0     | W: 0     | N: 1                        
✔ rgee 1.1.5                             ── E: 0     | W: 0     | N: 2                        
✔ rmapzen 0.4.4                          ── E: 0     | W: 0     | N: 1                        
✔ StratifiedSampling 0.4.1               ── E: 1     | W: 0     | N: 0                        
✔ valhallr 0.1.0                         ── E: 0     | W: 0     | N: 2                        
✔ spectralR 0.1.2                        ── E: 0     | W: 0     | N: 0                        
✔ webglobe 1.0.3                         ── E: 0     | W: 0     | N: 1                        
✔ sen2r 1.5.4                            ── E: 0     | W: 0     | N: 0                        
OK: 15                                                                                      
BROKEN: 0

~Though I didn't look into that error in StratifiedSampling, so might be related~ not related).

jdlom commented 1 year ago

The main challenge with this approach is that this breaks nested lists:

tibble::tibble(x = c(list(NULL), 1, list(c("a", "b"))), latitude = 42, longitude = 76) |> 
      geojsonio::geojson_list() |> 
      geojsonio::geojson_sf()
#> Assuming 'longitude' and 'latitude' are longitude and latitude, respectively
#> Error in `[[<-.data.frame`(`*tmp*`, nm.lc[i], value = list(character(0),  : 
#>   replacement has 3 rows, data has 0

I don't understand, it's supposed to work ? I got the same error with the last version (commit 321c679)

mikemahoney218 commented 1 year ago

With the current version of geojsonio:

> tibble::tibble(x = c(list(NULL), 1, list(c("a", "b"))), latitude = 42, longitude = 76) |> 
+     geojsonio::geojson_list() |> 
+     geojsonio::geojson_sf()
Registered S3 method overwritten by 'geojsonsf':
  method        from   
  print.geojson geojson
Assuming 'longitude' and 'latitude' are longitude and latitude, respectively
Simple feature collection with 3 features and 1 field
Geometry type: POINT
Dimension:     XY
Bounding box:  xmin: 76 ymin: 42 xmax: 76 ymax: 42
Geodetic CRS:  WGS 84
             x      geometry
1          { } POINT (76 42)
2            1 POINT (76 42)
3 [ "a", "b" ] POINT (76 42)
jdlom commented 1 year ago

That's strange. I have reinstalled the package and I got almost the same error.

library(dplyr)
#> 
#> Attaching package: 'dplyr'
#> The following objects are masked from 'package:stats':
#> 
#>     filter, lag
#> The following objects are masked from 'package:base':
#> 
#>     intersect, setdiff, setequal, union

tibble::tibble(x = c(list(NULL), 1, list(c("a", "b"))), latitude = 42, longitude = 76) |> 
geojsonio::geojson_list() |> 
geojsonio::geojson_sf()
#> Registered S3 method overwritten by 'geojsonsf':
#>   method        from   
#>   print.geojson geojson
#> Assuming 'longitude' and 'latitude' are longitude and latitude, respectively
#> Error in `[[<-.data.frame`(`*tmp*`, nm.lc[i], value = list("{ }", "1", : replacement has 3 rows, data has 0

packageVersion("geojsonio")
#> [1] '0.11.0.9000'
packageVersion("jsonlite")
#> [1] '1.8.4'
packageVersion("sf")
#> [1] '1.0.12'