colearendt / tidyjson

Tidy your JSON data in R with tidyjson

Should spread_all discard scalar values from associated JSON? #77

Open jeremystan opened 8 years ago

jeremystan commented 8 years ago

Currently, spread_all leaves the associated JSON as is:

'{"a": 1, "b": [1, 2, 3]}' %>% spread_all
#> # A tbl_json: 1 x 2 tibble with a "JSON" attribute
#>    `attr(., "JSON")` document.id     a
#>                <chr>       <int> <dbl>
#> 1 {"a":1,"b":[1,2...           1     1

Perhaps instead it should strip away the scalar values it has already spread into columns:

'{"a": 1, "b": [1, 2, 3]}' %>% spread_all
#> # A tbl_json: 1 x 2 tibble with a "JSON" attribute
#>    `attr(., "JSON")` document.id     a
#>                <chr>       <int> <dbl>
#> 1 {"b":[1,2,3]}                1     1

This makes sense since the scalar values are already captured as columns of the tbl_json object, and it would make it easier to see that the next step should be enter_object and then gather_array.
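For reference, here is a minimal sketch of that follow-up step under the current behavior, assuming the tidyjson package is installed. The name `result` and the output column name "b" are illustrative choices, not part of the proposal.

```r
library(tidyjson)

# Spread the scalar "a" into a column, then descend into the array "b"
# and stack its elements into one row each.
result <- '{"a": 1, "b": [1, 2, 3]}' %>%
  spread_all %>%
  enter_object(b) %>%          # the JSON attribute is now the array under "b"
  gather_array %>%             # one row per array element, plus array.index
  append_values_number("b")    # copy each element into a numeric column "b"

result
```

Each row keeps the value of `a` from the parent object, so nothing is lost by stripping the scalars from the JSON attribute before this step.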

colearendt commented 7 years ago

This could be very tricky to implement if there are multiple objects, though. For example, when recursive = TRUE, would you also strip away the values that were captured from nested objects?
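To make the nested case concrete, here is a small sketch, assuming tidyjson is installed. The nested key `c` and its member `d` are made up for illustration and do not come from the original example.

```r
library(tidyjson)

# With recursive = TRUE (the default), scalars inside nested objects are
# spread too, and their column names are joined with "." by default.
nested <- '{"a": 1, "c": {"d": 2}, "b": [1, 2, 3]}' %>%
  spread_all(recursive = TRUE)

names(nested)
```

Here `d` ends up in a column named `c.d`, so stripping captured values would mean removing `d` from inside `c` and then deciding whether to keep the now-empty object `c` in the JSON.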

My thought is that spread_values does not have this behavior, so spread_all probably should not. Or maybe both should have that behavior, but I think it makes sense for their behavior to be consistent.