Open jeremystan opened 10 years ago
First pass at this was difficult... a function factory may be too complex or abstract. Perhaps it can be broken into smaller functions?
We should also bundle a gather_values or gather_items feature into this refactor. Currently, we don't have an elegant way to treat the values of a large JSON dictionary as an array, and input JSON data that doesn't start as an array is not valuable for us.
For example, it's currently difficult to extract color from this structure
{"a": {"color": "blue"},
"b": {"color": "red"},
"c": {"color": "blue"}
}
without using lapply like this:
# loop over the top-level keys and pull color out from under each one
lapply((json %>% gather_keys)$key,
  function(key) {
    json %>% spread_values(color = jstring(key, "color"))
  }) %>%
  rbind_all
It would be nicer to do
json %>% gather_values %>% spread_values("color")
# or
json %>% gather_items %>% spread_values("color") ## for gather_items, I guess we could create a column called "key.1" or something
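To make the proposal concrete, here is a rough sketch of what gather_items might do to the structure above. It is not tidyjson code: gather_items_sketch is a hypothetical helper, and it uses jsonlite directly on the JSON string rather than a tbl_json, so treat it only as an illustration of "values of a dictionary as an array, with the key kept alongside".

```r
library(jsonlite)

json <- '{"a": {"color": "blue"},
          "b": {"color": "red"},
          "c": {"color": "blue"}}'

# Hypothetical helper (an assumption, not part of tidyjson):
# split a JSON object into its keys and an unnamed list of its values.
gather_items_sketch <- function(json_string) {
  parsed <- fromJSON(json_string, simplifyVector = FALSE)
  stopifnot(is.list(parsed), !is.null(names(parsed)))
  list(keys = names(parsed), values = unname(parsed))
}

items <- gather_items_sketch(json)

# Once the values are a plain array, extracting color is one vectorised call.
result <- data.frame(
  key = items$keys,
  color = vapply(items$values, function(v) v[["color"]], character(1)),
  stringsAsFactors = FALSE
)
```

Dropping the keys element would give the gather_values behaviour; keeping it corresponds to the extra key column suggested for gather_items.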
To extract color from this JSON:
json <- '{"a": {"color": "blue"},
"b": {"color": "red"},
"c": {"color": "blue"}
}'
the following works:
json %>% as.tbl_json %>% gather_keys("letter") %>% spread_values(color = jstring("color"))
  document.id letter color
1           1      a  blue
2           1      b   red
3           1      c  blue
gather_keys and gather_array have very similar formats. Can they be unified in some way that simplifies the code?
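One possible shape for that unification, sketched in base R on already-parsed lists rather than on tbl_json objects. All names here (gather_sketch, gather_keys_sketch, gather_array_sketch) are hypothetical, and the sketch only builds the index columns; the real functions also carry the JSON values along, which is left out. The assumption is that the two gathers differ only in how the index column is computed.

```r
# Hypothetical shared helper: expand each parsed document into one row per
# element. index_fun decides what goes in the index column.
gather_sketch <- function(docs, column_name, index_fun) {
  rows <- lapply(seq_along(docs), function(i) {
    elements <- docs[[i]]
    data.frame(
      document.id = rep(i, length(elements)),
      index = index_fun(elements),
      stringsAsFactors = FALSE
    )
  })
  out <- do.call(rbind, rows)
  names(out)[2] <- column_name
  out
}

# Objects are indexed by key name (expects every element to be named).
gather_keys_sketch <- function(docs, column_name = "key") {
  gather_sketch(docs, column_name, function(x) as.character(names(x)))
}

# Arrays are indexed by position.
gather_array_sketch <- function(docs, column_name = "array.index") {
  gather_sketch(docs, column_name, function(x) seq_along(x))
}

docs <- list(list(a = 1, b = 2), list(c = 3))
keys <- gather_keys_sketch(docs, "letter")
arr <- gather_array_sketch(list(list("x", "y")))
```

Passing the index function as an argument keeps each public function to a one-line wrapper, which may be easier to follow than a function factory; type checking of the input (object versus array) would still need to live somewhere.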