Open ramiromagno opened 3 years ago
Thanks for reporting!! That is surprising behavior indeed. I'll take a look and see if this can be improved. To be fair, ..JSON is actually always a list, but it is a "hidden" column, so we just print a character overview of what it looks like in JSON form.
I suspect the problem here is that the "grouped data frame" is no longer a tbl_json. Do you plan to do anything else with the JSON data after the group_by operations? It'd be good to get a sense of the workflow you are shooting for here!
Here's some more context:
#' @importFrom rlang .data
collect_samples <- function(tbl_json) {

  # Gather each sample array, then drop the index column that gather_array adds.
  samples_variants <- tbl_json %>%
    tidyjson::enter_object('samples_variants') %>%
    tidyjson::gather_array(column.name = 'sample_id') %>%
    dplyr::select(-'sample_id')

  samples_training <- tbl_json %>%
    tidyjson::enter_object('samples_training') %>%
    tidyjson::gather_array(column.name = 'sample_id') %>%
    dplyr::select(-'sample_id')

  # Renumber the samples within each (..page, array.index) group.
  all_samples <-
    tidyjson::bind_rows(samples_variants, samples_training) %>%
    dplyr::group_by(.data$..page, .data$array.index) %>%
    dplyr::mutate(sample_id = seq_len(dplyr::n()), .after = 'array.index') %>%
    dplyr::arrange(.data$sample_id, .by_group = TRUE) %>%
    dplyr::ungroup() %>%
    tidyjson::as.tbl_json(json.column = '..JSON') # Needed because of https://github.com/colearendt/tidyjson/issues/135.

  return(all_samples)
}
I found that using tidyjson::as.tbl_json(json.column = '..JSON') works as a workaround.
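For readers who want to try the workaround in isolation, here is a minimal sketch. The toy JSON input and the n_in_doc column are assumptions made up for illustration; only the as.tbl_json(json.column = '..JSON') call comes from the function above. Whether the tbl_json class survives the grouped step may depend on the installed tidyjson version, so the sketch only inspects the classes rather than stating what they will be.

```r
library(tidyjson)
library(dplyr)

# Toy input (an assumption, not data from this thread): two JSON documents.
json <- c('{"samples": [1, 2]}', '{"samples": [3]}')

x <- json %>%
  as.tbl_json() %>%
  enter_object("samples") %>%
  gather_array()

# A grouped mutate followed by ungroup(), as in collect_samples().
grouped <- x %>%
  group_by(document.id) %>%
  mutate(n_in_doc = n()) %>%
  ungroup()

# Inspect the classes before and after the grouped step.
class(x)
class(grouped)

# Restore the tbl_json class from the hidden ..JSON column.
restored <- as.tbl_json(grouped, json.column = "..JSON")
inherits(restored, "tbl_json")
```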
That makes a lot of sense! Thank you for the context!
Grouped mutates are perfect. One of the concerns we have for supporting grouped tibbles is "summarize" operations, which will necessarily destroy the JSON data and make future tidyjson operations largely meaningless. It'd be great if we could find a nice middle way that supports grouped tibbles for certain operations, or at least makes the confusing state here more clear / understandable.
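To make the "summarize" concern concrete, here is a small hedged sketch of a guard around the workaround. The helper name restore_tbl_json and its json_column argument are hypothetical and not part of tidyjson; the sketch assumes the JSON data lives in a ..JSON column, as in the workaround above. It restores the tbl_json class only when that column is still present and fails with an explicit message otherwise, for example after a summarise() that dropped it.

```r
# Hypothetical helper (not part of tidyjson): restore the tbl_json class
# only if the JSON column survived the preceding dplyr operations.
restore_tbl_json <- function(df, json_column = "..JSON") {
  if (!json_column %in% names(df)) {
    stop(
      "Column '", json_column, "' is missing; the JSON data was dropped, ",
      "e.g. by summarise().",
      call. = FALSE
    )
  }
  tidyjson::as.tbl_json(df, json.column = json_column)
}

# A data frame without the JSON column triggers the error branch.
msg <- tryCatch(
  restore_tbl_json(data.frame(a = 1)),
  error = function(e) conditionMessage(e)
)
print(msg)
```

An explicit error like this is one way to make the confusing state more visible, though it does nothing to preserve the JSON itself.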
btw, I should probably create a separate issue, but something I find myself doing too often is dropping the index column that gather_array has just created, like in the example above:
tidyjson::gather_array(column.name = 'sample_id') %>%
  dplyr::select(-'sample_id')
Do you think it would be possible to allow the column.name argument to accept a special value that would automatically drop the index column?
Perhaps?
tidyjson::gather_array(column.name = NULL)
Yeah, I like that idea!
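Until something like column.name = NULL is supported, the pattern can be wrapped in a small helper. This is only a sketch: gather_array_no_index and its tmp_name argument are hypothetical names, not part of tidyjson, and the JSON array is a made-up example. It relies only on gather_array(column.name = ...) and dplyr::select(), both of which are used in the thread above.

```r
# Hypothetical wrapper (not part of tidyjson): gather an array and
# immediately drop the index column that gather_array() creates.
gather_array_no_index <- function(.x, tmp_name = "tmp_array_index") {
  out <- tidyjson::gather_array(.x, column.name = tmp_name)
  dplyr::select(out, -dplyr::all_of(tmp_name))
}

# Usage with a toy JSON array (an assumption for illustration).
res <- gather_array_no_index(tidyjson::as.tbl_json("[1, 2, 3]"))
names(res)
```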
Here ..JSON is of type character:
Now it is of type list (scroll to the right):