Closed sbihorel closed 4 years ago
Sorry @hughjonesd, I don't know how to do pull request, so I posted my suggested code as a feature issue.
Hey, that's cool and high effort. Thank you.
Doing a pull request is going to make it much easier to test, review and (perhaps!) integrate this, and it's a useful skill, so I'll tell you how.
Next up, I'll look at this code and give you my thoughts.
So my first question is, what's the use case? When did you need this function? Who else would need it?
The use case for merge_repeated_rows()
is that often, a particular column reflects some grouping data that could be represented in a huxtable by a single piece of text. For example, look at the Species
column in iris
.
I can see something similar being true across a particular row, but it seems rarer. For most data, especially if it's in "tidy" format, different columns represent fundamentally different pieces of data. The exception is data in wide format where you might have e.g. "rank in week 1", "rank in week 2", etc. Though there again, it's not clear you'd often want to merge across those columns – it would be a very specific use case.
This is something more general again. I wonder how often people will have a "square" of data that they want to turn into a single cell.
My second comment is, the function is long and complex. I can see it being hard to maintain... and huxtable has a lot of hard-to-maintain code already, written by yours truly :-) ....
Overall then, I'm not convinced of this. But I do appreciate the effort. A simpler merge_repeated_cols
(or columns
? Hmm...) might have a better chance of getting in. Another project I'm thinking about is something to take a "grouping" column and turn it into a header row running across the top of the group. (I think the gt
package has something similar.)
A few minor points:
assertthat
to check arguments on your way in.rows
and columns
should accept arguments in huxtable style, see merge_repeated_rows
for one way to do this.rows
or cols
are outside the range of the huxtable, I'd throw an error rather than trying to guess the user's intention. get_rc_spec
will do that for you, again, see merge_repeated_rows
.Hi David,
Thanks for the step-by-step instructions on github pull requests. I will look into it.
Regarding your questions:
minor points:
merging by column or row first matters when the table contains something like the following (see A's): X1 X2 X3 A A B A C D
merging by column first would return: X1 X2 X3 A B A C D
merging by row first would return: X1 X2 X3 A A B C D
Cheers
Sebastien
On Sun, Jun 7, 2020 at 1:48 AM David Hugh-Jones notifications@github.com wrote:
So my first question is, what's the use case? When did you need this function? Who else would need it?
The use case for merge_repeated_rows() is that often, a particular column reflects some grouping data that could be represented in a huxtable by a single piece of text. For example, look at the Species column in iris.
I can see something similar being true across a particular row, but it seems rarer. For most data, especially if it's in "tidy" format, different columns represent fundamentally different pieces of data. The exception is data in wide format where you might have e.g. "rank in week 1", "rank in week 2", etc. Though there again, it's not clear you'd often want to merge across those columns – it would be a very specific use case.
This is something more general again. I wonder how often people will have a "square" of data that they want to turn into a single cell.
My second comment is, the function is long and complex. I can see it being hard to maintain... and huxtable has a lot of hard-to-maintain code already, written by yours truly :-) ....
Overall then, I'm not convinced of this. But I do appreciate the effort. A simpler merge_repeated_cols (or columns? Hmm...) might have a better chance of getting in. Another project I'm thinking about is something to take a "grouping" column and turn it into a header row running across the top of the group. (I think the gt package has something similar.)
A few minor points:
- use assertthat to check arguments on your way in.
- rows and columns should accept arguments in huxtable style, see merge_repeated_rows for one way to do this.
- Why would it matter whether you merge columns or rows first?
- If rows or cols are outside the range of the huxtable, I'd throw an error rather than trying to guess the user's intention. get_rc_spec will do that for you, again, see merge_repeated_rows.
— You are receiving this because you authored the thread. Reply to this email directly, view it on GitHub https://github.com/hughjonesd/huxtable/issues/156#issuecomment-640161543, or unsubscribe https://github.com/notifications/unsubscribe-auth/AETTNP6CK4O3C4DTGTDGT3DRVMS2LANCNFSM4NV7LPCA .
Hi David,
Correct me if I am wrong but I don't think that get_rc_specs errors out when coordinates are outside the dimension of the huxtable:
ht <- hux( a = 1:2, b = 3:4 ) rows <- 4:8 cols <- 3:5
huxtable:::get_rc_spec(ht, rows, 1)
huxtable:::get_rc_spec(ht, cols, 1)
Executed example:
ht <- hux(
- a = 1:2,
- b = 3:4
- ) rows <- 4:8 cols <- 3:5
Still returns 4:8
huxtable:::get_rc_spec(ht, rows, 1) [1] 4 5 6 7 8
Still returns 3:5
huxtable:::get_rc_spec(ht, cols, 1) [1] 3 4 5 sessionInfo() R version 3.6.3 (2020-02-29) Platform: x86_64-pc-linux-gnu (64-bit) Running under: Linux Mint 19.3
Matrix products: default BLAS: /usr/lib/x86_64-linux-gnu/blas/libblas.so.3.7.1 LAPACK: /usr/lib/x86_64-linux-gnu/lapack/liblapack.so.3.7.1
locale: [1] LC_CTYPE=en_US.UTF-8 LC_NUMERIC=C LC_TIME=en_US.UTF-8 LC_COLLATE=en_US.UTF-8 LC_MONETARY=en_US.UTF-8 LC_MESSAGES=en_US.UTF-8 [7] LC_PAPER=en_US.UTF-8 LC_NAME=C LC_ADDRESS=C LC_TELEPHONE=C LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C
attached base packages: [1] stats graphics grDevices utils datasets methods base
other attached packages: [1] jsonlite_1.6.1 huxtable_4.7.1
loaded via a namespace (and not attached): [1] zip_2.0.4 Rcpp_1.0.4.6 pillar_1.4.3 compiler_3.6.3 base64enc_0.1-3 tools_3.6.3 digest_0.6.25 uuid_0.1-4 evaluate_0.14 lifecycle_0.2.0 [11] tibble_3.0.1 lattice_0.20-41 pkgconfig_2.0.3 rlang_0.4.6 rstudioapi_0.11 xfun_0.13 officer_0.3.11 dplyr_0.8.5 stringr_1.4.0 knitr_1.28 [21] xml2_1.3.2 generics_0.0.2 gdtools_0.2.2 vctrs_0.2.4 systemfonts_0.2.2 grid_3.6.3 tidyselect_1.0.0 glue_1.4.0 data.table_1.12.8 R6_2.4.1 [31] rmarkdown_2.1 purrr_0.3.4 magrittr_1.5 ellipsis_0.3.0 htmltools_0.4.0 assertthat_0.2.1 flextable_0.5.10 stringi_1.4.6 crayon_1.3.4 zoo_1.8-8
On Sun, Jun 7, 2020 at 1:48 AM David Hugh-Jones notifications@github.com wrote:
So my first question is, what's the use case? When did you need this function? Who else would need it?
The use case for merge_repeated_rows() is that often, a particular column reflects some grouping data that could be represented in a huxtable by a single piece of text. For example, look at the Species column in iris.
I can see something similar being true across a particular row, but it seems rarer. For most data, especially if it's in "tidy" format, different columns represent fundamentally different pieces of data. The exception is data in wide format where you might have e.g. "rank in week 1", "rank in week 2", etc. Though there again, it's not clear you'd often want to merge across those columns – it would be a very specific use case.
This is something more general again. I wonder how often people will have a "square" of data that they want to turn into a single cell.
My second comment is, the function is long and complex. I can see it being hard to maintain... and huxtable has a lot of hard-to-maintain code already, written by yours truly :-) ....
Overall then, I'm not convinced of this. But I do appreciate the effort. A simpler merge_repeated_cols (or columns? Hmm...) might have a better chance of getting in. Another project I'm thinking about is something to take a "grouping" column and turn it into a header row running across the top of the group. (I think the gt package has something similar.)
A few minor points:
- use assertthat to check arguments on your way in.
- rows and columns should accept arguments in huxtable style, see merge_repeated_rows for one way to do this.
- Why would it matter whether you merge columns or rows first?
- If rows or cols are outside the range of the huxtable, I'd throw an error rather than trying to guess the user's intention. get_rc_spec will do that for you, again, see merge_repeated_rows.
— You are receiving this because you authored the thread. Reply to this email directly, view it on GitHub https://github.com/hughjonesd/huxtable/issues/156#issuecomment-640161543, or unsubscribe https://github.com/notifications/unsubscribe-auth/AETTNP6CK4O3C4DTGTDGT3DRVMS2LANCNFSM4NV7LPCA .
I understand. I think this probably isn't useful for most guys. Having seen your example, I wonder whether running merge_repeated_rows()
followed by merge_repeated_columns()
(written along the same lines) would do what you need. You'd want to avoid huxtable complaining about overwriting existing multicolumn cells, which might be tricky. Incidentally, if you want to play this, t(merge_repeated_rows(t(hux)))
is a quick hack for merge_repeated_columns()
.
Thank you for that last point. I wonder if I should fix it. Presumably errors are gonna turn up somewhere....
Describe what is missing huxtable does not provide any function to automatically merge repeated cells by column and/or row.
Describe your proposed solution See code below for new proposed function merge_repeated_cells
Target audience All users
Additional context