ycphs / openxlsx

openxlsx - a fast way to read and write complex xslx files
https://ycphs.github.io/openxlsx/
Other
226 stars 75 forks source link

Feature Request: Merge repeated cells by one or more variables #341

Closed darentsai closed 5 months ago

darentsai commented 2 years ago

When data have one or more categorical variables, it's common to sort them and merge repeated cells together when creating an excel file. The argument mergeCols added in writeData() controls which columns need to be merged. Its order indicates how your data is grouped and sorted. An example using esoph data is demonstrated as follows:

df <- with(esoph, esoph[order(agegp, alcgp, tobgp), ])[1:30, ]
wb <- createWorkbook()
addWorksheet(wb, "esoph")
writeData(wb, "esoph", df, borders = "all", mergeCols = 1:3)
addStyle(wb, "esoph", createStyle(halign = "center", valign = "center"),
         rows = 1:(nrow(df)+1), cols = 1:ncol(df), gridExpand = TRUE, stack = TRUE)
saveWorkbook(wb, "writeDataExample.xlsx", overwrite = TRUE)
example
JanMarvin commented 2 years ago

Hi @darentsai , lets begin with "it's common": I have never seen someone doing something like this and I've seen people doing many strange things. Therefore I have do disagree that it's common ;)

Otherwise it looks good. Just a suggestion, maybe the sorting should also be done via writeData? So that we keep the input and output in sync. Are there any drawbacks that I am overlooking?

Final remarks:

codecov-commenter commented 2 years ago

Codecov Report

Merging #341 (8286fca) into master (2ace59a) will decrease coverage by 0.08%. The diff coverage is 7.69%.

@@            Coverage Diff             @@
##           master     #341      +/-   ##
==========================================
- Coverage   67.76%   67.67%   -0.09%     
==========================================
  Files          34       34              
  Lines        8931     8944      +13     
==========================================
+ Hits         6052     6053       +1     
- Misses       2879     2891      +12     
Impacted Files Coverage Δ
R/build_workbook.R 97.36% <ø> (ø)
R/writeData.R 82.90% <7.69%> (-5.44%) :arrow_down:

Continue to review full report at Codecov.

Legend - Click here to learn more Δ = absolute <relative> (impact), ø = not affected, ? = missing data Powered by Codecov. Last update 3ceeb84...8286fca. Read the comment docs.

darentsai commented 2 years ago

@JanMarvin Thanks for your kind reply. My statements may make some misunderstanding. If the data written to an excel file is used for further analysis or modeling, this feature is not practical. But, if the data have been grouped, summarised by some categorical variables and going to be presented in a report, this feature can make the appearance of table much tidier.(I update a new example in my previous comment using another dataset.)

As for the sorting process, I also thought about doing it in writeData. In my opinion, a method that prints data to an external file should not change the input structure (e.g. order). Hence, I presume users have arranged data as they want before writing it out. mergeCols only works on how the data is rendered to excel.

JanMarvin commented 2 years ago

Alright, I agree on the sorting part. Give @ycphs or me a ping when to review (still like to see a test, even if it's just a tiny expect_silent(write...)).

JanMarvin commented 2 years ago

Hi @darentsai , what is the state of this? Ready and just missing a test or did you find some larger issue that is holding this pull request back?

darentsai commented 2 years ago

@JanMarvin Thanks for the reminder. I have tried several kinds of data and adjusted different sets of arguments. I don't find anything wrong so far. I think it's ready to be used but lacks for a test. I'm sorry that I'm not familiar with the test mechanism of package development, so I haven't figured out a proper test example to examine this new feature.

github-actions[bot] commented 10 months ago

This PR is stale because it has been open 45 days with no activity. Remove stale label or comment or this will be closed in 10 days.