ycphs / openxlsx

openxlsx - a fast way to read and write complex xslx files
https://ycphs.github.io/openxlsx/
Other
223 stars 74 forks source link

Gh issue 307 #309

Closed JanMarvin closed 1 year ago

JanMarvin commented 2 years ago

draft how to keep cols and rows as requested

ℹ Testing openxlsx ✓ | F W S OK | Context ✓ | 6 | active Sheet [1.5s]
✓ | 8 | Style Parsing [0.1s]
✓ | 8 | build_workbook [0.5s]
✓ | 4 | clone Worksheet
✓ | 11 | CommentClass
✓ | 17 | Testing 'topN' and 'bottomN' conditions in conditionalFormatting [0.2s]
✓ | 5 | Date/Time Conversions
✓ | 69 | Deleting tables from worksheets [0.3s]
✓ | 11 | Encoding Tests [0.4s]
x | 7 0 | Fill Merged Cells
──────────────────────────────────────────────────────────────────────────────────────────────────────── Failure (test-fill_merged_cells.R:27:3): fill merged cells names(read.xlsx(tmp_file, fillMergedCells = FALSE)) not equal to c("A", "B", "X3", "X4"). 2/4 mismatches x[3]: "1" y[3]: "X3"

x[4]: "2" y[4]: "X4"

Failure (test-fill_merged_cells.R:28:3): fill merged cells names(read.xlsx(tmp_file, fillMergedCells = TRUE)) not equal to c("A", "B", "B", "B"). 2/4 mismatches x[3]: "B.1" y[3]: "B"

x[4]: "B.2" y[4]: "B"

Failure (test-fill_merged_cells.R:37:3): fill merged cells read.xlsx(tmp_file, fillMergedCells = FALSE) not equal to r1. Names: 2 string mismatches Component “A”: 'is.NA' value mismatch: 0 in current 1 in target Component “B”: 'is.NA' value mismatch: 0 in current 1 in target

Failure (test-fill_merged_cells.R:38:3): fill merged cells read.xlsx(tmp_file, fillMergedCells = TRUE) not equal to r2. Names: 2 string mismatches

Failure (test-fill_merged_cells.R:40:3): fill merged cells read.xlsx(tmp_file, cols = 1:3, fillMergedCells = TRUE) not equal to r2_1. Names: 1 string mismatch

Failure (test-fill_merged_cells.R:41:3): fill merged cells read.xlsx(tmp_file, rows = 1:3, fillMergedCells = TRUE) not equal to r2[1:2, ]. Names: 2 string mismatches

Failure (test-fill_merged_cells.R:42:3): fill merged cells read.xlsx(tmp_file, cols = 1:3, rows = 1:4, fillMergedCells = TRUE) not equal to r2_1[1:3, ]. Names: 1 string mismatch ──────────────────────────────────────────────────────────────────────────────────────────────────────── ✓ | 2 | fontSizeLookupTables
✓ | 14 | Freeze Panes
✓ | 2 | getBaseFont
✓ | 5 | Check Cell Ref
✓ | 6 | Reading from workbook is identical to reading from file [0.5s]
✓ | 18 | Load Workbook Object Tables [0.1s]
✓ | 1 | Load Unzipped Workbook Object [0.4s]
x | 1 27 | Load Workbook Object [13.3s]
──────────────────────────────────────────────────────────────────────────────────────────────────────── Error (test-loading_workbook.R:952:3): Read and save file with inlineStr Error in int2col(cols): x must be numeric. Backtrace:

  1. openxlsx::readWorkbook(fl) test-loading_workbook.R:952:2
  2. openxlsx::read.xlsx.default(...) /home/jmg/Source/openxlsx/R/readWorkbook.R:89:2
  3. openxlsx::requested_frame(rows = sel_rows, cols = sel_cols, fill = TRUE) /home/jmg/Source/openxlsx/R/readWorkbook.R:373:2
  4. openxlsx::int2col(cols) /home/jmg/Source/openxlsx/R/helperFunctions.R:870:2 ──────────────────────────────────────────────────────────────────────────────────────────────────────── x | 4 72 | Named Regions [0.3s]
    ──────────────────────────────────────────────────────────────────────────────────────────────────────── Failure (test-named_regions.R:91:3): Correctly Loading Named Regions Created in Excel cell_f not equal to cell_w. Names: 1 string mismatch

Failure (test-named_regions.R:109:3): Correctly Loading Named Regions Created in Excel cell2_f not equal to cell2_w. Names: 1 string mismatch

Failure (test-named_regions.R:278:3): Missing columns in named regions dim(x) not equal to c(4, 1). 1/2 mismatches [1] 3 - 4 == -1

Failure (test-named_regions.R:286:3): Missing columns in named regions dim(x) not equal to c(4, 1). 2/2 mismatches (average diff: 1) [1] 3 - 4 == -1 [2] 2 - 1 == 1 ──────────────────────────────────────────────────────────────────────────────────────────────────────── Maximum number of failures exceeded; quitting at end of file. Increase this number with (e.g.) Sys.setenv('TESTTHAT_MAX_FAILS' = Inf)

══ Results ═════════════════════════════════════════════════════════════════════════════════════════════ Duration: 18.2 s

[ FAIL 12 | WARN 0 | SKIP 0 | PASS 286 ]

JanMarvin commented 2 years ago

the more I think about it the weirder it gets. the remove empty cols and rows feature only removes entirely empty cells from the Excel. if the Excel contains any value, even if it's converted to missing, it's treated as non empty.

therefore this draft of a pull request (currently) massively changes how openxlsx behaves.

all changes need to be duplicated with read.xlsx.workbook to match the behavior.

JanMarvin commented 2 years ago

This works now. Kinda. The logic is unclear (when does skipEmptyCols/skipEmptyRows remove cells and when not). The modifications to read.xlsx.default should be mimicked in read.xlslx.Workbook. Still, as can be seen, the logic requires some modifications and the output is affected (as can be seen by the many changes to the tests). I'm sure that some parts could be improved, but I do not see it as a viable option for a package this old, to change the default behavior