Add `pair_rc()` to improve row/column pair lookups

StevenHibble-Concurrence commented 6 months ago

Closes #467.

I considered Szudzik's Elegant Pairing Function, but it quickly exceeds .Machine$integer.max. Instead, we create a double vector by taking the row position as the whole number and the column position as the decimal number. This assumes that float precision is better than 1e-7 and that the spreadsheet won't have more than 10M-1 columns.

(1, 1)     ->   1.0000001
(999, 999) -> 999.0000999

StevenHibble-Concurrence commented 6 months ago

I forgot to add some metrics. I took timing and total memory allocation using bench::mark(). Peak RAM usage is from peakRAM::peakRAM().

workbook size | metric | paste() | pair_rc() | change -- | -- | -- | -- | -- medium | time (seconds) | 7.2 | 6.4 | -11% medium | mem_alloc (MB) | 399.8 | 355.7 | -11% medium | peak_mem (MB) | 283.7 | 263.3 | -7% large | time (seconds) | 46.4 | 33.1 | -29% large | mem_alloc (MB) | 3696.6 | 3184.6 | -14% large | peak_mem (MB) | 449.5 | 395.2 | -12%

JanMarvin commented 3 months ago

Thanks!

ycphs / openxlsx

Add `pair_rc()` to improve row/column pair lookups #468