hughjonesd / huxtable

An R package to create styled tables in multiple output formats, with a friendly, modern interface.
http://hughjonesd.github.io/huxtable
Other
321 stars 28 forks source link

Numbers being reformatted automatically even when autoformat is set to FALSE #240

Closed raiphilibert closed 1 year ago

raiphilibert commented 1 year ago

Describe the bug Numbers are being reformatted even if autoformat is set to false and number_format set to NA. Trailing zeroes of character strings are being lost.

Minimal Reproducible Example

library(huxtable)
options(scipen = 999)

df <-  data.frame(a = rep(1000, 6),
b = rep(1000.005, 6),
c = rep(0.0001, 6),
d = rep("1.0", 6),
e = rep("3.20 (s.e. 1.40)", 6),
f = c("<3.20", "6.0","5","6","10","<1.00"))
ht <- huxtable(
 df,
  autoformat = F
)

number_format(ht) <- NA
ht

quick_xlsx(ht) ### column d gets converted to 1 losing it's trailing zero

Expected behavior Expected behaviour in the excel file "1.0" and "6.0" would stay as is. But this is not the case. huxtable-output.xlsx System details R version 4.2.1 (2022-06-23 ucrt) Platform: x86_64-w64-mingw32/x64 (64-bit) Running under: Windows 10 x64 (build 22000)

Matrix products: default

locale: [1] LC_COLLATE=English_Canada.utf8 LC_CTYPE=English_Canada.utf8 LC_MONETARY=English_Canada.utf8 [4] LC_NUMERIC=C LC_TIME=English_Canada.utf8

attached base packages: [1] stats graphics grDevices utils datasets methods base

other attached packages: [1] huxtable_5.5.2

loaded via a namespace (and not attached): [1] Rcpp_1.0.9 rstudioapi_0.14 knitr_1.40 magrittr_2.0.3 tidyselect_1.2.0 R6_2.5.1 rlang_1.0.6
[8] fastmap_1.1.0 fansi_1.0.3 dplyr_1.0.10 tools_4.2.1 xfun_0.33 utf8_1.2.2 cli_3.4.1
[15] DBI_1.1.3 commonmark_1.8.1 assertthat_0.2.1 tibble_3.1.8 lifecycle_1.0.3 crayon_1.5.2 zip_2.2.1
[22] vctrs_0.5.0 cachem_1.0.6 memoise_2.0.1 glue_1.6.2 openxlsx_4.2.5 stringi_1.7.8 compiler_4.2.1
[29] pillar_1.8.1 generics_0.1.3 pkgconfig_2.0.3

hughjonesd commented 1 year ago

This happens because as_Workbook() checks if columns are numeric, and if so, calls as.numeric() on them. as.numeric("1.0") returns the double 1.

I don't think there's an easy fix here. We want to detect numbers, and we cannot automagically figure out what the correct format is; we have to supply numbers to openxlsx in the call to write_data. After that it's out of our hands.

The solution is to set number_format() to what you want. E.g. number_format(ht)[,4] <- 2 will give you 2 decimal places for column d.