Unexpected rounding/formatting result/error when printing a tibble object #673

Open JohnnyZoomis opened 4 days ago

JohnnyZoomis commented 4 days ago

I have encountered a tibble whose printed output appears to wrongly drop the last (non-zero) digit of one value.

Here is the actual example R code which produces the issue. [I am using RStudio.]

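library(tidyverse)      # provides as_tibble(), mutate(), filter()
library(compositions)   # this is on CRAN

data(ArcticLake)  # from compositions package; this is a matrix, so convert to tibble below

rounderr <- ArcticLake |>
  as_tibble() |>
  mutate(Total = sand + silt + clay) |>
  filter(abs(Total - 100.0) > 0.05)

print(rounderr)  # rows whose total deviates from 100 by more than 0.05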


The output is

ArcticLakeTotalPrinting> source("~/Documents/Statistics/Research/ArcticLakeTotalPrinting/R/ArcticLakeTotalBug.R")
── Attaching core tidyverse packages ────────────────────────────────────────────────────────────── tidyverse 2.0.0 ──
✔ dplyr     1.1.4     ✔ readr     2.1.5
✔ forcats   1.0.0     ✔ stringr   1.5.1
✔ ggplot2   3.5.1     ✔ tibble    3.2.1
✔ lubridate 1.9.3     ✔ tidyr     1.3.1
✔ purrr     1.0.2     
── Conflicts ──────────────────────────────────────────────────────────────────────────────── tidyverse_conflicts() ──
✖ dplyr::filter() masks stats::filter()
✖ dplyr::lag()    masks stats::lag()
ℹ Use the conflicted package to force all conflicts to become errors
Welcome to compositions, a package for compositional data analysis.
Find an intro with "? compositions"

Attaching package: ‘compositions’

The following objects are masked from ‘package:stats’:

    anova, cor, cov, dist, var

The following object is masked from ‘package:graphics’:


The following objects are masked from ‘package:base’:

    %*%, norm, scale, scale.default

# A tibble: 5 × 5
   sand  silt  clay depth Total
  <dbl> <dbl> <dbl> <dbl> <dbl>
1  52.2  40.9   6.6  13    99.7
2   4.8  54.7  41    49.5 100. 
3   7.4  51.6  40.9  73.6  99.9
4   6.7  47.3  45.9  87.7  99.9
5   7.4  45.6  46.9  88.1  99.9

The Total column in the second row should be 100.5, but the printed value has clearly lost its tenths digit.

Using the View() function, I could explore the structure of the tibble in question visually. Here is a screenshot to show that the relevant value is actually stored correctly.
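
For readers without the screenshot, the same check can be sketched in code. This is a minimal example, not the original script: the `totals` tibble is a hand-typed stand-in for the `Total` column of the filtered result, so the `compositions` package is not needed.

```r
library(tibble)

# Hypothetical stand-in for the filtered result: only the Total column,
# with the values typed in by hand rather than computed from ArcticLake
totals <- tibble(Total = c(99.7, 100.5, 99.9, 99.9, 99.9))

# Extracting the column gives a plain numeric vector, which base R prints
# with its own default of 7 significant digits instead of the tibble's
totals$Total[2]

# Compare against the expected value with a tolerance rather than by eye
isTRUE(all.equal(totals$Total[2], 100.5))
```

If the extracted value shows the tenths digit while the printed tibble does not, the difference lies in how the tibble is displayed, not in what is stored.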


I checked the help pages, expecting to see this was a consequence of a pure '.5' being rounded, which is somewhat more subtle than run-of-the-mill rounding. I could not find any such information -- obviously if I have missed this I would be very grateful to have it pointed out!

Here is sessionInfo():

ArcticLakeTotalPrinting> sessionInfo()
R version 4.4.1 (2024-06-14)
Platform: x86_64-apple-darwin20
Running under: macOS Sonoma 14.6.1

Matrix products: default
BLAS:   /System/Library/Frameworks/Accelerate.framework/Versions/A/Frameworks/vecLib.framework/Versions/A/libBLAS.dylib 
LAPACK: /Library/Frameworks/R.framework/Versions/4.4-x86_64/Resources/lib/libRlapack.dylib;  LAPACK version 3.12.0

[1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8

time zone: America/New_York
tzcode source: internal

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base     

other attached packages:
 [1] compositions_2.0-8 lubridate_1.9.3    forcats_1.0.0      stringr_1.5.1      dplyr_1.1.4        purrr_1.0.2       
 [7] readr_2.1.5        tidyr_1.3.1        tibble_3.2.1       ggplot2_3.5.1      tidyverse_2.0.0   

loaded via a namespace (and not attached):
 [1] gtable_0.3.5        compiler_4.4.1      Rcpp_1.0.13         tidyselect_1.2.1    tinytex_0.53       
 [6] bayesm_3.1-6        scales_1.3.0        R6_2.5.1            generics_0.1.3      robustbase_0.99-4-1
[11] MASS_7.3-61         munsell_0.5.1       pillar_1.9.0        tzdb_0.4.0          rlang_1.1.4        
[16] utf8_1.2.4          Rttf2pt1_1.3.12     stringi_1.8.4       xfun_0.44           timechange_0.3.0   
[21] cli_3.6.3           withr_3.0.1         magrittr_2.0.3      grid_4.4.1          rstudioapi_0.16.0  
[26] hms_1.1.3           lifecycle_1.0.4     DEoptimR_1.1-3      vctrs_0.6.5         extrafont_0.19     
[31] tensorA_0.36.2.1    glue_1.8.0          extrafontdb_1.0     fansi_1.0.6         colorspace_2.1-1   
[36] tools_4.4.1         pkgconfig_2.0.3   

Here is my RStudio info:

RStudio 2024.09.0+375 "Cranberry Hibiscus" Release (c8fc7aee6dc218d5687553f9041c6b1e5ea268ff, 2024-09-16) for macOS
Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/537.36 (KHTML, like Gecko) RStudio/2024.09.0+375 Chrome/124.0.6367.243 Electron/30.4.0 Safari/537.36, Quarto 1.5.57 (/Applications/quarto/bin/quarto)

krlmlr commented 2 days ago

Thanks. Is the following helpful: https://tibble.tidyverse.org/articles/numbers.html (linked from https://pillar.r-lib.org/articles/numbers.html) ? Is the behavior different from the documentation?


#> [1] 100

Created on 2024-10-17 with reprex v2.1.1

I see that the trailing dot is confusing. Perhaps we should always show at least one decimal digit if we show the dot.

JohnnyZoomis commented 2 days ago

Ahh. Three sig figs would explain it. Apologies for taking up the team's time.
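
For anyone landing here with the same question, below is a minimal sketch of two ways to get the tenths digit back, assuming a reasonably recent tibble and pillar. The `totals` data is a hand-typed stand-in for the `Total` column above. `pillar.sigfig` is the pillar option that sets the number of significant digits (3 by default), and `num()` is the tibble helper for per-column number formatting.

```r
library(tibble)

# Hypothetical stand-in for the Total column from the report
totals <- tibble(Total = c(99.7, 100.5, 99.9))

# Default: three significant digits, so 100.5 is displayed without its tenths digit
print(totals)

# Option A: raise the number of significant digits, then restore the old setting
old <- options(pillar.sigfig = 4)
print(totals)
options(old)

# Option B: request a fixed number of decimal places for one column with num()
totals_fixed <- tibble(Total = num(c(99.7, 100.5, 99.9), digits = 1))
print(totals_fixed)
```

Option A affects every tibble printed while the option is set. Option B travels with the column, which may be preferable when only one column needs the extra digit.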

Perhaps I still missed it in the package help pages proper. But if not, maybe the behavior could be mentioned there somewhere. I am agnostic about the raised trailing digit issue, but that's my N=1 opinion :)

Thanks again.