Open DarwinAwardWinner opened 1 year ago
I had a look through the code to see if I could implement this myself, but there were a few too many layers of indirection for me to follow. If you can point me to the appropriate place in the code, I can try implementing this when I have time.
Thanks. The prt package implements output in this way, see, e.g., https://github.com/nbenn/prt/blob/main/tests/testthat/_snaps/format.md .
CC @nbenn.
Interesting. So it looks like I could potentially define my own print method for data frames and/or tibbles that calls prt::format_dt. Is there an easy way to determine if a given tibble's backend supports efficient random access so that I can avoid trying to e.g. get the tail of a database query result?
None that I'm aware of. Perhaps you could implement some heuristics? Happy to review if you'd be willing to share an implementation.
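One possible heuristic, sketched here under assumptions rather than taken from pillar or prt: treat anything that is an in-memory data frame as having a cheap tail, and explicitly exclude objects carrying the lazy-table classes that dbplyr uses (`tbl_lazy`, `tbl_sql`). The function name `supports_cheap_tail` is hypothetical, and the class list is only a starting point that other backends may need to extend.

```r
# Hypothetical helper, not part of pillar or prt.
# Returns TRUE when taking the tail of `x` is assumed to be cheap,
# i.e. `x` is an in-memory data frame and not a known lazy table.
supports_cheap_tail <- function(x) {
  # Classes assumed to mark lazily evaluated tables (e.g. database queries).
  lazy_classes <- c("tbl_lazy", "tbl_sql")
  is.data.frame(x) && !inherits(x, lazy_classes)
}

# Usage: an ordinary data frame qualifies, a mock lazy table does not.
supports_cheap_tail(mtcars)
supports_cheap_tail(structure(list(), class = c("tbl_lazy", "tbl")))
```

A print method could call this first and fall back to head-only printing whenever it returns FALSE.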
I will definitely share if I figure it out. Do you have any opinions on how the options should be set up?
A minimal implementation for tibbles, meant to be put in ~/.Rprofile:
print.tbl <- function(x, width = NULL, ..., n = NULL, max_extra_cols = NULL, max_footer_lines = NULL) {
  tryCatch({
    # Pass half of n (rounded up) to format_dt; a NULL n leaves its default.
    n_half <- if (!is.null(n)) ceiling(n / 2)
    prt:::cat_line(prt:::format_dt(x = x, ..., n = n_half, width = width, max_extra_cols = max_extra_cols, max_footer_lines = max_footer_lines))
  }, error = function(e) {
    # Fall back to pillar's printing. The handler takes `e` so that `...`
    # below is print.tbl's own dots rather than the error condition.
    pillar:::print.tbl(x = x, width = width, ..., n = n, max_extra_cols = max_extra_cols, max_footer_lines = max_footer_lines)
  })
}
I also came up with something for base data frames, but I print them using the aforementioned S4Vectors code, since the dplyr/pillar stuff doesn't print row names, which can't be ignored for base data frames.
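print.data.frame <- function(x, ...) {
  tryCatch({
    withr::with_options(
      list(max.print = ncol(x) * 15),
      S4Vectors:::.show_DataFrame(x)
    )
  # The handler takes `e` so that `...` is print.data.frame's own dots,
  # not the error condition.
  }, error = function(e) base::print.data.frame(x = x, ...))
}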
The S4Vectors package from Bioconductor implements an S4 class called DataFrame (which exists to allow S4 vectors as data frame columns, I believe). One of the nice features of this class is that when printing, it shows both the first and last few rows of the data frame, e.g.:
Created on 2023-09-28 with reprex v2.0.2
Would it be possible to implement this as an option in pillar, at least for tables whose tail is easily accessible (i.e. probably not tables representing database queries)? Overall I prefer the formatting of pillar, but often seeing both the head and tail of a table is useful, because if the table is sorted by a particular column, it may not be clear from just the head that this column varies, e.g.:
Created on 2023-09-28 with reprex v2.0.2
As for implementation, I imagine either a logical option to include the tail, in which case the number of rows to be printed would be split equally; or else a fraction between 0 and 1 indicating the desired split of rows between head and tail. But maybe you have better ideas.
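To make the two options concrete, here is a rough base R sketch of the row split. The names `head_tail_rows` and `tail_frac` are hypothetical and not pillar options. The logical variant corresponds to `tail_frac = 0.5`, and `tail_frac = 0` reproduces head-only printing.

```r
# Hypothetical helper, not part of pillar: split a budget of `n` rows
# between the head and the tail of a data frame.
head_tail_rows <- function(x, n = 10, tail_frac = 0.5) {
  stopifnot(is.data.frame(x), n >= 0, tail_frac >= 0, tail_frac <= 1)
  total <- nrow(x)
  if (total <= n) {
    # Everything fits, so nothing needs to be split or omitted.
    return(list(head = x, tail = x[0, , drop = FALSE], omitted = 0L))
  }
  n_tail <- as.integer(round(n * tail_frac))
  n_head <- as.integer(n) - n_tail
  list(
    head = utils::head(x, n_head),
    tail = utils::tail(x, n_tail),
    omitted = total - n_head - n_tail  # rows hidden between head and tail
  )
}

# Usage: equal split of 10 rows over the 32-row mtcars data set.
parts <- head_tail_rows(mtcars, n = 10, tail_frac = 0.5)
```

A printer could then format `parts$head`, a marker line reporting `parts$omitted` hidden rows, and `parts$tail`.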