Open aghaynes opened 3 years ago
Hi Alan,
Yes, I have considered such compact tables, with just one line per variable. I did not implement them because having more than one summary statistic per variable was an important goal of atable. So having more than one line per variable is currently hardcoded in atable, and all helper functions deal with this. Adding new statistics and modifying atable's statistics would also get trickier with a one-line format.
I have already made some attempts to put all summary statistics in a single line and remove the indentation after calling atable, but was not satisfied with the results.
In atable, factors with only two levels do not exist, because missing values are always counted as a third level. So if the counts of the missing values are ignored, this should be possible. Missing values are common in clinical data, but some people are surprised to see them in reports.
Perhaps a slim version of atable would be possible: without blocking and splitting, ignoring all factor levels after the first, and with the summary of numerics formatted in one line. Other summary functions would also have to be formatted for one line only.
But for this case, an lapply of summary functions over the data.frame should also do the job, no atable needed...
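A minimal base R sketch of that lapply idea, for illustration only. The helper summarise_one and the choice of mtcars columns are assumptions made for this example and are not part of atable:

```r
# Hypothetical helper (not part of atable): one summary string per variable.
# Numerics give "mean (sd)"; factors give "N (%)" of the first level only.
summarise_one <- function(x) {
  if (is.numeric(x)) {
    sprintf("%.1f (%.1f)", mean(x, na.rm = TRUE), sd(x, na.rm = TRUE))
  } else {
    x <- as.factor(x)
    n <- sum(x == levels(x)[1], na.rm = TRUE)
    # The denominator includes missing values, so they count towards the percent
    sprintf("%d (%.0f%%)", n, 100 * n / length(x))
  }
}

dd <- mtcars[, c("hp", "wt", "am")]
# In mtcars, am == 1 is manual and am == 0 is automatic
dd$am <- factor(dd$am, levels = c(1, 0), labels = c("manual", "automatic"))

# One row per variable, no grouping, no tests
compact <- data.frame(
  variable = names(dd),
  summary = unlist(lapply(dd, summarise_one)),
  row.names = NULL,
  stringsAsFactors = FALSE
)
print(compact)
```

This gives one row per variable but none of atable's grouping, splitting or tests, which is the trade-off described above.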
When I find some time I will look at the code.
Regards Armin
Sent: Friday, 06 November 2020 at 16:23 From: "Alan Haynes" notifications@github.com To: "arminstroebel/atable" atable@noreply.github.com Cc: "Subscribed" subscribed@noreply.github.com Subject: [arminstroebel/atable] "lighter weight" tables (#10)
Hi!
To the best of my knowledge, it's not possible at the moment, but is it possible to have the summary statistic on the same row as the variable label? It would make for more compact tables...
e.g.
hp         mean (sd)
wght       mean (sd)
automatic  N (%)
It's a bit fiddly for factors of course (binaries you could just keep the highest level (as.numeric) or TRUE for logicals)
Thanks!
I uploaded version 0.1.10 of atable. It contains the function atable_compact to address this issue.
Example:
DD = atable::test_data
target_cols = c("Numeric", "Factor", "Split2") # Split2 has two levels plus missing values, to show the new format
a = atable_compact(DD,
                   target_cols = target_cols,
                   group_col = "Group",
                   split_cols = NULL,
                   format_to = "Console")
See also file test_atable_compact for more examples.
It's not finished yet, as I am not sure how to name the columns of the compact table; currently they are just 'variable___' and 'tag'.
Also, the formatting function in atable_options("format_statistics_compact.statistics_factor") should be changed for factors with a lot of levels. Currently it pastes all percentages together in one line.
That's super Armin, thanks!
Regarding the NAs, maybe one solution is to allow the useNA option to be set in the statistics.factor function (by all means set the default to "always" as you currently have it). As you have ... everywhere, that shouldn't cause any problems I don't think.
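A rough sketch of what such a useNA argument could look like, assuming the statistics function wraps base R's table(). The name stats_factor_na and its signature are made up for this example; they are not atable's actual API:

```r
# Hypothetical variant of a factor statistics function with a useNA argument.
# Defaults to "always" (first choice), matching the current behaviour described above.
stats_factor_na <- function(x, useNA = c("always", "no", "ifany"), ...) {
  useNA <- match.arg(useNA)
  statistics_out <- as.list(table(x, useNA = useNA))
  class(statistics_out) <- c("statistics_factor", "list")
  statistics_out
}

cyl <- factor(mtcars$cyl)
names(stats_factor_na(cyl))                # levels plus an NA entry
names(stats_factor_na(cyl, useNA = "no"))  # levels only
```

Whether atable passes ... through far enough for a caller to set useNA directly is an assumption I have not checked; wrapping the function before handing it to atable_options(statistics.factor = ...) would sidestep that.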
atable_compact isn't exported at the moment (maybe that's intentional though?)
Something odd is happening with the formatting... I redefined statistics.factor to ignore NAs but then the formatting doesn't seem to work anymore... it only retains the first level of the factor...
data(mtcars)
library(atable)
atable_options(format_to = "console")
mtcars$cyl <- as.factor(mtcars$cyl)
atable:::atable_compact(disp + cyl~ am, data = mtcars)
variable___ tag 0 1 p
1 Observations 19 13 <NA>
2 disp Mean_SD 290 (110) 144 (87) 0.0013
3 cyl 4, 6, 8, NA 16% (3), 21% (4), 63% (12), 0% (0) 62% (8), 23% (3), 15% (2), 0% (0) 0.013
stat Effect Size (CI)
1 <NA> <NA>
2 0.69 1.4 (0.62; 2.3)
3 8.7 0.52 (0.22; 0.74)
Warning messages:
1: In stats::ks.test(x, y, alternative = c("two.sided"), ...) :
cannot compute exact p-value with ties
2: In stats::chisq.test(group, value) :
Chi-squared approximation may be incorrect
stats.factor <- function (x, ...){
statistics_out <- table(x, useNA = "no")
statistics_out <- as.list(statistics_out)
class(statistics_out) <- c("statistics_factor", "list")
return(statistics_out)
}
atable_options(statistics.factor = stats.factor)
atable:::atable_compact(disp + cyl ~ am, data = mtcars)
variable___ tag 0 1 p stat Effect Size (CI)
1 Observations 19 13 <NA> <NA> <NA>
2 disp Mean_SD 290 (110) 144 (87) 0.0013 0.69 1.4 (0.62; 2.3)
3 cyl 4 16% (3) 62% (8) 0.013 8.7 0.52 (0.22; 0.74)
Warning messages:
1: In stats::ks.test(x, y, alternative = c("two.sided"), ...) :
cannot compute exact p-value with ties
2: In stats::chisq.test(group, value) :
Chi-squared approximation may be incorrect
In general, I would say that the factor levels should still be on their own row (where there are more than 2 levels). That would make the table clearer...
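To make the suggestion concrete, here is a sketch of a formatter that returns one row per level instead of pasting everything into one line. The function name format_levels_rows is hypothetical, it uses plain sprintf rather than atable's format options, and whether atable_compact would accept a multi-row result from its compact formatter is an assumption I have not tested:

```r
# Hypothetical formatter: one row per factor level.
# Input: a named list of counts, e.g. as.list(table(x)).
format_levels_rows <- function(x, ...) {
  nn <- names(x)
  nn[is.na(nn)] <- "missing"  # label the NA entry, if there is one
  value <- unlist(x, use.names = FALSE)
  percent <- 100 * value / sum(value)
  data.frame(
    tag = factor(nn, levels = nn),
    value = sprintf("%.0f%% (%d)", percent, value),
    row.names = NULL,
    stringsAsFactors = FALSE
  )
}

# Counts of cyl in the am == 0 group, ignoring NAs
counts <- as.list(table(factor(mtcars$cyl[mtcars$am == 0]), useNA = "no"))
format_levels_rows(counts)
```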
Hi Alan,
Calling atable_compact means that different formatting functions are called as well, in your case:
atable_options("format_statistics_compact.statistics_factor")
This formatting function is built to work together with table(..., useNA = "always"). It assumes that one level is NA and thus shows only the first level when a factor has three or fewer levels.
So currently in your example you must additionally adapt the formatting-function like this:
format_new = function(x, ...)
{
  nn <- names(x)
  value <- unlist(x)
  total <- sum(value)
  percent <- 100 * value/total
  if(length(nn)<=2){
    # return only first level, ignore the others
    # As atable::statistics.factor calls table(..., useNA='always'), there is always NA in nn and thus three
    # levels are the minimum, not two levels
    # The counts of missing values will not be displayed, but are included in the percent
    value <- paste0(atable_options("format_percent")(percent[1]), "% (", atable_options("format_numbers")(value[1]), ")")
    format_statistics_out <- data.frame(tag = factor(nn[1], levels = nn[1]), value = value[1],
                                        row.names = NULL, stringsAsFactors = FALSE, check.names = FALSE, fix.empty.names = FALSE)
    return(format_statistics_out)
  }
  else{
    # paste everything in one line
    value <- paste0(atable_options("format_percent")(percent), "% (", atable_options("format_numbers")(value), ")")
    value = paste(value, collapse = ", ")
    nn = paste(nn, collapse = ", ")
    format_statistics_out <- data.frame(tag = factor(nn, levels = nn), value = value,
                                        row.names = NULL, stringsAsFactors = FALSE, check.names = FALSE, fix.empty.names = FALSE)
    return(format_statistics_out)
  }
}
atable_options("format_statistics_compact.statistics_factor" = format_new)
The only change compared to atable_options("format_statistics_compact.statistics_factor") is length(nn)<=2 instead of length(nn)<=3.
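Why the threshold matters can be checked with base R alone. This small sketch only counts the entries that table() returns for mtcars$cyl under the two useNA settings and does not call atable:

```r
cyl <- factor(mtcars$cyl)
n_always <- length(table(cyl, useNA = "always"))  # entries: 4, 6, 8 and NA
n_no     <- length(table(cyl, useNA = "no"))      # entries: 4, 6, 8 only

# Default compact formatter: keep only the first level when there are <= 3 entries
n_always <= 3  # FALSE: all levels are pasted into one line
n_no <= 3      # TRUE: only the first level is kept, as in the output above
# Adapted formatter: threshold lowered to 2
n_no <= 2      # FALSE: all three levels are shown again
```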
The normal version of atable works as expected, shows levels 4,6,8
atable:::atable(disp + cyl ~ am, data = mtcars)
I did not upload to CRAN because I expect some tweaks are still to come, and I also have not run devtools::test() yet.
I forgot one line in test_atable_compact, new upload on Github; devtools::test() should run quietly with it.
I still need a name for the column... Currently it is 'variable___'
I uploaded atable 0.1.10 to CRAN.
New functions are atable_compact and atable_longitudinal.
The vignette atable_usage and the help of these functions show how to use them.