test_df2 <- data.frame(
  gender = factor(c("male", "female")),            # recycled to 10 rows
  smoke = factor(c(rep("yes", 5), rep("no", 5))),
  age = factor(c("young", "old"))                  # recycled to 10 rows
)
cross_table(test_df2, smoke ~ gender)
# now we recode a level to be missing
test_df2$gender[test_df2$gender == "female"] <- NA
# "female" still shows up, because it is still a level of the factor
cross_table(test_df2, smoke ~ gender)
# the level itself has to be removed too
test_df2$gender <- factor(test_df2$gender, levels = "male")
cross_table(test_df2, smoke ~ gender)
Should we simply do layout_column(drop = TRUE) to drop all unobserved factor levels? Or should we let the user specify which levels to remove?
For the first case, the computation could make use of tidyr::complete(model_frame).
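For reference, a minimal sketch of what tidyr::complete() does with factor columns: it expands a data frame to every combination of the factor levels, including combinations that never occur, so it is the building block for showing empty cells. The counts data frame below is made up for illustration, and how this would be wired into layout_column() is left open.

```r
library(tidyr)

# Made-up counts: only "male" rows were observed, but "female" is still a level.
counts <- data.frame(
  smoke  = factor(c("yes", "no"), levels = c("yes", "no")),
  gender = factor(c("male", "male"), levels = c("male", "female")),
  n      = c(3L, 2L)
)

# complete() uses the full set of factor levels, not just the observed values,
# and fills the count of the added combinations with 0.
full <- complete(counts, smoke, gender, fill = list(n = 0L))
print(full)
```

Dropping unobserved levels would then amount to not calling complete(), or to calling droplevels() on the factors first.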
gmodels::CrossTable seems to drop unobserved factor levels. For exploratory analysis this is not optimal: you should be able to notice if some combinations were not observed in the data.
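The difference between the two behaviours can be shown with base R alone. This is only an illustration with table() and droplevels(), not a statement about how cross_table() or gmodels::CrossTable() work internally.

```r
smoke  <- factor(c("yes", "yes", "no", "no", "no"))
# "female" is a declared level but never observed
gender <- factor(rep("male", 5), levels = c("male", "female"))

# table() keeps unobserved levels: the "female" column is all zeros
tab_keep <- table(smoke, gender)

# after droplevels() the empty column disappears
gender_dropped <- droplevels(gender)
tab_drop <- table(smoke, gender = gender_dropped)

print(tab_keep)
print(tab_drop)
```

With tab_keep the reader sees that no females were observed; with tab_drop that information is gone.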
Maybe layout_column() could gain an argument drop:
drop can be TRUE or FALSE, with default TRUE.
Alternatively, you can provide a character vector specifying which levels to drop (if not all unobserved levels should be dropped).
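A rough sketch of how these semantics could look for a single factor. The helper name drop_levels() is hypothetical and not part of any package. As an assumption, a character vector only removes levels that are actually unobserved, so no data is silently turned into NA.

```r
# Hypothetical helper: applies the proposed `drop` semantics to one factor.
drop_levels <- function(x, drop = TRUE) {
  stopifnot(is.factor(x))
  if (isTRUE(drop)) {
    # drop every unobserved level
    return(droplevels(x))
  }
  if (isFALSE(drop)) {
    # keep all levels, observed or not
    return(x)
  }
  if (is.character(drop)) {
    # drop only the requested levels, and only if they are unobserved
    observed   <- unique(as.character(x[!is.na(x)]))
    unobserved <- setdiff(levels(x), observed)
    to_remove  <- intersect(drop, unobserved)
    return(factor(x, levels = setdiff(levels(x), to_remove)))
  }
  stop("`drop` must be TRUE, FALSE, or a character vector of levels")
}

x <- factor(c("a", "a", NA), levels = c("a", "b", "c"))
levels(drop_levels(x))         # all unobserved levels removed
levels(drop_levels(x, FALSE))  # nothing removed
levels(drop_levels(x, "b"))    # only "b" removed
```

Whether a character vector should also be allowed to remove observed levels (turning those values into NA) is a separate design question.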