easystats / datawizard

Magic potions to clean and transform your data 🧙
https://easystats.github.io/datawizard/

Request for function that tabulates factors #45

Closed jmgirard closed 2 years ago

jmgirard commented 2 years ago

I enjoy using describe_distribution() for numerical vectors but would like an alternative for tabulating factors.

set.seed(2021)
x <- factor(sample(LETTERS[1:3], size = 100, replace = TRUE))
datawizard::describe_distribution(x)
#> Mean | SD | Min | Max | Skewness | Kurtosis |   n | n_Missing
#> -------------------------------------------------------------
#> NA   | NA |   A |   C |    -0.14 |     -1.3 | 100 |         0

Created on 2021-12-13 by the reprex package (v2.0.1)

I would propose something like this:

describe_factor(x) # or tabulate_factor(x)
#> Variable | Level | Count | Proportion
#> -----------------------------------
#> x | A | 26 | 0.26
#> x | B | 40 | 0.40
#> x | C | 34 | 0.34

If you like the idea, I can make a pull request to start this off.
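For illustration, the proposed output could be produced with a few lines of base R. This is only a sketch: the function name tabulate_factor() and its name argument are hypothetical, taken from the proposal above, and are not part of datawizard.

```r
# Hypothetical helper, not part of datawizard: tabulate one factor
# into Variable / Level / Count / Proportion columns.
tabulate_factor <- function(x, name = deparse(substitute(x))) {
  force(name)                      # capture the variable name up front
  counts <- table(x, useNA = "no") # one count per factor level
  data.frame(
    Variable = name,
    Level = names(counts),
    Count = as.integer(counts),
    Proportion = as.numeric(counts) / sum(counts),
    stringsAsFactors = FALSE
  )
}

x <- factor(c("A", "B", "B", "C", "C", "C", "C", "B"))
tabulate_factor(x)
```

Returning a plain data frame keeps the result easy to filter or bind with tables for other variables.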

IndrajeetPatil commented 2 years ago

Maybe we can have this discussion here? https://github.com/easystats/datawizard/issues/46

jmgirard commented 2 years ago

If you think the solution will be to implement this via describe_distribution.factor(), then let's move it. But if you think the solution will be a separate function (as @strengejacke seemed to suggest in #46), it might be better as a separate issue.

strengejacke commented 2 years ago

what about #156?

jmgirard commented 2 years ago

> what about #156?

This is great, thanks! I like having both raw and valid percentages. Does a cumulative percentage make sense in the case of an unordered factor?

Down the road, it might also be nice to have an option to collapse the blocks into a "long" format where the factor is indicated as another column, as in my example above.

strengejacke commented 2 years ago

What do you mean by "collapse into long format"?

strengejacke commented 2 years ago

> Does a cumulative percentage make sense in the case of an unordered factor?

The function is not only for factors, and quite often, factor levels are still in a natural order even if not of type "ordinal".

jmgirard commented 2 years ago

> What do you mean by "collapse into long format"?

Instead of:

Variable X
1 0.1 0.1 0.1
2 0.9 0.9 1.0

Variable Y
1 0.4 0.4 0.4
2 0.3 0.3 0.7
3 0.3 0.3 1.0

I mean:

X 1 0.1 0.1 0.1
X 2 0.9 0.9 1.0
Y 1 0.4 0.4 0.4
Y 2 0.3 0.3 0.7
Y 3 0.3 0.3 1.0
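A rough base R sketch of the "long" layout described here: build one frequency block per variable, tag each block with the variable name, and stack them. The toy data frame dat and the column names Variable, Value and Prop are assumptions made for this example, not the package's output format.

```r
# Toy data: two categorical variables in one data frame.
dat <- data.frame(
  X = c(1, 2, 2, 2),
  Y = c(1, 1, 2, 3)
)

# One frequency block per variable, each tagged with the variable name.
blocks <- lapply(names(dat), function(v) {
  n <- table(dat[[v]])
  data.frame(
    Variable = v,
    Value = names(n),
    Prop = as.numeric(n) / sum(n),
    stringsAsFactors = FALSE
  )
})

# "Long" format: stack the blocks so Variable becomes a regular column.
long <- do.call(rbind, blocks)
long
```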

strengejacke commented 2 years ago

OK, this is going to be a long example post, but I wanted to show all the use cases that came to mind. Set collapse = TRUE to produce collapsed tables.

library(dplyr)
#> 
#> Attaching package: 'dplyr'
#> The following objects are masked from 'package:stats':
#> 
#>     filter, lag
#> The following objects are masked from 'package:base':
#> 
#>     intersect, setdiff, setequal, union
library(datawizard)

data(efc)

data_table(efc$e42dep)
#> elder's dependency (efc$e42dep) <categorical>
#> total N=100 valid N=97
#> 
#> Value |  N | Raw % | Valid % | Cumulative %
#> ------+----+-------+---------+-------------
#> 1     |  2 |  2.00 |    2.06 |         2.06
#> 2     |  4 |  4.00 |    4.12 |         6.19
#> 3     | 28 | 28.00 |   28.87 |        35.05
#> 4     | 63 | 63.00 |   64.95 |       100.00
#> <NA>  |  3 |  3.00 |    <NA> |         <NA>

data_table(efc, select = c("e16sex", "e42dep"))
#> elder's gender (e16sex) <numeric>
#> total N=100 valid N=100
#> 
#> Value |  N | Raw % | Valid % | Cumulative %
#> ------+----+-------+---------+-------------
#> 1     | 46 | 46.00 |   46.00 |        46.00
#> 2     | 54 | 54.00 |   54.00 |       100.00
#> <NA>  |  0 |  0.00 |    <NA> |         <NA>
#> 
#> elder's dependency (e42dep) <categorical>
#> total N=100 valid N=97
#> 
#> Value |  N | Raw % | Valid % | Cumulative %
#> ------+----+-------+---------+-------------
#> 1     |  2 |  2.00 |    2.06 |         2.06
#> 2     |  4 |  4.00 |    4.12 |         6.19
#> 3     | 28 | 28.00 |   28.87 |        35.05
#> 4     | 63 | 63.00 |   64.95 |       100.00
#> <NA>  |  3 |  3.00 |    <NA> |         <NA>

data_table(efc, select = c("e16sex", "e42dep"), collapse = TRUE)
#> # Frequency Table
#> 
#> Variable | Value |  N | Raw % | Valid % | Cumulative %
#> ---------+-------+----+-------+---------+-------------
#> e16sex   |     1 | 46 | 46.00 |   46.00 |        46.00
#>          |     2 | 54 | 54.00 |   54.00 |       100.00
#>          |  <NA> |  0 |  0.00 |    <NA> |         <NA>
#> ------------------------------------------------------
#> e42dep   |     1 |  2 |  2.00 |    2.06 |         2.06
#>          |     2 |  4 |  4.00 |    4.12 |         6.19
#>          |     3 | 28 | 28.00 |   28.87 |        35.05
#>          |     4 | 63 | 63.00 |   64.95 |       100.00
#>          |  <NA> |  3 |  3.00 |    <NA> |         <NA>
#> ------------------------------------------------------

efc |> 
  group_by(e16sex) |> 
  data_table("c172code")
#> carer's level of education (c172code) <numeric>
#> Grouped by e16sex (1)
#> total N=46 valid N=41
#> 
#> Value |  N | Raw % | Valid % | Cumulative %
#> ------+----+-------+---------+-------------
#> 1     |  5 | 10.87 |   12.20 |        12.20
#> 2     | 32 | 69.57 |   78.05 |        90.24
#> 3     |  4 |  8.70 |    9.76 |       100.00
#> <NA>  |  5 | 10.87 |    <NA> |         <NA>
#> 
#> carer's level of education (c172code) <numeric>
#> Grouped by e16sex (2)
#> total N=54 valid N=49
#> 
#> Value |  N | Raw % | Valid % | Cumulative %
#> ------+----+-------+---------+-------------
#> 1     |  3 |  5.56 |    6.12 |         6.12
#> 2     | 34 | 62.96 |   69.39 |        75.51
#> 3     | 12 | 22.22 |   24.49 |       100.00
#> <NA>  |  5 |  9.26 |    <NA> |         <NA>

efc |> 
  group_by(e16sex) |> 
  data_table("c172code", collapse = TRUE)
#> # Frequency Table
#> 
#> Variable |      Group | Value |  N | Raw % | Valid % | Cumulative %
#> ---------+------------+-------+----+-------+---------+-------------
#> c172code | e16sex (1) |     1 |  5 | 10.87 |   12.20 |        12.20
#>          |            |     2 | 32 | 69.57 |   78.05 |        90.24
#>          |            |     3 |  4 |  8.70 |    9.76 |       100.00
#>          |            |  <NA> |  5 | 10.87 |    <NA> |         <NA>
#> -------------------------------------------------------------------
#> c172code | e16sex (2) |     1 |  3 |  5.56 |    6.12 |         6.12
#>          |            |     2 | 34 | 62.96 |   69.39 |        75.51
#>          |            |     3 | 12 | 22.22 |   24.49 |       100.00
#>          |            |  <NA> |  5 |  9.26 |    <NA> |         <NA>
#> -------------------------------------------------------------------

efc |> 
  group_by(e16sex) |> 
  data_table(c("c172code", "e42dep"))
#> carer's level of education (c172code) <numeric>
#> Grouped by e16sex (1)
#> total N=46 valid N=41
#> 
#> Value |  N | Raw % | Valid % | Cumulative %
#> ------+----+-------+---------+-------------
#> 1     |  5 | 10.87 |   12.20 |        12.20
#> 2     | 32 | 69.57 |   78.05 |        90.24
#> 3     |  4 |  8.70 |    9.76 |       100.00
#> <NA>  |  5 | 10.87 |    <NA> |         <NA>
#> 
#> elder's dependency (e42dep) <categorical>
#> Grouped by e16sex (1)
#> total N=46 valid N=45
#> 
#> Value |  N | Raw % | Valid % | Cumulative %
#> ------+----+-------+---------+-------------
#> 1     |  2 |  4.35 |    4.44 |         4.44
#> 2     |  2 |  4.35 |    4.44 |         8.89
#> 3     |  8 | 17.39 |   17.78 |        26.67
#> 4     | 33 | 71.74 |   73.33 |       100.00
#> <NA>  |  1 |  2.17 |    <NA> |         <NA>
#> 
#> carer's level of education (c172code) <numeric>
#> Grouped by e16sex (2)
#> total N=54 valid N=49
#> 
#> Value |  N | Raw % | Valid % | Cumulative %
#> ------+----+-------+---------+-------------
#> 1     |  3 |  5.56 |    6.12 |         6.12
#> 2     | 34 | 62.96 |   69.39 |        75.51
#> 3     | 12 | 22.22 |   24.49 |       100.00
#> <NA>  |  5 |  9.26 |    <NA> |         <NA>
#> 
#> elder's dependency (e42dep) <categorical>
#> Grouped by e16sex (2)
#> total N=54 valid N=52
#> 
#> Value |  N | Raw % | Valid % | Cumulative %
#> ------+----+-------+---------+-------------
#> 1     |  0 |  0.00 |    0.00 |         0.00
#> 2     |  2 |  3.70 |    3.85 |         3.85
#> 3     | 20 | 37.04 |   38.46 |        42.31
#> 4     | 30 | 55.56 |   57.69 |       100.00
#> <NA>  |  2 |  3.70 |    <NA> |         <NA>

efc |> 
  group_by(e16sex) |> 
  data_table(c("c172code", "e42dep"), collapse = TRUE)
#> # Frequency Table
#> 
#> Variable |      Group | Value |  N | Raw % | Valid % | Cumulative %
#> ---------+------------+-------+----+-------+---------+-------------
#> c172code | e16sex (1) |     1 |  5 | 10.87 |   12.20 |        12.20
#>          |            |     2 | 32 | 69.57 |   78.05 |        90.24
#>          |            |     3 |  4 |  8.70 |    9.76 |       100.00
#>          |            |  <NA> |  5 | 10.87 |    <NA> |         <NA>
#> -------------------------------------------------------------------
#> e42dep   | e16sex (1) |     1 |  2 |  4.35 |    4.44 |         4.44
#>          |            |     2 |  2 |  4.35 |    4.44 |         8.89
#>          |            |     3 |  8 | 17.39 |   17.78 |        26.67
#>          |            |     4 | 33 | 71.74 |   73.33 |       100.00
#>          |            |  <NA> |  1 |  2.17 |    <NA> |         <NA>
#> -------------------------------------------------------------------
#> c172code | e16sex (2) |     1 |  3 |  5.56 |    6.12 |         6.12
#>          |            |     2 | 34 | 62.96 |   69.39 |        75.51
#>          |            |     3 | 12 | 22.22 |   24.49 |       100.00
#>          |            |  <NA> |  5 |  9.26 |    <NA> |         <NA>
#> -------------------------------------------------------------------
#> e42dep   | e16sex (2) |     1 |  0 |  0.00 |    0.00 |         0.00
#>          |            |     2 |  2 |  3.70 |    3.85 |         3.85
#>          |            |     3 | 20 | 37.04 |   38.46 |        42.31
#>          |            |     4 | 30 | 55.56 |   57.69 |       100.00
#>          |            |  <NA> |  2 |  3.70 |    <NA> |         <NA>
#> -------------------------------------------------------------------

efc |> 
  group_by(e16sex, c172code) |> 
  data_table("e42dep")
#> elder's dependency (e42dep) <categorical>
#> Grouped by e16sex (1), c172code (1)
#> total N=5 valid N=5
#> 
#> Value | N | Raw % | Valid % | Cumulative %
#> ------+---+-------+---------+-------------
#> 1     | 0 |  0.00 |    0.00 |         0.00
#> 2     | 0 |  0.00 |    0.00 |         0.00
#> 3     | 2 | 40.00 |   40.00 |        40.00
#> 4     | 3 | 60.00 |   60.00 |       100.00
#> <NA>  | 0 |  0.00 |    <NA> |         <NA>
#> 
#> elder's dependency (e42dep) <categorical>
#> Grouped by e16sex (1), c172code (2)
#> total N=32 valid N=32
#> 
#> Value |  N | Raw % | Valid % | Cumulative %
#> ------+----+-------+---------+-------------
#> 1     |  2 |  6.25 |    6.25 |         6.25
#> 2     |  2 |  6.25 |    6.25 |        12.50
#> 3     |  4 | 12.50 |   12.50 |        25.00
#> 4     | 24 | 75.00 |   75.00 |       100.00
#> <NA>  |  0 |  0.00 |    <NA> |         <NA>
#> 
#> elder's dependency (e42dep) <categorical>
#> Grouped by e16sex (1), c172code (3)
#> total N=4 valid N=4
#> 
#> Value | N | Raw % | Valid % | Cumulative %
#> ------+---+-------+---------+-------------
#> 1     | 0 |  0.00 |    0.00 |         0.00
#> 2     | 0 |  0.00 |    0.00 |         0.00
#> 3     | 1 | 25.00 |   25.00 |        25.00
#> 4     | 3 | 75.00 |   75.00 |       100.00
#> <NA>  | 0 |  0.00 |    <NA> |         <NA>
#> 
#> elder's dependency (e42dep) <categorical>
#> Grouped by e16sex (1), c172code (NA)
#> total N=5 valid N=4
#> 
#> Value | N | Raw % | Valid % | Cumulative %
#> ------+---+-------+---------+-------------
#> 1     | 0 |  0.00 |    0.00 |         0.00
#> 2     | 0 |  0.00 |    0.00 |         0.00
#> 3     | 1 | 20.00 |   25.00 |        25.00
#> 4     | 3 | 60.00 |   75.00 |       100.00
#> <NA>  | 1 | 20.00 |    <NA> |         <NA>
#> 
#> elder's dependency (e42dep) <categorical>
#> Grouped by e16sex (2), c172code (1)
#> total N=3 valid N=3
#> 
#> Value | N | Raw % | Valid % | Cumulative %
#> ------+---+-------+---------+-------------
#> 1     | 0 |  0.00 |    0.00 |         0.00
#> 2     | 0 |  0.00 |    0.00 |         0.00
#> 3     | 2 | 66.67 |   66.67 |        66.67
#> 4     | 1 | 33.33 |   33.33 |       100.00
#> <NA>  | 0 |  0.00 |    <NA> |         <NA>
#> 
#> elder's dependency (e42dep) <categorical>
#> Grouped by e16sex (2), c172code (2)
#> total N=34 valid N=32
#> 
#> Value |  N | Raw % | Valid % | Cumulative %
#> ------+----+-------+---------+-------------
#> 1     |  0 |  0.00 |    0.00 |         0.00
#> 2     |  2 |  5.88 |    6.25 |         6.25
#> 3     | 12 | 35.29 |   37.50 |        43.75
#> 4     | 18 | 52.94 |   56.25 |       100.00
#> <NA>  |  2 |  5.88 |    <NA> |         <NA>
#> 
#> elder's dependency (e42dep) <categorical>
#> Grouped by e16sex (2), c172code (3)
#> total N=12 valid N=12
#> 
#> Value | N | Raw % | Valid % | Cumulative %
#> ------+---+-------+---------+-------------
#> 1     | 0 |  0.00 |    0.00 |         0.00
#> 2     | 0 |  0.00 |    0.00 |         0.00
#> 3     | 5 | 41.67 |   41.67 |        41.67
#> 4     | 7 | 58.33 |   58.33 |       100.00
#> <NA>  | 0 |  0.00 |    <NA> |         <NA>
#> 
#> elder's dependency (e42dep) <categorical>
#> Grouped by e16sex (2), c172code (NA)
#> total N=5 valid N=5
#> 
#> Value | N | Raw % | Valid % | Cumulative %
#> ------+---+-------+---------+-------------
#> 1     | 0 |  0.00 |    0.00 |         0.00
#> 2     | 0 |  0.00 |    0.00 |         0.00
#> 3     | 1 | 20.00 |   20.00 |        20.00
#> 4     | 4 | 80.00 |   80.00 |       100.00
#> <NA>  | 0 |  0.00 |    <NA> |         <NA>

efc |> 
  group_by(e16sex, c172code) |> 
  data_table("e42dep", collapse = TRUE)
#> # Frequency Table
#> 
#> Variable |                     Group | Value |  N | Raw % | Valid % | Cumulative %
#> ---------+---------------------------+-------+----+-------+---------+-------------
#> e42dep   |  e16sex (1), c172code (1) |     1 |  0 |  0.00 |    0.00 |         0.00
#>          |                           |     2 |  0 |  0.00 |    0.00 |         0.00
#>          |                           |     3 |  2 | 40.00 |   40.00 |        40.00
#>          |                           |     4 |  3 | 60.00 |   60.00 |       100.00
#>          |                           |  <NA> |  0 |  0.00 |    <NA> |         <NA>
#> ----------------------------------------------------------------------------------
#> e42dep   |  e16sex (1), c172code (2) |     1 |  2 |  6.25 |    6.25 |         6.25
#>          |                           |     2 |  2 |  6.25 |    6.25 |        12.50
#>          |                           |     3 |  4 | 12.50 |   12.50 |        25.00
#>          |                           |     4 | 24 | 75.00 |   75.00 |       100.00
#>          |                           |  <NA> |  0 |  0.00 |    <NA> |         <NA>
#> ----------------------------------------------------------------------------------
#> e42dep   |  e16sex (1), c172code (3) |     1 |  0 |  0.00 |    0.00 |         0.00
#>          |                           |     2 |  0 |  0.00 |    0.00 |         0.00
#>          |                           |     3 |  1 | 25.00 |   25.00 |        25.00
#>          |                           |     4 |  3 | 75.00 |   75.00 |       100.00
#>          |                           |  <NA> |  0 |  0.00 |    <NA> |         <NA>
#> ----------------------------------------------------------------------------------
#> e42dep   | e16sex (1), c172code (NA) |     1 |  0 |  0.00 |    0.00 |         0.00
#>          |                           |     2 |  0 |  0.00 |    0.00 |         0.00
#>          |                           |     3 |  1 | 20.00 |   25.00 |        25.00
#>          |                           |     4 |  3 | 60.00 |   75.00 |       100.00
#>          |                           |  <NA> |  1 | 20.00 |    <NA> |         <NA>
#> ----------------------------------------------------------------------------------
#> e42dep   |  e16sex (2), c172code (1) |     1 |  0 |  0.00 |    0.00 |         0.00
#>          |                           |     2 |  0 |  0.00 |    0.00 |         0.00
#>          |                           |     3 |  2 | 66.67 |   66.67 |        66.67
#>          |                           |     4 |  1 | 33.33 |   33.33 |       100.00
#>          |                           |  <NA> |  0 |  0.00 |    <NA> |         <NA>
#> ----------------------------------------------------------------------------------
#> e42dep   |  e16sex (2), c172code (2) |     1 |  0 |  0.00 |    0.00 |         0.00
#>          |                           |     2 |  2 |  5.88 |    6.25 |         6.25
#>          |                           |     3 | 12 | 35.29 |   37.50 |        43.75
#>          |                           |     4 | 18 | 52.94 |   56.25 |       100.00
#>          |                           |  <NA> |  2 |  5.88 |    <NA> |         <NA>
#> ----------------------------------------------------------------------------------
#> e42dep   |  e16sex (2), c172code (3) |     1 |  0 |  0.00 |    0.00 |         0.00
#>          |                           |     2 |  0 |  0.00 |    0.00 |         0.00
#>          |                           |     3 |  5 | 41.67 |   41.67 |        41.67
#>          |                           |     4 |  7 | 58.33 |   58.33 |       100.00
#>          |                           |  <NA> |  0 |  0.00 |    <NA> |         <NA>
#> ----------------------------------------------------------------------------------
#> e42dep   | e16sex (2), c172code (NA) |     1 |  0 |  0.00 |    0.00 |         0.00
#>          |                           |     2 |  0 |  0.00 |    0.00 |         0.00
#>          |                           |     3 |  1 | 20.00 |   20.00 |        20.00
#>          |                           |     4 |  4 | 80.00 |   80.00 |       100.00
#>          |                           |  <NA> |  0 |  0.00 |    <NA> |         <NA>
#> ----------------------------------------------------------------------------------

Created on 2022-04-25 by the reprex package (v2.0.1)
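To make the difference between the three percentage columns concrete, here is a small base R sketch that recomputes them for the first table above (efc$e42dep). It does not use data_table(); the vector x is rebuilt by hand from the counts shown in that table, and the variable names are illustrative only. Raw % divides by all rows, Valid % divides by the non-missing rows, and Cumulative % is the running sum of Valid %.

```r
# Rebuild a vector with the same counts as efc$e42dep in the first table:
# 2, 4, 28 and 63 valid values plus 3 missing values.
x <- c(rep(1:4, times = c(2, 4, 28, 63)), rep(NA, 3))

n <- table(x, useNA = "always")             # counts, with NA as the last cell
raw_pct <- 100 * as.numeric(n) / length(x)  # denominator: all 100 rows
valid <- !is.na(names(n))                   # FALSE only for the NA cell
valid_pct <- ifelse(valid, 100 * as.numeric(n) / sum(n[valid]), NA)
cum_pct <- cumsum(valid_pct)                # NA propagates to the <NA> row

data.frame(
  Value = names(n),
  N = as.integer(n),
  Raw = round(raw_pct, 2),
  Valid = round(valid_pct, 2),
  Cumulative = round(cum_pct, 2)
)
```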

strengejacke commented 2 years ago

And regarding the latter case, we could also drop unused levels - what should be the default here?

library(dplyr)
#> 
#> Attaching package: 'dplyr'
#> The following objects are masked from 'package:stats':
#> 
#>     filter, lag
#> The following objects are masked from 'package:base':
#> 
#>     intersect, setdiff, setequal, union
library(datawizard)

data(efc)
efc |> 
  group_by(e16sex, c172code) |> 
  data_table("e42dep", collapse = TRUE)
#> # Frequency Table
#> 
#> Variable |                     Group | Value |  N | Raw % | Valid % | Cumulative %
#> ---------+---------------------------+-------+----+-------+---------+-------------
#> e42dep   |  e16sex (1), c172code (1) |     1 |  0 |  0.00 |    0.00 |         0.00
#>          |                           |     2 |  0 |  0.00 |    0.00 |         0.00
#>          |                           |     3 |  2 | 40.00 |   40.00 |        40.00
#>          |                           |     4 |  3 | 60.00 |   60.00 |       100.00
#>          |                           |  <NA> |  0 |  0.00 |    <NA> |         <NA>
#> ----------------------------------------------------------------------------------
#> e42dep   |  e16sex (1), c172code (2) |     1 |  2 |  6.25 |    6.25 |         6.25
#>          |                           |     2 |  2 |  6.25 |    6.25 |        12.50
#>          |                           |     3 |  4 | 12.50 |   12.50 |        25.00
#>          |                           |     4 | 24 | 75.00 |   75.00 |       100.00
#>          |                           |  <NA> |  0 |  0.00 |    <NA> |         <NA>
#> ----------------------------------------------------------------------------------
#> e42dep   |  e16sex (1), c172code (3) |     1 |  0 |  0.00 |    0.00 |         0.00
#>          |                           |     2 |  0 |  0.00 |    0.00 |         0.00
#>          |                           |     3 |  1 | 25.00 |   25.00 |        25.00
#>          |                           |     4 |  3 | 75.00 |   75.00 |       100.00
#>          |                           |  <NA> |  0 |  0.00 |    <NA> |         <NA>
#> ----------------------------------------------------------------------------------
#> e42dep   | e16sex (1), c172code (NA) |     1 |  0 |  0.00 |    0.00 |         0.00
#>          |                           |     2 |  0 |  0.00 |    0.00 |         0.00
#>          |                           |     3 |  1 | 20.00 |   25.00 |        25.00
#>          |                           |     4 |  3 | 60.00 |   75.00 |       100.00
#>          |                           |  <NA> |  1 | 20.00 |    <NA> |         <NA>
#> ----------------------------------------------------------------------------------
#> e42dep   |  e16sex (2), c172code (1) |     1 |  0 |  0.00 |    0.00 |         0.00
#>          |                           |     2 |  0 |  0.00 |    0.00 |         0.00
#>          |                           |     3 |  2 | 66.67 |   66.67 |        66.67
#>          |                           |     4 |  1 | 33.33 |   33.33 |       100.00
#>          |                           |  <NA> |  0 |  0.00 |    <NA> |         <NA>
#> ----------------------------------------------------------------------------------
#> e42dep   |  e16sex (2), c172code (2) |     1 |  0 |  0.00 |    0.00 |         0.00
#>          |                           |     2 |  2 |  5.88 |    6.25 |         6.25
#>          |                           |     3 | 12 | 35.29 |   37.50 |        43.75
#>          |                           |     4 | 18 | 52.94 |   56.25 |       100.00
#>          |                           |  <NA> |  2 |  5.88 |    <NA> |         <NA>
#> ----------------------------------------------------------------------------------
#> e42dep   |  e16sex (2), c172code (3) |     1 |  0 |  0.00 |    0.00 |         0.00
#>          |                           |     2 |  0 |  0.00 |    0.00 |         0.00
#>          |                           |     3 |  5 | 41.67 |   41.67 |        41.67
#>          |                           |     4 |  7 | 58.33 |   58.33 |       100.00
#>          |                           |  <NA> |  0 |  0.00 |    <NA> |         <NA>
#> ----------------------------------------------------------------------------------
#> e42dep   | e16sex (2), c172code (NA) |     1 |  0 |  0.00 |    0.00 |         0.00
#>          |                           |     2 |  0 |  0.00 |    0.00 |         0.00
#>          |                           |     3 |  1 | 20.00 |   20.00 |        20.00
#>          |                           |     4 |  4 | 80.00 |   80.00 |       100.00
#>          |                           |  <NA> |  0 |  0.00 |    <NA> |         <NA>
#> ----------------------------------------------------------------------------------

efc |> 
  group_by(e16sex, c172code) |> 
  data_table("e42dep", collapse = TRUE, drop_levels = TRUE)
#> # Frequency Table
#> 
#> Variable |                     Group | Value |  N | Raw % | Valid % | Cumulative %
#> ---------+---------------------------+-------+----+-------+---------+-------------
#> e42dep   |  e16sex (1), c172code (1) |     3 |  2 | 40.00 |   40.00 |        40.00
#>          |                           |     4 |  3 | 60.00 |   60.00 |       100.00
#>          |                           |  <NA> |  0 |  0.00 |    <NA> |         <NA>
#> ----------------------------------------------------------------------------------
#> e42dep   |  e16sex (1), c172code (2) |     1 |  2 |  6.25 |    6.25 |         6.25
#>          |                           |     2 |  2 |  6.25 |    6.25 |        12.50
#>          |                           |     3 |  4 | 12.50 |   12.50 |        25.00
#>          |                           |     4 | 24 | 75.00 |   75.00 |       100.00
#>          |                           |  <NA> |  0 |  0.00 |    <NA> |         <NA>
#> ----------------------------------------------------------------------------------
#> e42dep   |  e16sex (1), c172code (3) |     3 |  1 | 25.00 |   25.00 |        25.00
#>          |                           |     4 |  3 | 75.00 |   75.00 |       100.00
#>          |                           |  <NA> |  0 |  0.00 |    <NA> |         <NA>
#> ----------------------------------------------------------------------------------
#> e42dep   | e16sex (1), c172code (NA) |     3 |  1 | 20.00 |   25.00 |        25.00
#>          |                           |     4 |  3 | 60.00 |   75.00 |       100.00
#>          |                           |  <NA> |  1 | 20.00 |    <NA> |         <NA>
#> ----------------------------------------------------------------------------------
#> e42dep   |  e16sex (2), c172code (1) |     3 |  2 | 66.67 |   66.67 |        66.67
#>          |                           |     4 |  1 | 33.33 |   33.33 |       100.00
#>          |                           |  <NA> |  0 |  0.00 |    <NA> |         <NA>
#> ----------------------------------------------------------------------------------
#> e42dep   |  e16sex (2), c172code (2) |     2 |  2 |  5.88 |    6.25 |         6.25
#>          |                           |     3 | 12 | 35.29 |   37.50 |        43.75
#>          |                           |     4 | 18 | 52.94 |   56.25 |       100.00
#>          |                           |  <NA> |  2 |  5.88 |    <NA> |         <NA>
#> ----------------------------------------------------------------------------------
#> e42dep   |  e16sex (2), c172code (3) |     3 |  5 | 41.67 |   41.67 |        41.67
#>          |                           |     4 |  7 | 58.33 |   58.33 |       100.00
#>          |                           |  <NA> |  0 |  0.00 |    <NA> |         <NA>
#> ----------------------------------------------------------------------------------
#> e42dep   | e16sex (2), c172code (NA) |     3 |  1 | 20.00 |   20.00 |        20.00
#>          |                           |     4 |  4 | 80.00 |   80.00 |       100.00
#>          |                           |  <NA> |  0 |  0.00 |    <NA> |         <NA>
#> ----------------------------------------------------------------------------------

Created on 2022-04-25 by the reprex package (v2.0.1)
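The effect of dropping unused levels can be seen with base R alone. This sketch uses table() and droplevels() on a made-up factor; it illustrates the idea behind drop_levels, not how the argument is implemented.

```r
# A factor with four declared levels, of which only "3" and "4" occur,
# similar to e42dep within a small group.
x <- factor(c(3, 3, 4, 4, 4), levels = 1:4)

with_unused <- table(x)                # keeps levels 1 and 2 with a count of 0
without_unused <- table(droplevels(x)) # only levels that actually occur

with_unused
without_unused
```

Keeping the zero-count rows makes groups directly comparable row by row, while dropping them gives shorter tables when many groups are sparse.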

strengejacke commented 2 years ago

This will be closed by #156. I'm still open to suggestions regarding the defaults of the collapse and drop_levels arguments.