WillemSleegers / tidystats-v0.3

R package to produce a tidy output file of statistical models.

Test descriptives functions #42

Closed WillemSleegers closed 6 years ago

WillemSleegers commented 7 years ago

I have added several functions to deal with descriptives, rather than statistical models.

You can now use the function descriptives() for continuous data, and frequencies() for count data. You may turn the output of these functions into a tidy data frame using tidy_descriptives(), which is done for you when you add the output to a list using add_descriptives().

Note that I have switched the arguments in add_descriptives(): it now expects the output first, then the list to add it to.

It would be nice if some people could help me test these functions and let me know what they think.

UPDATE:

I changed a few things. First, descriptives() has been renamed to describe(). This masks the function of the same name in the psych package, but meh, describe is a better name.

Second, I got rid of frequencies() by merging it into describe(). describe() now checks what kind of variable is supplied and, depending on that, provides either statistics suitable for numeric variables (e.g., mean, SD) or frequency statistics (e.g., n, percentages).
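
To make the type check concrete, here is a minimal sketch of the idea in base R. It is not the tidystats implementation: describe_sketch() and its output columns are hypothetical names chosen for illustration, and it uses the built-in iris data rather than mpg so that it runs without extra packages.

```r
# Hypothetical helper: NOT the tidystats implementation, only an illustration
# of choosing statistics based on the type of the variable being described.
describe_sketch <- function(x) {
  if (is.numeric(x)) {
    # Numeric input: summary statistics
    n <- sum(!is.na(x))
    data.frame(
      missing = sum(is.na(x)),
      n       = n,
      M       = mean(x, na.rm = TRUE),
      SD      = sd(x, na.rm = TRUE),
      SE      = sd(x, na.rm = TRUE) / sqrt(n)
    )
  } else {
    # Non-numeric input: frequency statistics (missing values are not counted)
    counts <- table(x, useNA = "no")
    data.frame(
      group = names(counts),
      n     = as.vector(counts),
      pct   = as.vector(counts) / sum(counts) * 100,
      stringsAsFactors = FALSE
    )
  }
}

describe_sketch(iris$Sepal.Length)  # one row of numeric summaries
describe_sketch(iris$Species)       # one row per level, with n and pct
```

The sketch only covers a single vector; grouping is left out to keep the dispatch logic visible.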

Now, there are some issues though. I'll provide some examples below using the mpg data set.

1 numeric variable, no groups:

Code: describe(mpg, cty)

Output:

# A tibble: 1 x 11
    var missing     n     M    SD    SE   min   max range median  mode
  <chr>   <int> <int> <dbl> <dbl> <dbl> <int> <int> <int>  <dbl> <int>
1   cty       0   234    17   4.3  0.28     9    35    26     17    18

1 numeric variable, 1 group:

Code: describe(mpg, cty, year)

Output:

# A tibble: 2 x 13
    var    by group missing     n     M    SD    SE   min   max range median  mode
* <chr> <chr> <chr>   <int> <int> <dbl> <dbl> <dbl> <int> <int> <int>  <int> <int>
1   cty  year  1999       0   117    17   4.5  0.41    11    35    24     17    18
2   cty  year  2008       0   117    17   4.1  0.37     9    28    19     17    13

1 numeric variable, 2 groups:

Code: describe(mpg, cty, year, cyl)

Output:

# A tibble: 7 x 13
    var       by  group missing     n     M    SD    SE   min   max range median  mode
* <chr>    <chr>  <chr>   <int> <int> <dbl> <dbl> <dbl> <int> <int> <int>  <dbl> <int>
1   cty year-cyl 1999-4       0    45    21  4.24  0.63    15    35    20     19    21
2   cty year-cyl 1999-6       0    45    16  1.67  0.25    13    19     6     16    18
3   cty year-cyl 1999-8       0    27    12  1.65  0.32    11    16     5     11    11
4   cty year-cyl 2008-4       0    36    21  2.29  0.38    17    28    11     21    21
5   cty year-cyl 2008-5       0     4    20  0.58  0.29    20    21     1     20    21
6   cty year-cyl 2008-6       0    34    16  1.91  0.33    11    19     8     17    17
7   cty year-cyl 2008-8       0    43    13  1.88  0.29     9    16     7     13    13

Some questions I have:

# A tibble: 7 x 15
    var  by_1  by_2 group_1 group_2 missing     n     M    SD    SE   min   max range median  mode
* <chr> <chr> <chr>   <chr>   <chr>   <int> <int> <dbl> <dbl> <dbl> <int> <int> <int>  <dbl> <int>
1   cty  year   cyl    1999       4       0    45    21  4.24  0.63    15    35    20     19    21
2   cty  year   cyl    1999       6       0    45    16  1.67  0.25    13    19     6     16    18
3   cty  year   cyl    1999       8       0    27    12  1.65  0.32    11    16     5     11    11
4   cty  year   cyl    2008       4       0    36    21  2.29  0.38    17    28    11     21    21
5   cty  year   cyl    2008       5       0     4    20  0.58  0.29    20    21     1     20    21
6   cty  year   cyl    2008       6       0    34    16  1.91  0.33    11    19     8     17    17
7   cty  year   cyl    2008       8       0    43    13  1.88  0.29     9    16     7     13    13
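
For anyone who wants to compare the two layouts shown above (combined by/group columns versus one column per grouping variable), a combined table can be converted afterwards. The sketch below uses tidyr::separate() on a few toy rows copied from the output; it assumes the tidyr package is installed and that neither the variable names nor the group values contain a hyphen themselves.

```r
library(tidyr)

# Toy rows copied from the combined-column output above (only a few columns kept)
combined <- data.frame(
  var   = "cty",
  by    = "year-cyl",
  group = c("1999-4", "1999-6", "2008-5"),
  n     = c(45L, 45L, 4L),
  stringsAsFactors = FALSE
)

# Split each combined column at the hyphen into one column per grouping variable
split_cols <- separate(combined, by, into = c("by_1", "by_2"), sep = "-")
split_cols <- separate(split_cols, group, into = c("group_1", "group_2"), sep = "-")

split_cols
```

The split columns stay character vectors here, matching the <chr> types in the table above.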

ghost commented 7 years ago

It seems to gather descriptives one time too often?

library(devtools)
# install_github("WillemSleegers/tidystats")
library(tidystats)
library(tidyverse)

iris.descriptives <- list()

descriptives(iris, Sepal.Length, Species) -> 
  Species.descriptives

Species.descriptives %>% 
  tidy_descriptives() %>%
  add_descriptives(list = iris.descriptives,
                   identifier = "Species") -> 
  iris.descriptives

iris.descriptives %>%
  .$Species %>%
  filter(var == "Sepal.Length") %>%
  filter(group == "versicolor")

WillemSleegers commented 7 years ago

Yup, because you don't need to call tidy_descriptives() yourself. add_descriptives() already calls that function, so it's more of an 'under the hood' function.

WillemSleegers commented 6 years ago

Closing this issue because I have completely redesigned the way creating and adding descriptives works.

ghost commented 6 years ago

Nice work!

WillemSleegers commented 6 years ago

Check out the new README: https://github.com/WillemSleegers/tidystats
