ProjectMOSAIC / mosaic

Project MOSAIC R package
http://mosaic-web.org/

Naming things when using prop() et al #692

Closed rpruim closed 5 years ago

rpruim commented 6 years ago

Dear Abby,

I need to teach the bootstrap for a proportion (among other things) this week. So I've been looking at my options for how to calculate sample proportions.

Depending on how you use prop(), df_stats(), props(), etc. we get a variety of different names for the results.

Perhaps there is nothing to be done at this point. I'm mostly happy with the names that df_stats() produces -- at least if you use props and not prop -- and renaming is easy in df_stats() as well. The one downside is that things are wrapped up into a data frame instead of a vector, so doing arithmetic with the computed values isn't quite as simple.

require(mosaic)
prop(~sex, data = HELPrct)
#>    female 
#> 0.2362031
prop(~sex, data = HELPrct, success = "male")
#>      male 
#> 0.7637969
prop(~sex | substance, data = HELPrct)
#> female.alcohol female.cocaine  female.heroin 
#>      0.2033898      0.2697368      0.2419355
df_stats(~sex, data = HELPrct, prop)
#>    prop_sex
#> 1 0.2362031
df_stats(~sex, data = HELPrct, prop, fargs = list(success = "male"))
#>    prop_sex
#> 1 0.7637969
df_stats(~sex | substance, data = HELPrct, prop)
#>   substance  prop_sex
#> 1   alcohol 0.2033898
#> 2   cocaine 0.2697368
#> 3    heroin 0.2419355
df_stats(~sex, data = HELPrct, props)
#>   prop_female prop_male
#> 1   0.2362031 0.7637969
df_stats(~sex, data = HELPrct, props, fargs = list(success = "male"))
#>   prop_female prop_male
#> 1   0.2362031 0.7637969
df_stats(~sex | substance, data = HELPrct, props)
#>   substance prop_female prop_male
#> 1   alcohol   0.2033898 0.7966102
#> 2   cocaine   0.2697368 0.7302632
#> 3    heroin   0.2419355 0.7580645
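
On the data-frame-versus-vector downside mentioned above, here is a minimal base R sketch of the extra step involved. The values are copied by hand from the output above so the snippet runs without mosaic; the object names (`p_vec`, `p_df`, `p_col`) are made up for illustration.

```r
# Values copied by hand from the output above, so this runs without mosaic.
# Shape returned by prop(~sex | substance, ...): a named numeric vector.
p_vec <- c(female.alcohol = 0.2033898,
           female.cocaine = 0.2697368,
           female.heroin  = 0.2419355)

# Shape returned by df_stats(~sex | substance, ..., prop): a data frame.
p_df <- data.frame(substance = c("alcohol", "cocaine", "heroin"),
                   prop_sex  = c(0.2033898, 0.2697368, 0.2419355),
                   stringsAsFactors = FALSE)

# With a named vector, arithmetic is direct.
diff_vec <- p_vec["female.cocaine"] - p_vec["female.alcohol"]

# With a data frame, pull the column out first, then index by substance.
p_col   <- setNames(p_df$prop_sex, p_df$substance)
diff_df <- p_col["cocaine"] - p_col["alcohol"]

# Both routes give the same difference in proportions.
unname(diff_vec)
unname(diff_df)
```

The data frame route is only one line longer, but it is one more thing to explain to students.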

But if we wanted to risk some backward compatibility, we could modify the names produced by prop() to prepend prop_ and perhaps to turn the . into _among_. So we would get something like this:

prop(~sex | substance, data = HELPrct)
#> prop_female_among_alcohol prop_female_among_cocaine  prop_female_among_heroin 
#>                 0.2033898                 0.2697368                 0.2419355
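
A rough sketch of how the renaming could be done after the fact, without touching prop() itself. `rename_props()` is a hypothetical helper, not something in mosaic, and the input vector is copied by hand from the current output.

```r
# Hypothetical helper, not part of mosaic: rewrite "level.group" names
# into the proposed "prop_level_among_group" form.
rename_props <- function(x, prefix = "prop_", sep = "_among_") {
  # Replace only the first literal "." in each name.
  names(x) <- paste0(prefix, sub(".", sep, names(x), fixed = TRUE))
  x
}

# Named vector in the shape prop(~sex | substance, ...) currently returns.
old <- c(female.alcohol = 0.2033898,
         female.cocaine = 0.2697368,
         female.heroin  = 0.2419355)

new <- rename_props(old)
names(new)
```

Splitting on the first "." is an assumption that breaks down if a level of the response itself contains a dot, which is one argument for building the names inside prop() rather than patching them afterward.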

Those are long names, but they are clear. Is it worth breaking old code to do this? (Old SBI code that relies on the current naming scheme will break.) Do we add a global option to control behavior so old code doesn't break but we don't have to request new-style naming each time we use prop()? Should we just be moving toward using df_stats() for everything?

At the very least, we might consider adding some arguments to prop() that allow us to control what sorts of names are produced.
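
To make the argument-plus-global-option idea concrete, here is a minimal base R sketch. Everything in it is hypothetical: `prop_named()`, the `name_style` argument, and the option name `mosaic.prop_names` are invented for illustration and do not exist in mosaic. It also assumes the first level is the default success, as the output above suggests.

```r
# Hypothetical wrapper; the function, argument, and option names are
# invented for illustration and are not part of mosaic.
prop_named <- function(x, success = NULL,
                       name_style = getOption("mosaic.prop_names", "old")) {
  name_style <- match.arg(name_style, c("old", "new"))
  x <- as.factor(x)
  # Assumption: default success is the first level.
  if (is.null(success)) success <- levels(x)[1]
  p  <- mean(x == success)
  nm <- if (name_style == "new") paste0("prop_", success) else success
  setNames(p, nm)
}

sex <- c("female", "male", "male", "male")      # toy data

prop_named(sex)                                 # old-style name by default
prop_named(sex, name_style = "new")             # per-call override

old_opts <- options(mosaic.prop_names = "new")  # session-wide default
prop_named(sex)                                 # now new-style without asking
options(old_opts)                               # restore the previous setting
```

With this design, old code keeps working because the default stays "old", while a course that wants the new names sets the option once at the top of a script.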

--- conflicted

rpruim commented 5 years ago

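Some of the naming has been modified a bit. Documenting below and closing this issue. In each example, the lines starting with #> show the current output, and the lines starting with ## #> show the previous output for comparison.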

require(mosaic)

prop(~sex, data = HELPrct)
#> prop_female 
#>   0.2362031
## #>    female 
## #> 0.2362031

prop(~sex, data = HELPrct, success = "male")
#> prop_male 
#> 0.7637969
## #>      male 
## #> 0.7637969

prop(~sex | substance, data = HELPrct)
#> prop_female.alcohol prop_female.cocaine  prop_female.heroin 
#>           0.2033898           0.2697368           0.2419355
## #> female.alcohol female.cocaine  female.heroin 
## #>      0.2033898      0.2697368      0.2419355

df_stats(~sex, data = HELPrct, prop)
#>   prop_female
#> 1   0.2362031
## #>    prop_sex
## #> 1 0.2362031

df_stats(~sex, data = HELPrct, prop, fargs = list(success = "male"))
#>   prop_male
#> 1 0.7637969
## #>    prop_sex
## #> 1 0.7637969

df_stats(~sex | substance, data = HELPrct, prop)
#>   substance prop_female
#> 1   alcohol   0.2033898
#> 2   cocaine   0.2697368
#> 3    heroin   0.2419355
## #>   substance  prop_sex
## #> 1   alcohol 0.2033898
## #> 2   cocaine 0.2697368
## #> 3    heroin 0.2419355

df_stats(~sex, data = HELPrct, props)
#>   prop_female prop_male
#> 1   0.2362031 0.7637969
## #>   prop_female prop_male
## #> 1   0.2362031 0.7637969

df_stats(~sex, data = HELPrct, props, fargs = list(success = "male"))
#>   prop_female prop_male
#> 1   0.2362031 0.7637969
## #>   prop_female prop_male
## #> 1   0.2362031 0.7637969

df_stats(~sex | substance, data = HELPrct, props)
#>   substance prop_female prop_male
#> 1   alcohol   0.2033898 0.7966102
#> 2   cocaine   0.2697368 0.7302632
#> 3    heroin   0.2419355 0.7580645
## #>   substance prop_female prop_male
## #> 1   alcohol   0.2033898 0.7966102
## #> 2   cocaine   0.2697368 0.7302632
## #> 3    heroin   0.2419355 0.7580645

Created on 2019-01-10 by the reprex package (v0.2.1)