dtkaplan / TeachStatsWithR

Materials for the MOSAIC "Teaching Statistics with R and RStudio"

Would like better data set in R For Students Chapter #3

Open rpruim opened 11 years ago

rpruim commented 11 years ago

The old version made heavy use of the iris data set. We could stick with that, but I'd like to use something else. An ideal data set would have a couple of categorical variables, a couple of quantitative variables, and a good story. In particular, I'd like numerical summaries, histograms, and a scatterplot that become interesting when split by a categorical variable (iris was good for that).

The suggestion box is open.
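For reference, here is a minimal sketch of the three kinds of display described above, using the built-in iris data. It assumes lattice for plotting (the plotting calls elsewhere in this thread are lattice functions); the object names are illustrative only.

```r
library(lattice)  # ships with standard R installations

# Numerical summary split by the categorical variable
species_means <- aggregate(Sepal.Length ~ Species, data = iris, FUN = mean)
print(species_means)

# Histogram of one quantitative variable, one panel per species
p_hist <- histogram(~ Sepal.Length | Species, data = iris, layout = c(3, 1))
print(p_hist)

# Scatterplot of two quantitative variables, with points grouped by species
p_scatter <- xyplot(Sepal.Length ~ Sepal.Width, groups = Species,
                    data = iris, auto.key = TRUE)
print(p_scatter)
```

A replacement data set would need to support the same three calls with its own variables swapped in.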

nicholasjhorton commented 11 years ago

I'm partial to the HELPrct data set, and would be happy to provide more motivation/description as needed.

Just my $0.02,

Nick

Nicholas Horton Department of Mathematics and Statistics, Smith College Clark Science Center, Northampton, MA 01063-0001 http://www.math.smith.edu/~nhorton

rpruim commented 11 years ago

What are the compelling graphical displays from this data set? Looking for (at least):

• scatter plot with groups
• side-by-side boxplots or overlaid densityplots

dtkaplan commented 11 years ago

I'm trying out the fastR::trebuchet data set in drafting the multivariable book. It's got just two categorical variables (object and form). Object has a 1-to-1 relationship with projectileWt, which is nice for demo purposes. There's enough data to be interesting, but it's short enough that scatter plots aren't too crowded. There's a nice but simple story for the multivariable modeling chapter in predicting distance.

In my draft, I'm loading fastR and using it as an excuse to say, "Don't be afraid to load packages to suit your current need."
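A rough sketch of what that might look like, assuming the fastR package is installed. The column names distance, projectileWt, and form are assumptions based on the description above, so check them with names(trebuchet) first. The additive lm() call is only one possible model, not necessarily the one in the draft.

```r
library(fastR)    # loaded on demand, just for the trebuchet data
library(lattice)

# Assumption: columns are named distance, projectileWt, and form
print(names(trebuchet))

# Scatterplot of distance against projectile weight, grouped by form
p_treb <- xyplot(distance ~ projectileWt, groups = form,
                 data = trebuchet, auto.key = TRUE)
print(p_treb)

# One possible simple model for predicting distance
treb_fit <- lm(distance ~ projectileWt + form, data = trebuchet)
print(coef(treb_fit))
```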

nicholasjhorton commented 11 years ago

Models I've demonstrated with HELPrct in the past have predicted CESD (depressive symptoms) as a function of substance group (3 groups), sex, homeless status (2 groups), and MCS (a continuous measure).

bwplot(cesd ~ sex, data = HELPrct)

or

densityplot(~ cesd, groups = sex, data = HELPrct)

are pretty useful.

xyplot(cesd ~ mcs, groups = substance, data = HELPrct)

is also worth considering.
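The model described above could be fit along these lines. This is a sketch under assumptions: HELPrct is loaded from mosaicData (where current releases keep it), and a plain additive lm() is used because the thread does not say which model form was demonstrated.

```r
library(mosaicData)  # assumption: HELPrct lives here in current releases
library(lattice)

data(HELPrct)

# Additive linear model: depressive symptoms from the four predictors named above
help_fit <- lm(cesd ~ substance + sex + homeless + mcs, data = HELPrct)
print(summary(help_fit))

# Grouped scatterplot with a legend, to accompany the model
p_help <- xyplot(cesd ~ mcs, groups = substance, data = HELPrct, auto.key = TRUE)
print(p_help)
```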

Just my $0.02,

Nick

rpruim commented 11 years ago

Thanks. I'll take a look.
