sta199-s23-2 / project-statisfactory

https://sta199-s23-2.github.io/project-statisfactory/

Methodology #5

Open elignesin opened 1 year ago

elignesin commented 1 year ago

You do a good job of explaining what each graphic shows and why you are including it. The way you break the graphics down, and the order in which you present them, also makes sense. Much of your analysis reads as exploratory, though. Given this project, you may want to run a hypothesis test for each of age, gender, and race, asking whether there is a statistically significant difference in the proportion of people who are shot and tasered versus shot across each of those three variables. The last plot is also very good, and you could consider a hypothesis test on that set of variables as well.
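To make the hypothesis-test suggestion concrete, here is a minimal sketch of a two-sided two-proportion z-test, assuming you compare two groups at a time (for example, two gender categories). The function name and the counts in the usage example are hypothetical placeholders, not values from your data, and your course may prefer a simulation-based test instead.

```python
import math


def two_proportion_z_test(successes_a, n_a, successes_b, n_b):
    """Two-sided z-test for a difference between two proportions.

    successes_* : number of cases with the outcome of interest in each group
    n_*         : total number of cases in each group
    Returns (z statistic, two-sided p-value).
    """
    if n_a <= 0 or n_b <= 0:
        raise ValueError("group sizes must be positive")

    p_a = successes_a / n_a
    p_b = successes_b / n_b

    # Pooled proportion under the null hypothesis of no difference.
    pooled = (successes_a + successes_b) / (n_a + n_b)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    if se == 0:
        raise ValueError("standard error is zero; proportions are all 0 or all 1")

    z = (p_a - p_b) / se
    # Two-sided p-value from the standard normal: 2 * (1 - Phi(|z|)).
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value


if __name__ == "__main__":
    # Hypothetical placeholder counts, NOT taken from the project data.
    z, p = two_proportion_z_test(60, 100, 40, 100)
    print(f"z = {z:.3f}, p-value = {p:.4f}")
```

The normal approximation is only reasonable when each group has a fair number of cases in both outcome categories; for small groups, an exact or simulation-based test would be a safer choice.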

For the visualizations themselves:

  1. I think your visualization is very good, though I might consider using geom_point rather than geom_jitter (geom_jitter adds a small amount of random noise to each point's position, which you may not want). Overall, good visualization.
  2. Visualizations 2 and 3 are good, though because of the shared scaling you can see that some of the facets in Visualization 2 are virtually empty. That's not necessarily a bad thing, since you can discuss it, but it is something to be aware of. Also, the dark blue with alpha = 1 is a bit strong in these plots; you may want to use something like alpha = 0.7 instead.
  3. Visualizations 4 and 5 should not be scatterplots. You have binary data here along with age as a numeric variable, so you would be much better off using either overlaid density plots or overlaid histograms (I would use density plots). The geom_smooth line doesn't really add anything here, and I'm not sure it's necessary.
  4. Visualization 6 is good, except that it is a proportional stacked plot, so the y-axis shows a proportion rather than a "number of individuals" and should be labeled accordingly.
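To illustrate the point about Visualization 6, here is a small sketch of what a proportional stacked plot computes before drawing: within each group, counts are divided by the group total, so every bar sums to 1 regardless of group size. The function name and the sample records are hypothetical and only stand in for your variables.

```python
from collections import Counter


def within_group_proportions(records):
    """Convert (group, category) pairs into per-group category proportions.

    Returns {group: {category: proportion}}, where the proportions
    within each group sum to 1.
    """
    counts = {}
    for group, category in records:
        counts.setdefault(group, Counter())[category] += 1

    proportions = {}
    for group, counter in counts.items():
        total = sum(counter.values())
        proportions[group] = {cat: n / total for cat, n in counter.items()}
    return proportions


if __name__ == "__main__":
    # Hypothetical placeholder records, NOT taken from the project data.
    sample = [("A", "x"), ("A", "x"), ("A", "y"), ("B", "y")]
    for group, props in within_group_proportions(sample).items():
        print(group, props)
```

Because group sizes are normalized away, a y-axis label such as "Proportion of individuals" describes the plot more accurately than a count.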