Open snr35 opened 1 year ago
Peer review by: Stat4
Names of team members that participated in this review: Sreya, Laura, Tina, Sathvika
Describe the goal of the project.
Because police brutality and killings have become a national issue, the goal of the project is to examine trends in the data to determine whether BIPOC Americans are at greater risk of being killed by the police than white Americans.
Data was sourced from the Washington Post repository on fatal police shootings from 2015 to 2020. The repository depends on curated news reports and thus may be missing necessary data such as gender and minority status. The data was originally collected by manually combing through local news reports and combining information from law enforcement websites, social media, and other databases (including Fatal Encounters and the "Killed by Police" project). Data collection started in 2015, spurred by a slew of fatal shootings, and the information was updated in 2022.
In order to visualize disparities in killings between BIPOC and white Americans, the team decided to focus on 4 variables: age, shot, shot and tasered, and race. They wanted to visualize these 4 variables with 6 graphs.
In the first graph, the team used a box plot to visualize age against the shot / shot and tasered variable. The goal of this visualization is to understand the role that age plays and whether certain age groups are more vulnerable to violence.
The next two graphs focus on the age, gender, and shot / shot and tasered variables. One histogram is faceted by race, filled by age, and filtered to those who were shot. The second histogram is the same except that it is filtered to those who were shot and tasered. These two graphs build upon the first graph and introduce the role that race and age play in the killings.
The fourth and fifth graphs compare age, gender, and shot / shot and tasered. The fourth graph focuses on females, while the fifth graph focuses on males. In both graphs, the team put age on the x-axis and used logistic regression to compare shot (0) vs. shot and tasered (1).
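For readers unfamiliar with this kind of model, the idea behind graphs 4 and 5 can be sketched as below. This is only an illustration in plain Python, not the team's code: the ages and labels are made up, the function name `fit_logistic` is hypothetical, and the fit uses simple gradient descent rather than whatever routine the project actually calls.

```python
import math

# Made-up illustration data (NOT from the Washington Post dataset).
# Label 0 = shot, 1 = shot and tasered.
ages = [20, 25, 30, 35, 40, 45, 50, 55]
labels = [0, 0, 1, 0, 1, 0, 1, 1]

def fit_logistic(ages, labels, lr=0.1, steps=5000):
    """Fit P(label = 1 | age) = sigmoid(b0 + b1 * z), where z is standardized age."""
    n = len(ages)
    mean = sum(ages) / n
    sd = math.sqrt(sum((a - mean) ** 2 for a in ages) / n)
    zs = [(a - mean) / sd for a in ages]
    b0 = b1 = 0.0
    for _ in range(steps):
        g0 = g1 = 0.0
        for z, y in zip(zs, labels):
            p = 1.0 / (1.0 + math.exp(-(b0 + b1 * z)))
            g0 += p - y          # gradient of the log loss w.r.t. b0
            g1 += (p - y) * z    # gradient of the log loss w.r.t. b1
        b0 -= lr * g0 / n
        b1 -= lr * g1 / n

    def predict(age):
        z = (age - mean) / sd
        return 1.0 / (1.0 + math.exp(-(b0 + b1 * z)))

    return predict

predict = fit_logistic(ages, labels)
for age in (20, 40, 55):
    print(f"age {age}: estimated P(shot and tasered) = {predict(age):.2f}")
```

A single fitted curve of this kind already describes both outcomes, since the estimated probability of "shot" is one minus the estimated probability of "shot and tasered".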
The sixth graph focuses on whether the victim was fleeing and, if so, how. The team used a histogram with three qualitative categories on the x-axis (did not flee, fled by car, fled on foot), filled by the shot / shot and tasered variable. They use multiple linear regression to estimate the importance and relevance of the extra explanatory variable (fleeing) to the response variable (level of violence). The purpose of this graph was to help account for fleeing status as a potentially confounding variable.
For graph 1, there may be a better way to represent the relationship between age and the shot / shot and tasered variable. The box plot is hard to read, and it is difficult to find the exact median age of people shot. This could be fixed by instead using a histogram or bar graph of age faceted by “shot” and “shot and tasered”, which would make the median age for each category easier to read.
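If the aim is for readers to see the exact median age in each category, one option is to report the medians directly next to whichever plot is used. A minimal sketch in Python follows; the rows are made up for illustration and the helper name `median_age_by_manner` is hypothetical, so the team would substitute their own columns and language.

```python
from statistics import median

# Made-up (age, manner) pairs for illustration only, not real records.
rows = [
    (22, "shot"), (31, "shot"), (35, "shot"), (48, "shot"),
    (27, "shot and tasered"), (30, "shot and tasered"), (41, "shot and tasered"),
]

def median_age_by_manner(rows):
    """Group ages by manner and return the median age of each group."""
    groups = {}
    for age, manner in rows:
        groups.setdefault(manner, []).append(age)
    return {manner: median(ages) for manner, ages in groups.items()}

medians = median_age_by_manner(rows)
for manner, value in medians.items():
    print(f"{manner}: median age {value}")
```

These values could then be added as annotations or listed in a caption under the graph.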
For graphs 4 and 5, the line of best fit does not seem to add to the reader's understanding of the relationship between age and manner of violence. If you do choose to include the line of best fit, there should be a second one for the “Shot&Tasered” category, as right now there is none. We feel that a histogram would better represent the data.
The fifth graph can be improved visually by placing the labels (“Shot&Tasered” and “Shot”) either above or below the data points so that the graph is easier to read.
There is a statistical concern with this project that the team did not directly address. While the total dataset has a sufficient number of data points, the number of data points within each fleeing category is a potential confounding factor. It could explain why so many more people appear to be shot for “not fleeing” than for the other fleeing categories, a difference that is very visually apparent. For example, if the dataset had 75% “not fleeing” data points and only 3% “fled by foot,” then the sixth graph may be misleading. We recommend reporting the number of data points in each category and highlighting this potentially confounding variable.
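One way to surface this is to report, for each fleeing category, its count, its share of the whole dataset, and the share of its cases that were shot and tasered, so that categories of very different sizes can be compared fairly. The sketch below is in Python with made-up records; the category labels and the function name `category_summary` are assumptions for illustration, not the team's variables.

```python
from collections import Counter

# Made-up (fleeing status, manner) records for illustration only.
records = [
    ("not fleeing", "shot"),
    ("not fleeing", "shot"),
    ("not fleeing", "shot"),
    ("not fleeing", "shot and tasered"),
    ("car", "shot"),
    ("car", "shot"),
    ("foot", "shot"),
    ("foot", "shot and tasered"),
]

def category_summary(rows):
    """Return {fleeing status: (count, share of all rows, share shot and tasered)}."""
    totals = Counter(flee for flee, _ in rows)
    tasered = Counter(flee for flee, manner in rows if manner == "shot and tasered")
    n = len(rows)
    return {
        flee: (count, count / n, tasered[flee] / count)
        for flee, count in totals.items()
    }

summary = category_summary(records)
for flee, (count, share, tasered_share) in summary.items():
    print(f"{flee}: n={count}, {share:.0%} of data, {tasered_share:.0%} shot and tasered")
```

Plotting the within-category proportions instead of raw counts would keep a large “not fleeing” group from dominating the sixth graph.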
We are most interested in the new visualizations this team can make, as this is a comprehensive data set. We are also interested in understanding whether fleeing manner affects shooting manner. There is one graph on this so far, but we would like to see more!
There were no issues with reproducibility. The project was able to render without any issues. The teammates that cloned and rendered (1-2) were Sathvika and Sreya.
We feel that it would help to include the reasoning for each graph, or at least a few sentences describing what the graph is about and what quick deductions can be made from it, alongside the graph so that readers do not have to scroll to the bottom to view it. Overall, the introduction was clear. However, it would help to state your research question clearly at the beginning so that anyone quickly glancing at your file can understand the gist of your project.
We learned from this team’s project the value of visualizing relationships between multiple variables. We are considering implementing more trend analysis and graphs that focus on more variables in our dataset than just “UN Region” and “Income Level,” in order to provide a more holistic and in-depth review.
Peer review by: Team 6
Names of team members that participated in this review:
Sigrid Real-Aguilar, Cris Navas, Miran Bhima, Austin Chang
Describe the goal of the project.
The goal of the project is to evaluate different factors (such as race and age) and types of police violence (e.g., use of a taser). In addition, the project aims to explore the rate of fatalities in police shootings and to show that the fatalities in the current data are disproportionate.
Describe the data used or collected, if any. If the proposal does not include the use of a specific dataset, comment on whether the project would be strengthened by the inclusion of a dataset.
The variables collected include a person's age, race, fleeing method, and the type of police violence they were subjected to.
Describe the approaches, tools, and methods that will be used.
The team is doing a correlation analysis with a scatter plot. In addition, they have a bar graph and a box plot (with a scatter plot over it).
Provide constructive feedback on how the team might be able to improve their project. Make sure your feedback includes at least one comment on the statistical reasoning aspect of the project, but do feel free to comment on aspects beyond the reasoning as well.
There are two visualizations (dot plots) in which the points seem to have been pushed to the two extremes of the y-axis. Perhaps a box plot could represent this data better.
What aspect of this project are you most interested in and would like to see highlighted in the presentation?
We are interested in understanding whether fleeing manner affects shooting manner, and whether shooting rates have changed throughout the years (including because of COVID). In addition, we'd be interested in seeing how this team presents new information (i.e., creating new connections with visualizations, something that you can't just find online).
Were you able to reproduce the project by clicking on Render Website once you cloned it? Were there any issues with reproducibility?
It was a success!
Provide constructive feedback on any issues with file and/or code organization.
We recommend putting the hypothesis in the 'introduction' section rather than the results.
What have you learned from this team's project that you are considering implementing in your own project?
We really like the organization of the project. For example, our team is looking to add a 'limitations' section.
(Optional) Any further comments or feedback?
This was a very organized project, and very easy to understand.