The introduction should introduce your general research question and your data (where it came from, how it was collected, what are the cases, what are the variables, etc.).
What is their general research question: What is the relationship between high school graduation rates by state and life expectancy?
Is the general research question clear? If it is not clear, what questions do you have?
Yes, it is clear!
N/A
Section 2 - Data:
[X] Is the data in the /data folder?
[ ] Does the README include the dimensions and codebook for the data set?
[X] Does the proposal include the output of glimpse() or skim() of the data frame?
Data suitability:
[X] Does the dataset have at least 50 observations and between 10 and 20 variables (exceptions can be made)?
[X] Does the data set include a mix of categorical variables, discrete numerical variables, and continuous numerical variables?
[X] What variables does the data include (list below): The data contains discrete numerical variables, continuous numerical variables, and categorical variables.
Section 3 - Data analysis plan:
[X] Does the proposal include the outcome (response, Y) and predictor (explanatory, X) variables they will use to answer their question, and/or the comparison groups they will use, if applicable?
The predictor variable is high school graduation rates, and the response variable is life expectancy.
Do the outcome and predictor variables and/or comparison groups make sense in the context of the question? Why or why not?
We believe they do make sense in the context of the question because higher rates of education could affect future professions and hence, wealth. Wealth generally leads to better healthcare and healthier lives, so we could imagine it would affect life expectancy.
[X] Does the proposal include some very preliminary exploratory data analysis, including some summary statistics and visualizations, along with some explanation of how they help you learn more about your data? (They can add to these later as they work on their project.)
[X] Does the proposal include the statistical method(s) that they believe will be useful in answering their question(s)? (They can update these later as they work on their project.)
[X] Do they include what results from these specific statistical methods are needed to support their hypothesized answer?
Reflections
What was something you found interesting about the project?
We found it interesting how they included other factors, not just high school education, and related them to life expectancy and health. They also looked at other studies to get an idea of what to expect from the data.
What ideas/feedback do you have for other things they may explore?
It could be interesting to see whether a one-point drop in graduation rates has a linear effect on life expectancy, or whether drastically lower rates have exponential effects.
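One way to check this could be to fit a straight-line model and a curved model and compare them. The sketch below is only an illustration under stated assumptions: the data are simulated placeholders (not values from your dataset), the column names `grad_rate` and `life_exp` are hypothetical stand-ins for your own variables, and a quadratic term is used as a simple stand-in for a nonlinear effect rather than a true exponential form.

```r
# Illustrative sketch only: the data below are simulated placeholders,
# not values from the proposal's dataset. The column names `grad_rate`
# and `life_exp` are hypothetical stand-ins for the group's own variables.
set.seed(1)
states <- data.frame(grad_rate = runif(50, min = 70, max = 95))
states$life_exp <- 70 + 0.1 * states$grad_rate + rnorm(50, sd = 0.5)

# Straight-line model: each one-point change in graduation rate has the
# same estimated effect on life expectancy.
fit_linear <- lm(life_exp ~ grad_rate, data = states)

# Quadratic model: lets the estimated effect grow or shrink as rates change.
fit_curved <- lm(life_exp ~ poly(grad_rate, 2), data = states)

# F-test for whether the curved model fits significantly better, plus
# AIC (lower is better) as a second point of comparison.
comparison <- anova(fit_linear, fit_curved)
print(comparison)
print(AIC(fit_linear, fit_curved))
```

If the curved model does not fit noticeably better than the straight-line model, that would suggest a linear relationship is a reasonable description over the range of graduation rates in the data.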
What kinds of plots should they consider to complete the project goal to create some kind of compelling visualization(s) of this data in R?
Plots could include scatter plots, a histogram if states are ignored, or a literal map of the states made with ggplot.
Any additional feedback you'd like to give the other group:
We really liked what you are looking for and the data you are working with. Keep up the great work!
Clear presentation of question and good use of exploratory plots.
1 point deducted because you are missing the codebook for the dataset that explains what each variable is.
Please add the data and the codebook to the README in the data folder.
Feedback on overall question
I am concerned that, in the time you have and with the dataset you have, it will be difficult to establish anything beyond a correlation between life expectancy and graduation rates. Although past research shows there is a link, those studies were able to look at a range of factors, and they used historical datasets where the link was determined retrospectively. I'm not sure it's reasonable to project the life expectancy outcomes for students who graduated in the last 10 years.
Rather than looking at the link between life expectancy and graduation rates, why don't you explore the factors that influence graduation rates? You may be able to find other details about school funding, economic opportunities, class size, and school ranking that might generate hypotheses for why graduation rates are lower in certain states and/or for certain demographics. You can still justify the need to understand graduation rates, because they are important for future job opportunities and for other aspects related to health such as life expectancy, without that being the primary focus.
Proposal Review
Reviewers Team Name: Team Shoes Date: 10/13/2021
Section 1 - Introduction: