What is your response variable, and how will you calculate smoking frequency? The data set reports percentages for four different smoking-frequency categories, so think about how to wrangle it into the format you need.
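As one possible starting point for the wrangling, here is a minimal sketch in Python. The column names (`every_day`, `some_days`, `former`, `never`) and the numbers are made up for illustration and are not taken from your data set, and summing the two "currently smoking" categories is only one of several reasonable ways to define a single response.

```python
# Hypothetical rows: one per state, with four percentage columns.
# Column names and values are illustrative assumptions, not real data.
rows = [
    {"state": "A", "every_day": 12.0, "some_days": 5.0, "former": 25.0, "never": 58.0},
    {"state": "B", "every_day": 18.5, "some_days": 6.5, "former": 24.0, "never": 51.0},
]


def current_smoker_pct(row):
    """Collapse the four categories into one response: percent who currently smoke."""
    return row["every_day"] + row["some_days"]


# One response value per state, ready to be joined with other state-level variables.
response = {row["state"]: current_smoker_pct(row) for row in rows}
```

Whichever definition you choose, state it explicitly in the proposal so the response variable is unambiguous.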
In your hypotheses, how do you define urban, developed areas and very rural areas at the state level? Do you have a specific definition, or enough information to define them yourself? Likewise for hypothesis 3: I don't see any population information in the data set itself, so how will you incorporate population into your hypothesis testing?