Open cernhofer opened 6 years ago
Do you link your multiple data sources down to the individual level?
Thanks!
I like your project. Good use of linear regression, decision trees, and random forests.
Physical environment sends a direct message / ecological effect. Linking data on demographic variables and perceived safety/disorder to teen pregnancy, at the zip code level. Why so high? Why not Census block (with ACS)? Linear results presented. Why not over all urban areas in all cities, with ACS linked to data on perceived safety? Super interesting / provocative association.
How do you control for the fact that built environments with a lot of disorder could have high correlations with other theorized causes for high teen birth rates?
I like the dataset you used, and that you applied deep learning and validated your results with different validation methods. It would be great if you could show the results with graphs :)
Nice pres. Possible extension: how do rural physical environments compare to urban ones (both high & low disorder)?
This is really cool! Can you explain further how the social and ecological aspects are distinguished? It seems that the crime rate variable belongs to the social aspect but is included in your model?
Beautiful presentation! I believe you need to convince the audience that you do not have a problem of omitted variables (everything correlated with your index of order).
Great, excited to see you using both regression and ML models. One thought on the regression and its outcome: if the outcomes are ordered survey response items, an ordered logistic regression or multinomial logit may be better, depending on the dispersion of the DV.
Have you tried doing some cross validation?
@jfan3 I WISH! Individual level data connected to geographical context is hard/impossible to come by. I actually spent most of my time last quarter trying to find individual level data that I could add to the model. Thanks for the comment!
@bensoltoff Thanks for the comment!
@jamesallenevans thanks for the comment!
Basically the answer to everything is data constraints: the perinatal (teen birth) data I have is at the zip code level, and the StreetScore data is available only for New York City and Boston. Expanding to this second city is my next step.
@bethbailey Basically just by including some of those alternative approaches as controls in the model. I think there's a large weakness in my analysis in that I have no way of looking at individual/family level variance, which I think could have large influences. However, the high R-squared and the significance of neighbourhood order in the linear model give me some confidence, at least, that the work isn't pure garbage.
Thanks for the comment!
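For anyone following the omitted-variable discussion, here is a minimal sketch of what "including a control" does to a coefficient. Everything in it is an assumption for illustration: the data are synthetic, `poverty` is a hypothetical confounder, and the effect sizes are made up. It is not the model from the presentation. The adjusted slope is computed by partialling the control out of both variables (the Frisch-Waugh-Lovell result), which gives the same coefficient as a two-predictor regression.

```python
import random
import statistics


def ols(xs, ys):
    """Return (intercept, slope) for the least-squares line y = a + b*x."""
    mean_x = statistics.fmean(xs)
    mean_y = statistics.fmean(ys)
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


def residuals(xs, ys):
    """Return what is left of ys after regressing it on xs."""
    a, b = ols(xs, ys)
    return [y - (a + b * x) for x, y in zip(xs, ys)]


# Synthetic data: a hypothetical confounder drives both disorder and the outcome.
rng = random.Random(0)
n = 2000
poverty = [rng.gauss(0, 1) for _ in range(n)]
disorder = [0.8 * p + rng.gauss(0, 1) for p in poverty]
# Assumed true effect of disorder is 1.0; poverty adds its own effect of 2.0.
birth_rate = [1.0 * d + 2.0 * p + rng.gauss(0, 1)
              for d, p in zip(disorder, poverty)]

# Without the control, the disorder slope also absorbs the poverty effect.
_, naive_slope = ols(disorder, birth_rate)

# With the control: remove poverty from both variables, then regress.
_, adjusted_slope = ols(residuals(poverty, disorder),
                        residuals(poverty, birth_rate))

print(f"naive slope:    {naive_slope:.2f}")
print(f"adjusted slope: {adjusted_slope:.2f}")
```

The adjusted slope only recovers the assumed effect here because the confounder is observed. A confounder that is not in the data, such as the individual/family level variation mentioned above, would still bias the estimate.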
@Alicechung Great comment- thanks! You're totally right!
@lpwarner Thanks for the comment- that would be super interesting. Unfortunately, the specific way I'm operationalizing neighbourhood order/disorder (StreetScore) has only been applied to a few select urban areas in the States. If they ever make their algorithm public I would do that in an instant!
@xiuyuanzhang Thanks for the comment- I included crime mostly as a control for at least some form of social influence but you're correct in that it definitely creates confusion when I describe my model as purely ecological.
@rodrigovaldes Thanks!
@jmausolf I don't know if I understand your question. My response variable is a quantitative rate of teen birth. I'm not quite sure if there's an intuitive method whereby I could divide this variable into two groups in order to warrant a logistic regression.
@ruixue-li Yes! I didn't talk about it due to time constraints but I was able to do cross validation for the models and results were fairly consistent. Thanks for the question!
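For readers unfamiliar with cross validation, a minimal sketch of k-fold cross validation follows. It uses synthetic data and a one-predictor linear model, both assumptions for illustration; the variable names and numbers are made up and this is not the setup from the presentation.

```python
import random
import statistics


def fit_simple_ols(xs, ys):
    """Return (intercept, slope) for the least-squares line y = a + b*x."""
    mean_x = statistics.fmean(xs)
    mean_y = statistics.fmean(ys)
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


def k_fold_rmse(xs, ys, k=5, seed=0):
    """Return the held-out RMSE for each of k folds."""
    indices = list(range(len(xs)))
    random.Random(seed).shuffle(indices)
    # Every index lands in exactly one fold.
    folds = [indices[i::k] for i in range(k)]
    scores = []
    for held_out in folds:
        held = set(held_out)
        train = [i for i in indices if i not in held]
        # Fit on the training folds only, then score on the held-out fold.
        a, b = fit_simple_ols([xs[i] for i in train], [ys[i] for i in train])
        sq_errors = [(ys[i] - (a + b * xs[i])) ** 2 for i in held_out]
        scores.append((sum(sq_errors) / len(sq_errors)) ** 0.5)
    return scores


# Synthetic stand-in for 100 zip codes: a disorder score and a teen birth rate.
rng = random.Random(42)
disorder = [rng.uniform(0, 10) for _ in range(100)]
birth_rate = [20 + 3 * d + rng.gauss(0, 2) for d in disorder]

fold_rmse = k_fold_rmse(disorder, birth_rate, k=5)
print([round(score, 2) for score in fold_rmse])
```

Similar errors across folds are what "fairly consistent" would look like in practice; a fold with a much larger error would suggest the fit depends heavily on which zip codes it was trained on.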
While various neighborhood demographics (esp. disadvantage indicators) are controlled for, I am wondering if you have considered the effect of neighborhood gender and age composition on the teen birth rate. Presumably, a neighborhood with a higher proportion of teen males and younger teens might have a lower teen birth rate than one with more females and a more mature teen population. It would, therefore, be important to account for gender and age composition at the neighborhood level.
What standardization methods did you use to make each zip code area consistent?
🙌