DS4PS / cpp-528-fall-2020

Course shell for CPP 528 Foundations of Data Science III - Project Management
http://ds4ps.org/cpp-528-fall-2020/

Week 02 Lab: Anyone figure out what `globd` or `globg` mean? #11

Open cenuno opened 3 years ago

cenuno commented 3 years ago

I couldn't find either variable in the codebook, and searching Google didn't turn up anything either.

If no one finds the answer feel free to move forward without supplying definitions for these columns.

lecy commented 3 years ago

These are categorical variables describing the racial composition of the neighborhoods (tracts).

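| GlobD10 | GlobG10 |
|---------|---------|
| bw | White Black |
| wha | Dual immig |
| bw | White Black |
| wha | Dual immig |
| bw | White Black |
| wha | Dual immig |
| bw | White Black |
| wha | Dual immig |
| wba | Semi global |
| wh | Single immig |
| bw | White Black |
| wha | Dual immig |
| bw | White Black |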

They were included in the raw files, but I don't recall seeing documentation anywhere. So yes, please create your own descriptor or just ignore them.
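If you do want to write your own descriptor, one way to start is to tabulate how the two columns pair up. The sketch below is only an illustration: it hard-codes the sample rows listed above rather than reading the raw files, and the names `rows`, `pair_counts`, `labels`, `ambiguous`, and `descriptor` are made up for this example. Any lookup built from so few rows is an assumption until it is checked against the full data.

```python
from collections import Counter

# The sample rows listed above, as (GlobD10, GlobG10) pairs.
# Hard-coded for illustration; the real values live in the raw files.
rows = [
    ("bw", "White Black"),
    ("wha", "Dual immig"),
    ("bw", "White Black"),
    ("wha", "Dual immig"),
    ("bw", "White Black"),
    ("wha", "Dual immig"),
    ("bw", "White Black"),
    ("wha", "Dual immig"),
    ("wba", "Semi global"),
    ("wh", "Single immig"),
    ("bw", "White Black"),
    ("wha", "Dual immig"),
    ("bw", "White Black"),
]

# Count how often each (GlobD10, GlobG10) pairing occurs.
pair_counts = Counter(rows)

# Collect every GlobG10 label seen for each GlobD10 code.
labels = {}
for code, label in rows:
    labels.setdefault(code, set()).add(label)

# Codes that map to more than one label would need a closer look.
ambiguous = {code: names for code, names in labels.items() if len(names) > 1}

# Tentative code -> label lookup, kept only for unambiguous codes.
descriptor = {
    code: next(iter(names))
    for code, names in labels.items()
    if len(names) == 1
}

# Print the pairings, most frequent first.
for (code, label), n in pair_counts.most_common():
    print(f"{code:<4} {label:<13} {n}")
```

In this sample each `GlobD10` code pairs with exactly one `GlobG10` label, so `ambiguous` comes out empty. That only suggests the two columns move together; it does not explain how either was constructed.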

There are a handful of variables that were not included in the codebook provided by the group that created the data. Bad practice, but also not uncommon if a project evolves over time.

cenuno commented 3 years ago

@lecy thank you for the context and explanation! This only prompts more questions for me, particularly:

For now I'll put these in my back pocket as something to explore once we move on to data exploration next week. Appreciate you shining a light on these variables!

lecy commented 3 years ago

I had the same questions. I'm sure there was some coherent methodology behind the creation of the variables, but I have no idea what it was.