DS4PS / cpp-529-fall-2020

http://ds4ps.org/cpp-529-fall-2020/
Manipulating metro level statistics #20

Open ekmcintyre opened 4 years ago

ekmcintyre commented 4 years ago

To find the actual change and growth in median home values I used the following code:

real.mhv.change <- mhv.change - d$metro.mhv.change

real.mhv.growth <- mhv.growth - d$metro.mhv.growth

The metro variables are taken from the Metro Level Statistics section of the tutorial, which defines them as the average growth in median home values for the city.

I tried to use the new vectors in a correlation plot and got an error message:

[image: screenshot of the code and error message]

Should I not bother with the metro stats and use the regular mhv.growth and mhv.change vectors in my analysis?

ekmcintyre commented 4 years ago

The code in the picture was

d2 <- select( d, real.mhv.change, real.mhv.growth, p.white, p.own,  pov.rate )
d2$pov.rate <- log10( d2$pov.rate + 1 )

set.seed( 1234 ) 
d3 <- sample_n( d2, 10000 ) %>% na.omit()
pairs( d3, upper.panel=panel.cor, lower.panel=panel.smooth )

and the error message was

Error: Must subset columns with a valid subscript vector.
x Can't convert from <double> to <integer> due to loss of precision.

lecy commented 4 years ago

What are the data types and summary stats on d3?

summary( d3 )
str( d3 )

ekmcintyre commented 4 years ago

[image: screenshot of the summary( d3 ) and str( d3 ) output]

lecy commented 4 years ago

Are these variables all in d?

select( d, real.mhv.change, real.mhv.growth, p.white, p.own,  pov.rate )
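If they are not, that would be consistent with the error: when a name passed to select() is not a column of d, dplyr can fall back to a free-standing vector of the same name and try to use its values as column positions. The following is a minimal sketch of one possible fix, assuming real.mhv.change and real.mhv.growth currently exist only as separate vectors. The toy data frame and all of its values are invented for illustration, and it assumes mhv.change and mhv.growth are already columns of d.

```r
library( dplyr )

# Toy stand-in for the course data frame d; every value here is invented
d <- data.frame(
  mhv.change       = c( 10000, 25000, -5000 ),
  mhv.growth       = c( 10, 30, -4 ),
  metro.mhv.change = c( 8000, 8000, 8000 ),
  metro.mhv.growth = c( 12, 12, 12 ),
  p.white          = c( 80, 45, 60 ),
  p.own            = c( 70, 35, 55 ),
  pov.rate         = c( 0.05, 0.30, 0.12 )
)

# Check which of the requested names are actually columns of d
c( "real.mhv.change", "real.mhv.growth" ) %in% names( d )

# Store the adjusted measures as columns of d rather than as separate vectors
d$real.mhv.change <- d$mhv.change - d$metro.mhv.change
d$real.mhv.growth <- d$mhv.growth - d$metro.mhv.growth

# select() now finds every name as a column of d
d2 <- select( d, real.mhv.change, real.mhv.growth, p.white, p.own, pov.rate )
```

With the new columns attached to d, the rest of the code from the earlier comment should run unchanged.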

Note that the percentages are all scaled from 0 to 100, but your poverty rate is scaled from 0 to 1.
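
The scale matters for the log transform used above: with values between 0 and 1, log10( x + 1 ) can only range from 0 to about 0.3. A minimal sketch of rescaling before transforming is below, assuming you want pov.rate on the same 0 to 100 scale as the other percentages. The values and the names pov.rate.pct and log.pov.rate are invented for illustration.

```r
# Invented poverty rates on a 0 to 1 scale, standing in for d$pov.rate
pov.rate <- c( 0.05, 0.30, 0.12 )

# Rescale to 0 to 100 so the variable is comparable to p.white and p.own
pov.rate.pct <- 100 * pov.rate

# Apply the same log transform used earlier in the thread
log.pov.rate <- log10( pov.rate.pct + 1 )
```

Whether to rescale is a judgment call; the correlations in the plot do not depend on it before the log transform, but the shape of the transformed variable does.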