Open Rachel-Judd opened 11 months ago
Did you have the following?
data.dictionary <- data.frame(
LABEL = c(
"tractid", "pnhwht12", "pnhblk12", "phisp12", "pntv12", "pfb12",
"polang12", "phs12", "pcol12", "punemp12", "pflabf12", "pprof12",
"pmanuf12", "pvet12", "psemp12", "hinc12", "incpc12", "ppov12",
"pown12", "pvac12", "pmulti12", "mrent12", "mhmval12", "p30old12",
"p10yrs12", "p18und12", "p60up12", "p75up12", "pmar12", "pwds12",
"pfhh12"
),
VARIABLE = c(
"GEOID", "Percent white, non-Hispanic", "Percent black, non-Hispanic",
"Percent Hispanic", "Percent Native American race", "Percent foreign born",
"Percent speaking other language at home, age 5 plus",
"Percent with high school degree or less", "Percent with 4-year college degree or more",
"Percent unemployed", "Percent female labor force participation",
"Percent professional employees", "Percent manufacturing employees",
"Percent veteran", "Percent self-employed", "Median HH income, total",
"Per capita income", "Percent in poverty, total", "Percent owner-occupied units",
"Percent vacant units", "Percent multi-family units", "Median rent",
"Median home value", "Percent structures more than 30 years old",
"Percent HH in neighborhood 10 years or less", "Percent 17 and under, total",
"Percent 60 and older, total", "Percent 75 and older, total",
"Percent currently married, not separated", "Percent widowed, divorced and separated",
"Percent female-headed families with children"
)
)
df.pct <- sapply( d2, ntile, 100 )
d4 <- as.data.frame( df.pct )
d4$cluster <- as.factor( paste0("GROUP-",fit$classification) )
num.groups <- length( unique( fit$classification ) )
stats <-
d4 %>%
group_by( cluster ) %>%
summarise( across( everything(), mean ) )  # replaces deprecated summarise_each( funs(mean) )
t <- data.frame( t(stats), stringsAsFactors=F )
names(t) <- paste0( "GROUP.", 1:num.groups )
t <- t[-1,]
for( i in 1:num.groups )
{
z <- t[,i]
plot( rep(1,30), 1:30, bty="n", xlim=c(-75,100),
type="n", xaxt="n", yaxt="n",
xlab="Percentile", ylab="",
main=paste("GROUP",i) )
abline( v=seq(0,100,25), lty=3, lwd=1.5, col="gray90" )
segments( y0=1:30, x0=0, x1=100, col="gray70", lwd=2 )
text( -0.2, 1:30, data.dictionary$VARIABLE[-1], cex=0.85, pos=2 )
points( z, 1:30, pch=19, col="firebrick", cex=1.5 )
axis( side=1, at=c(0,50,100), col.axis="gray", col="gray" )
}
Thank you. Yes, when I include everything above, the results are still off.
Hi @antjam-howell, for module 4, to create the initial groups in part 1, should I have added the dictionary? I tried running it without the data.dictionary assignment from part 2, but it said data.dictionary was not found. When I include the data.dictionary chunk, the groups seem incorrect: "Black, non-hispanic" is listed twice and there are many conflicting results. Did anyone else come across this?
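One possible way to check whether the row labels line up with the plotted values is to look up each column's description by name instead of relying on position (the plotting loop pairs VARIABLE[-1] with rows 1:30, which only works if d2 has exactly those 30 columns in dictionary order). This is a minimal sketch, not the lab's solution: the small d2 and the shortened dictionary below are hypothetical stand-ins, and it assumes the column names of d2 match the LABEL values.

```r
# Hypothetical, shortened stand-ins for the objects in this thread
data.dictionary <- data.frame(
  LABEL    = c( "tractid", "pnhwht12", "pnhblk12", "phisp12" ),
  VARIABLE = c( "GEOID", "Percent white, non-Hispanic",
                "Percent black, non-Hispanic", "Percent Hispanic" ),
  stringsAsFactors = FALSE
)

# Toy data whose columns are deliberately NOT in dictionary order
d2 <- data.frame( phisp12  = c( 10, 20 ),
                  pnhwht12 = c( 70, 60 ),
                  pnhblk12 = c( 20, 20 ) )

# Find the dictionary row for each column of d2 by name
idx <- match( names( d2 ), data.dictionary$LABEL )

# Stop early if any column has no dictionary entry
stopifnot( !anyNA( idx ) )

# Descriptions in the same order as the columns of d2
var.labels <- data.dictionary$VARIABLE[ idx ]

# Stop early if any description appears more than once
stopifnot( anyDuplicated( var.labels ) == 0 )

print( data.frame( column = names( d2 ), label = var.labels ) )
```

If either check fails on the real data, the labels and the plotted rows are probably out of step. In that case var.labels could be passed to text() in place of data.dictionary$VARIABLE[-1].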