DS4PS / cpp-529-fall-2020

http://ds4ps.org/cpp-529-fall-2020/

Lab-04- Identifying Neighborhood Clusters #12

Open gzbib opened 3 years ago

gzbib commented 3 years ago

Hello Sir @lecy

I got stuck while printing out the groups. One variable, "percent white non-hispanic", keeps going missing from the plot, as shown in the attached screenshot. I tried changing the dimensions, but that did not work. I also tried changing the index in VARIABLE[-1], but that didn't work either.


```r
df.pct <- sapply( d2, ntile, 100 )
d4 <- as.data.frame( df.pct )
d4$cluster <- as.factor( paste0("GROUP-",fit$classification) )

num.groups <- length( unique( fit$classification ) )

stats <- 
d4 %>% 
  group_by( cluster ) %>% 
  summarise_each( funs(mean) )

t <- data.frame( t(stats), stringsAsFactors=F )
names(t) <- paste0( "GROUP.", 1:num.groups )
t <- t[-1,]

for( i in 1:num.groups )
{
  z <- t[,i]
  plot( rep(1,30), 1:30, bty="n", xlim=c(-75,100), 
        type="n", xaxt="n", yaxt="n",
        xlab="Percentile", ylab="",
        main=paste("GROUP",i) )
  abline( v=seq(0,100,25), lty=7, lwd=7, col="gray90" )
  segments( y0=1:30, x0=0, x1=100, col="gray70", lwd=2 )
  text( -0.5, 1:30, data.dictionary$VARIABLE[-1], cex=0.85, pos=2 )
  points( z, 1:30, pch=19, col="firebrick", cex=1.5 )
  axis( side=1, at=c(0,50,100), col.axis="gray", col="gray" )
}
```

[Screenshot: Capture - Groups]

lecy commented 3 years ago

The code is running (no error) but you can't see the label?

Did you try changing the fig.width and fig.height arguments in the code chunk?

lecy commented 3 years ago

The lab uses:

```{r, fig.width=10, fig.height=8}
```

gzbib commented 3 years ago

I have now tried changing the width and height as you suggested, but it didn't work. I then changed VARIABLE[-1] to VARIABLE[-3] and got the output below:


```r
df.pct <- sapply( d2, ntile, 100 )
d4 <- as.data.frame( df.pct )
d4$cluster <- as.factor( paste0("GROUP-",fit$classification) )

num.groups <- length( unique( fit$classification ) )

stats <- 
d4 %>% 
  group_by( cluster ) %>% 
  summarise_each( funs(mean) )

t <- data.frame( t(stats), stringsAsFactors=F )
names(t) <- paste0( "GROUP.", 1:num.groups )
t <- t[-1,]

for( i in 1:num.groups )
{
  z <- t[,i]
  plot( rep(1,30), 1:30, bty="n", xlim=c(-75,100), 
        type="n", xaxt="n", yaxt="n",
        xlab="Percentile", ylab="",
        main=paste("GROUP",i) )
  abline( v=seq(0,100,25), lty=3, lwd=1.5, col="gray90" )
  segments( y0=1:30, x0=0, x1=100, col="gray70", lwd=2 )
  text( -0.5, 1:30, data.dictionary$VARIABLE[-3], cex=0.85, pos=2 )
  points( z, 1:30, pch=19, col="firebrick", cex=1.5 )
  axis( side=1, at=c(0,50,100), col.axis="gray", col="gray" )
}
```

[Screenshot: Capture3]

lecy commented 3 years ago

Hmm... I see what's happening. If the data dictionary got sorted you would be dropping different lines.

I think that VARIABLE[-1] was originally dropping the label row. I'm not sure what you are dropping here with [-3]?

So it's not a matter of things not fitting? It is that you are missing a variable? I don't completely understand the question.
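For anyone following along, here is a small sketch of what negative indexing does in R and why a re-sorted dictionary changes which entry gets dropped. The vector below is a made-up stand-in for data.dictionary$VARIABLE; its names are assumptions, not the lab's actual values.

```r
# Hypothetical stand-in for data.dictionary$VARIABLE (names are illustrative).
labels <- c( "LABEL", "pct.white", "pct.black", "pct.hisp" )

# A negative index removes exactly the element at that position.
drop.first <- labels[-1]   # drops "LABEL"
drop.third <- labels[-3]   # drops "pct.black" and keeps "LABEL"

# After re-sorting, a different element sits in position 1,
# so the same [-1] now removes a real variable instead of the label row.
sorted.labels     <- sort( labels, decreasing = TRUE )
drop.first.sorted <- sorted.labels[-1]

print( drop.first )
print( drop.third )
print( drop.first.sorted )
```

Comparing length( data.dictionary$VARIABLE[-1] ) with nrow( t ) before plotting is a quick way to check whether the labels and the rows still line up.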

gzbib commented 3 years ago

Hello Sir,

Yes, the issue was a missing row. Logically, VARIABLE[-3] should mean we are dropping more than a row, but for some reason the missing variable appears when I run this.
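One way to avoid depending on row position at all is to look the labels up by name. The sketch below is only an illustration under stated assumptions: the dictionary columns LABEL and VARIABLE, the variable codes, and the object t.stats (a stand-in for the transposed table of group means) are hypothetical and may not match the lab's data.

```r
# Hypothetical dictionary; column names and codes are assumptions.
data.dictionary <- data.frame(
  LABEL    = c( "pnhwht12", "pnhblk12", "phisp12" ),
  VARIABLE = c( "Percent white, non-Hispanic",
                "Percent black, non-Hispanic",
                "Percent Hispanic" ),
  stringsAsFactors = FALSE )

# Toy stand-in for the transposed group means: one row per variable,
# deliberately in a different order than the dictionary.
t.stats <- data.frame( GROUP.1 = c( 20, 65, 40 ),
                       GROUP.2 = c( 80, 30, 55 ),
                       row.names = c( "phisp12", "pnhwht12", "pnhblk12" ) )

# Match each plotted row to its dictionary entry by name, not by position.
idx         <- match( rownames( t.stats ), data.dictionary$LABEL )
plot.labels <- data.dictionary$VARIABLE[ idx ]

# Guard: every row must have found a label before anything is plotted.
stopifnot( length( plot.labels ) == nrow( t.stats ), !anyNA( plot.labels ) )
print( plot.labels )
```

With labels built this way, text( -0.5, 1:nrow(t.stats), plot.labels, ... ) stays aligned with the points even if the dictionary is sorted or gains extra rows.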
