DS4PS / cpp-529-master

Course files for CPP 529 Data Analytics Practicum focused on models of neighborhood change.
https://ds4ps.org/cpp-529-master/

Lab 06 #18

Open katiegentry07 opened 4 years ago

katiegentry07 commented 4 years ago

I'm struggling with labeling the clusters in Part 2, Step 2. I know the code needs to be changed to reflect the variables, but I'm unsure of where to go next.

stats <- 
  Census2010 %>% 
  group_by( cluster ) %>% 
  select(keep.these2010)%>% 
  summarise_each( funs(mean) )

t <- data.frame( t(stats), stringsAsFactors=F )
names(t) <- paste0( "GROUP.", 1:4 )
t <- t[-1,]

for( i in 1:4 )
{
  z <- t[,i]
  plot( rep(1,8), 1:8, bty="n", xlim=c(-1,1), 
        type="n", xaxt="n", yaxt="n",
        xlab="Score", ylab="",
        main=paste("GROUP",i) )
  abline( v=seq(0,.5,.1), lty=3, lwd=1.5, col="gray90" )
  segments( y0=1:8, x0=0, x1=100, col="gray70", lwd=2 )
  text( -0, 1:8, keep.these, cex=0.85, pos=2 )
  points( z, 1:8, pch=19, col="firebrick", cex=1.5 )
  axis( side=1, at=c(0,.3,.6), col.axis="gray", col="gray" )
}

The error states: Error in xy.coords(x, y) : 'x' and 'y' lengths differ

sunaynagoel commented 4 years ago

Hi @katiegentry07, my guess is that the original lab used only 8 variables, while your answer uses more. I had 15. Changing every hard-coded 8 in the plotting code (each 1:8, plus the matching rep(1,8)) to 15, or to however many variables you kept, might resolve this problem. Hope this helps. ~Sunayna
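One way to avoid the length mismatch altogether is to derive the count from the data instead of typing 8 or 15. The following is only a sketch: the variable names, the two-group table, and its values are made up for illustration, and `n.vars` is a hypothetical helper name that does not appear in the lab.

```r
# Toy stand-in for the table of cluster means; values are invented.
vars <- c("Poverty10", "Pop.Unemp10", "Pop.Prof10")   # hypothetical variable subset
t <- data.frame(GROUP.1 = c(0.2, 0.1, 0.4),
                GROUP.2 = c(0.5, 0.3, 0.1),
                row.names = vars)

n.vars <- nrow(t)   # number of variables, derived rather than hard-coded

for (i in seq_len(ncol(t))) {
  z <- as.numeric(t[, i])
  # x and y now always have the same length: n.vars
  plot(rep(1, n.vars), seq_len(n.vars), bty = "n", xlim = c(-1, 1),
       type = "n", xaxt = "n", yaxt = "n",
       xlab = "Score", ylab = "", main = paste("GROUP", i))
  segments(y0 = seq_len(n.vars), x0 = 0, x1 = 1, col = "gray70", lwd = 2)
  text(0, seq_len(n.vars), vars, cex = 0.85, pos = 2)
  points(z, seq_len(n.vars), pch = 19, col = "firebrick", cex = 1.5)
  axis(side = 1, at = c(0, .3, .6), col.axis = "gray", col = "gray")
}
```

With this pattern, adding or dropping a variable changes the plot without any further edits to the loop.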

JaesaR commented 4 years ago

I am having trouble with the Sankey Transition Plot in step 4 of Part 2. When I enter the following code:

# Sankey Transition Plot
trn_mtrx <-
  with(TransDF,
   table(PredCluster, 
         cluster))

library(Gmisc)
transitionPlot(trn_mtrx, 
           type_of_arrow = "gradient")

It returns with:

"The minimum width reached and the arrow at box no. '3' to no. '2' will not be shown. This is due to the fact that the lwd will generate a falsely strong arrow. The minimum width reached and the arrow at box no. '3' to no. '4' will not be shown. This is due to the fact that the lwd will generate a falsely strong arrow. The minimum width reached and the arrow at box no. '3' to no. '2' will not be shown. This is due to the fact that the lwd will generate a falsely strong arrow. The minimum width reached and the arrow at box no. '3' to no. '4' will not be shown. This is due to the fact that the lwd will generate a falsely strong arrow. The minimum width reached and the arrow at box no. '4' to no. '1' will not be shown. This is due to the fact that the lwd will generate a falsely strong arrow. The minimum width reached and the arrow at box no. '4' to no. '1' will not be shown. This is due to the fact that the lwd will generate a falsely strong arrow."

How do I fix this?

castower commented 4 years ago

Did anyone else's clusters for the example part produce graphs in a different order than those on the HTML file? Just reading the .rmd file, I was confused by the labels, but after looking at the .html file, I realized they matched there. I didn't make any changes to my code, so I'm curious if the clusters just randomly change order?

Thanks! Courtney

castower commented 4 years ago

I'm getting the exact same error that @JaesaR reported above for the Sankey Transition Plot in step 4 of Part 2. I've run it a couple of times and also restarted R, and the error remains.

@Anthony-Howell-PhD

Edit: misspelled error

etbartell commented 4 years ago

@Anthony-Howell-PhD I'm getting a couple of error messages when I try running the .rmd file with my new code in place. I tried it again with just the original file and got the same error. I tried restarting R and reloading the packages. Here is what it prints at the end:

"Error: pandoc document conversion failed with error 11
In addition: Warning messages:
1: funs() is soft deprecated as of dplyr 0.8.0
Please use a list of either functions or lambdas:

  # Simple named list:
  list(mean = mean, median = median)

  # Auto named with tibble::lst():
  tibble::lst(mean, median)

  # Using lambdas
  list(~ mean(., trim = .2), ~ median(., na.rm = TRUE))

This warning is displayed once per session.
2: In readRDS(gzcon(url(URL))) : strings not representable in native encoding will be translated to UTF-8
3: In readRDS(gzcon(url(URL))) : input string 'Doña Ana County' cannot be translated to UTF-8, is it valid in 'UTF-8' ?
Execution halted"

castower commented 4 years ago

Hello all, to clarify we do not need to use the log function for the median household income in the second data set, correct?

AntJam-Howell commented 4 years ago

@JaesaR Is the Sankey figure still produced, or is no figure produced at all? If it is just a warning and the figure appears, you do not need to worry about it. If there is no figure, only an error, does the same error arise when you knit the original .rmd file, or does it arise only after you make your own code changes?

AntJam-Howell commented 4 years ago

@castower I want to ask you the same questions that I asked @JaesaR above.

AntJam-Howell commented 4 years ago

@etbartell can you knit the original .rmd file?

castower commented 4 years ago

@Anthony-Howell-PhD I did not try to knit the file, I just ran it in RStudio (where it would not produce a figure and gave me errors instead). However, the figure does appear in the knitted file. Thanks!

AntJam-Howell commented 4 years ago

Ok, sounds good. @JaesaR how about you? Can you knit the file and get the figure?

RickyDuran commented 4 years ago

@Anthony-Howell-PhD Hello, I am now having the same problem as Jaesa and Castower when trying to knit at Part 1 Step 3, although in my case it is not letting me knit the file at all.

AntJam-Howell commented 4 years ago

@RickyDuran Jaesa and Castower were having problems with the Sankey plot. Step 3 of Part 1 is about presenting regression results with stargazer. Is the stargazer step the problem? Please show reproducible code and the error you receive.

RickyDuran commented 4 years ago

I actually just got it to knit. I still have the error when knitting at Part 1 Step 3. I am not sure if that tells you anything about the other problems:

Output created: lab-06-duran.html
Warning messages:
1: funs() is soft deprecated as of dplyr 0.8.0
Please use a list of either functions or lambdas:

  # Simple named list:
  list(mean = mean, median = median)

  # Auto named with tibble::lst():
  tibble::lst(mean, median)

  # Using lambdas
  list(~ mean(., trim = .2), ~ median(., na.rm = TRUE))

This warning is displayed once per session.
2: In readRDS(gzcon(url(URL))) : strings not representable in native encoding will be translated to UTF-8
3: In readRDS(gzcon(url(URL))) : input string 'Doña Ana County' cannot be translated to UTF-8, is it valid in 'UTF-8' ?

AntJam-Howell commented 4 years ago

@RickyDuran ok sounds good. It’s a warning message due to package updates and some functions being deprecated. No need to worry about the warning message as long as it knits.
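For anyone who would rather silence the funs() warning than ignore it, the warning text itself suggests the replacement. Below is a minimal sketch, assuming dplyr 1.0 or later (needed for across() and all_of()). The data frame `df` and its values are invented stand-ins for the lab's census data, not the real thing.

```r
library(dplyr)

# Toy data; in the lab this would be the census data frame with a cluster column.
df <- data.frame(cluster     = c(1, 1, 2, 2),
                 Poverty10   = c(0.1, 0.3, 0.5, 0.7),
                 Pop.Unemp10 = c(0.2, 0.4, 0.6, 0.8))

keep.these <- c("Poverty10", "Pop.Unemp10")

# summarise(across(...)) replaces the deprecated summarise_each(funs(mean)).
stats <- df %>%
  select(cluster, all_of(keep.these)) %>%
  group_by(cluster) %>%
  summarise(across(everything(), mean))

print(stats)
```

Selecting the grouping column explicitly before group_by() also avoids the "Adding missing grouping variables" message that select() prints on grouped data.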

castower commented 4 years ago

Hello all, I'm a bit confused on how to read the Sankey Chart for the example.

My output after ordering by average initial house price:

2 | 116706.6 | 0.004294060 |  
3 | 120505.8 | 0.008834536 |  
4 | 138717.0 | 0.007864535 |  
1 | 210917.5 | 0.005460391

The sankey then has 4 listed as "low", 1 as "med-low", 2 as "med-high" and 3 as "high", but I would think based on the numbers that 2 would be low, 3 would be med-low, 4 would be med-high, and 1 would be high.

Am I missing something in my interpretation?

Thanks!

@Anthony-Howell-PhD

AntJam-Howell commented 4 years ago

@castower Nice catch and thanks for pointing that out. Indeed the ranking should be as you suggest: 2 would be low, 3 would be med-low, 4 would be med-high, and 1 would be high.
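The corrected ranking can be reproduced from the averages posted above rather than assigned by eye. This is only a sketch: the four prices are copied from @castower's output, while the object names (`avg.price`, `cluster.labels`) are hypothetical.

```r
# Average initial house price per cluster, as posted in the thread.
avg.price <- c("1" = 210917.5, "2" = 116706.6, "3" = 120505.8, "4" = 138717.0)

labels <- c("low", "med-low", "med-high", "high")

# Sort clusters from cheapest to most expensive, then attach labels in that order.
ranked <- names(sort(avg.price))
cluster.labels <- setNames(labels, ranked)

print(cluster.labels)
```

Because the labels follow the sorted prices, they stay correct even if the clustering step numbers the groups differently on another run.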

castower commented 4 years ago

Thanks for the clarification @Anthony-Howell-PhD!

AntJam-Howell commented 4 years ago

That output looks good to me.

On Wed, Nov 27, 2019 at 3:34 PM hos1995 wrote:

I'm having trouble with stargazer and its output. I've checked my code and it seems to work but I don't get a clean output. Here is my code:

library(stargazer)
stargazer(Regression1, Regression2, Regression3, title="Sociodemographic Change Effect by Housing Prices",type='html',align=TRUE)

Here is my output:

Sociodemographic Change Effect by Housing Prices
Dependent variable: House.Price.Change

Variable | (1) | (2) | (3)
Poverty | -0.225 (0.010) | -0.189 (0.010) | -0.203 (0.015)
Pop.Unemp | -0.001 (0.020) | 0.072 (0.021) | 0.101 (0.021)
Pop.Prof | -0.195 (0.011) | -0.070 (0.013) | 0.164 (0.015)
Pop.Manufact | | 0.458 (0.012) | 0.481 (0.012)
Poverty.White | | 0.125 (0.012) | 0.176 (0.018)
Female.LaborForce | | -0.135 (0.021) | -0.237 (0.022)
Pop.SelfEmp | | | -0.354 (0.020)
Foreign.Born | | | -0.258 (0.009)
Veteran | | | 0.031* (0.017)
Pop.Black | | | 0.141 (0.012)
Pop.Hispanic | | | 0.204 (0.017)
Constant | 0.748 (0.005) | 0.668 (0.008) | 0.683 (0.009)
Observations | 71,413 | 71,413 | 71,413
R2 | 0.009 | 0.033 | 0.054
Adjusted R2 | 0.009 | 0.033 | 0.054
Residual Std. Error | 0.249 (df = 71409) | 0.246 (df = 71406) | 0.244 (df = 71401)
F Statistic | 225.557 (df = 3; 71409) | 405.254 (df = 6; 71406) | 370.286 (df = 11; 71401)
Note: p<0.1; p<0.05; p<0.01

-- Anthony Howell Asst. Prof. in Public Policy School of Public Affairs Arizona State University Faculty Profile https://isearch.asu.edu/profile/3501621 (CV https://www.dropbox.com/s/b1pxccpwxm6fats/Howell.CV.pdf?dl=0)

hos1995 commented 4 years ago

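The output works when I embed it into a generator, but when I knit, it only displays what looks like the following: [image: image] https://user-images.githubusercontent.com/54339972/69764306-1d3da080-112d-11ea-9177-27f8b1bf4fc7.png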

So overall my code is working fine but I can't find a way for the visual to display correctly in my html document.

AntJam-Howell commented 4 years ago

Do you have results='asis' included in the R chunk header, as in the lab exercise?
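For reference, here is roughly what such a chunk could look like. It is a sketch only: the chunk header is shown as a comment, the built-in mtcars data stands in for the lab's census data, and the model names `m1` and `m2` are hypothetical.

```r
# Chunk header in the .Rmd, shown here as a comment: {r, results='asis', message=FALSE}
library(stargazer)

# Built-in mtcars data stands in for the lab's census data.
m1 <- lm(mpg ~ wt, data = mtcars)
m2 <- lm(mpg ~ wt + hp, data = mtcars)

# type = "html" emits raw <table> markup; results='asis' tells knitr to pass
# that markup through untouched instead of printing it as console output.
stargazer(m1, m2, type = "html", title = "Toy example")
```

Without results='asis', the knitted page shows the HTML source of the table as text, which matches the symptom described above.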

-- Anthony Howell School of Public Affairs Arizona State University (W) www.tonyjhowell.com

hos1995 commented 4 years ago

That worked! Thank you so much!

Jigarci3 commented 4 years ago

Hi All, I am currently stuck on Part 1, step 1 and I am hoping someone can point me in the right direction.

This is my code

library(plyr)
census.dats<-ddply(census,"TRTID10",summarise, 
  HousePriceChange = Median.HH.Value00/(Median.HH.Value10+1),
  ForeignBornPopChange= Foreign.Born00/(Foreign.Born10+1),
  ImmigrationChange= Recent.Immigration00/(Recent.Immigration10+1),
  PoorEnglishChange= Poor.English00/(Poor.English10+1),
  VeteranChange= Veteran00/(Veteran10+1),
  PovertyChange= Poverty00/(Poverty10+1), 
  WhitePovertyChange= Poverty.White00/(Poverty.White10+1),
  HispanicPovertyChange= Poverty.Hispanic00/(Poverty.Hispanic10+1),
  BlackPopChange= Pop.Black00/(Pop.Black10+1),
  HispanicPopChange= Pop.Hispanic00/(Pop.Hispanic10+1),
  UnemploymentChange= Pop.Unemp00/(Pop.Unemp10+1),
  ManufacturingPopChange= Pop.Manufact00/(Pop.Manufact10+1),
  SelfEmployedPopChange= Pop.SelfEmp00/ (Pop.SelfEmp10+1),
  ProfessionalPopChange= Pop.Prof00/(Pop.Prof10+1),
  FemaleLaborForceChange= Female.LaborForce00/(Female.LaborForce10+1))

I am getting the following error: Error in FUN(X[[i]], ...) : object 'TRTID10' not found.

I know I am missing loading something but not sure where...

castower commented 4 years ago

I believe in your first line of code, you need to change the "census" to "census.dats" after ddply
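A quick way to test that suggestion is to check which data frame actually carries the grouping column before calling ddply. The sketch below uses invented stand-ins: which of the two real objects contains TRTID10 is an assumption here, not something confirmed in the thread.

```r
# Hypothetical stand-ins: in this toy setup only census.dats has the tract ID column.
census      <- data.frame(tractid = 1:3, Poverty00 = c(0.10, 0.20, 0.30))
census.dats <- data.frame(TRTID10 = 1:3, Poverty00 = c(0.10, 0.20, 0.30))

# For each candidate data frame, report whether the grouping column exists.
has.id <- sapply(list(census = census, census.dats = census.dats),
                 function(df) "TRTID10" %in% names(df))

print(has.id)
```

Whichever object reports TRUE is the one to pass as the first argument.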

etbartell commented 4 years ago

I'm trying to do the cluster analysis for the 2010 data but it keeps telling me that I have "undefined columns selected". Here is my code:

library(mclust)

census.dats <- na.omit(census.dats)

keep.these <- c("Foreign.Born10", "Recent.Immigrant10", "Poor.English10", "Veteran10", "Poverty10", "Poverty.Black10", "Poverty.White10", "Poverty.Hispanic10", "Pop.Black10", "Pop.Hispanic10", "Pop.Unemp10", "PopManufact10", "Pop.SelfEmp10", "Pop.Prof10", "Female.LaborForce10")

#Run Cluster Analysis
mod2 <- Mclust(census.dats[keep.these], 
               G=4) # Set groups to 4, but you can remove this to let r split data into own groupings

#Add group classification to df
census.dats$cluster <- mod2$classification

The error message says: "Error in [.data.frame(census.dats, keep.these) : undefined columns selected"

Has anyone else had this issue or know how to get around it?

Jigarci3 commented 4 years ago

Hello, I am having similar issues as @RickyDuran and @hos1995 for Part 1, Step 3 with the stargazer.

I combined step 2 and 3 as in the original lab and this is my code

regression1 <-lm(HousePriceChange ~ ForeignBornPopChange + PovertyChange + UnemploymentChange, data = census.dats)

regression2 <-lm(HousePriceChange ~ VeteranChange + PovertyChange + PoorEnglishChange + ManufacturingPopChange, data = census.dats)

regression3<-lm(HousePriceChange ~ ForeignBornPopChange + ImmigrationChange + PoorEnglishChange + VeteranChange + PovertyChange + WhitePovertyChange + HispanicPovertyChange + BlackPopChange + HispanicPopChange + UnemploymentChange + ManufacturingPopChange + SelfEmployedPopChange + ProfessionalPopChange + FemaleLaborForceChange, data = census.dats )

library(stargazer)
stargazer(regression1, regression2, regression3, title= "Effect of Changes in Demographics on Housing Prices",type='html',align=TRUE) 

I made sure results='asis' is included in the R chunk header as well. I receive the following error when I try to knit:

Error in lm.fit(x, y, offset, singular.ok = singular.ok, ...) : 0 (non-NA) cases
Calls: ... withCallingHandlers -> withVisible -> eval -> eval -> lm -> lm.fit
In addition: Warning message:
funs() is soft deprecated as of dplyr 0.8.0
Please use a list of either functions or lambdas:

  # Simple named list:
  list(mean = mean, median = median)

  # Auto named with tibble::lst():
  tibble::lst(mean, median)

  # Using lambdas
  list(~ mean(., trim = .2), ~ median(., na.rm = TRUE))

This warning is displayed once per session.
Execution halted

etbartell commented 4 years ago

@Jigarci3 I had a similar issue with the stargazer step, but when I added message=F, warning=F to the chunk header it ran fine.

castower commented 4 years ago

@etbartell I think your error is that PopManufact10 is missing a period (should be Pop.Manufact10). It's tricky to put all the periods in the right places, I missed one a couple of times too and that fixed it.
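Typos like this are easier to spot by asking R which requested names are absent from the data frame. A minimal sketch, with a two-column toy data frame standing in for the real census.dats and `missing.cols` as a hypothetical helper name:

```r
# Toy stand-in for census.dats; only the column names matter here.
census.dats <- data.frame(Pop.Manufact10 = c(0.1, 0.2),
                          Pop.Prof10     = c(0.3, 0.4))

# One name is missing its period, as in the post above.
keep.these <- c("PopManufact10", "Pop.Prof10")

# Names requested but not present in the data frame.
missing.cols <- setdiff(keep.these, names(census.dats))
print(missing.cols)
```

An empty result means every name in keep.these matches a column, so the "undefined columns selected" error must come from somewhere else.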

AntJam-Howell commented 4 years ago

Error in lm.fit (x,y, offset, singular.ok= singular.ok, ..

This error indicates a problem with the regression model you are trying to estimate. One of the variables in your regression may contain only NAs. Run the summary command on your data frame and check whether that is the case.
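Besides summary(), the check can be made explicit by counting non-missing values per column and complete rows overall. The sketch below uses a small invented data frame in which one variable is entirely NA. The column names echo the ones in the thread, but the values are made up.

```r
# Toy data: PovertyChange is all NA, so no row is complete.
census.dats <- data.frame(
  HousePriceChange   = c(1.1, 0.9, 1.2),
  PovertyChange      = c(NA_real_, NA_real_, NA_real_),
  UnemploymentChange = c(0.5, NA, 0.7)
)

# Non-missing observations per column; a zero flags an all-NA variable.
non.na.counts <- colSums(!is.na(census.dats))
all.na.vars   <- names(non.na.counts)[non.na.counts == 0]

# Rows usable by lm() when every column enters the model.
n.complete <- sum(complete.cases(census.dats))

print(non.na.counts)
print(all.na.vars)
print(n.complete)
```

lm() drops any row with a missing value in a model variable, so zero complete rows produces the "0 (non-NA) cases" message even if no single column is entirely NA.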

etbartell commented 4 years ago

@castower Wow I can't believe I didn't catch that. Thanks for the extra pair of eyes!

hos1995 commented 4 years ago

I'm getting the same error in the same place, and I'm pretty sure my keep.these variable is correct. Is there anything else that might be the issue? `library(mclust)

keep.these <-c("Foreign.Born10, Recent.Immigrant10, Poor.English10, Veteran10, Poverty10, Poverty.Black10, Poverty.White10, Poverty.Hispanic10, Pop.Black10, Pop.Hispanic10, Pop.Unemp10, Pop.Manufact10, Pop.SelfEmp10, Pop.Prof10, Female.LaborForce10")

# Run Cluster Analysis

mod2 <- Mclust(census.dats[keep.these], G=5) # Set groups to 5, but you can remove this to let r split data into own groupings

summary(mod2, parameters = TRUE)

# Add group classification to df

census.dats$cluster <- mod2$classification`

castower commented 4 years ago

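@hos1995 Each of your variable names needs its own quotation marks (" "). Right now you have only one opening quote at the beginning and one closing quote at the end of the whole list, so R reads it as a single long string rather than 15 separate column names.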
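A quick way to see the difference is to check the length of each vector. This is a minimal sketch using three of the names from the thread; the object names `one.string`, `separate`, and `repaired` are made up for illustration.

```r
# One string that merely contains commas: a character vector of length 1
one.string <- c("Pop.Unemp10, Pop.Manufact10, Pop.SelfEmp10")
print(length(one.string))

# Each name quoted separately: a character vector of length 3
separate <- c("Pop.Unemp10", "Pop.Manufact10", "Pop.SelfEmp10")
print(length(separate))

# If retyping every quote is a chore, the long string can be split on
# commas and the leftover spaces trimmed
repaired <- trimws(strsplit(one.string, ",")[[1]])
print(identical(repaired, separate))
```

Indexing a data frame with the length-1 version asks for one column whose name is the entire comma-separated string, which is why R reports undefined columns.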



hos1995 commented 4 years ago

@castower Oh my gosh. That's how you know I've been staring at my code for too long. Thanks!

nbugliar commented 4 years ago

I'm trying to do the cluster analysis for the 2010 data but it keeps telling me that I have "undefined columns selected". Here is my code:

library(mclust)

census.dats <- na.omit(census.dats)

keep.these <- c("Foreign.Born10", "Recent.Immigrant10", "Poor.English10", "Veteran10", "Poverty10", "Poverty.Black10", "Poverty.White10", "Poverty.Hispanic10", "Pop.Black10", "Pop.Hispanic10", "Pop.Unemp10", "PopManufact10", "Pop.SelfEmp10", "Pop.Prof10", "Female.LaborForce10")

#Run Cluster Analysis
mod2 <- Mclust(census.dats[keep.these], 
               G=4) # Set groups to 5, but you can remove this to let r split data into own groupings

#Add group classification to df
census.dats$cluster <- mod2$classification

The error message says: "Error in [.data.frame(census.dats, keep.these) : undefined columns selected"

Has anyone else had this issue or know how to get around it?

I've combed over this again and again and still hit the same problem. Has anyone else had trouble with this or been able to get around it? @etbartell @Anthony-Howell-PhD

AntJam-Howell commented 4 years ago

@nbugliar This looks like the same problem @etbartell had, which @castower answered above: "I think your error is that PopManufact10 is missing a period (should be Pop.Manufact10). It's tricky to put all the periods in the right places, I missed one a couple of times too and that fixed it."
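Since this typo keeps coming up, it may help to let R suggest the closest real column name for a misspelled one. The following is a rough sketch, not part of the lab instructions: `cols` is a hypothetical stand-in for names(census.dats), and it uses `adist()` from the utils package to measure edit distance.

```r
# Hypothetical column names standing in for names(census.dats)
cols <- c("Pop.Unemp10", "Pop.Manufact10", "Pop.SelfEmp10", "Pop.Prof10")

# The misspelled name from the thread
typo <- "PopManufact10"

# adist() returns the edit distance from the typo to every column name
d <- utils::adist(typo, cols)

# The column with the smallest distance is the most likely intended name
suggestion <- cols[which.min(d)]
print(suggestion)
```

A distance of 1 means the two names differ by a single character, such as one missing period.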