DS4PS / cpp-529-master

Course files for CPP 529 Data Analytics Practicum focused on models of neighborhood change.
https://ds4ps.org/cpp-529-master/

Lab 02: Transforming Data - R gets stuck processing at mutate()? #29

Open swest235 opened 5 months ago

swest235 commented 5 months ago

@AntJam-Howell I'm running my code chunk in R to transform my data, but after waiting 10+ minutes, I get the results you see in the first screenshot attached here. After a while, it displays the results - not sure why those are displaying to begin with - and even then it still shows the code is running. I click stop and get the popup notice to terminate R. Could you please offer some assistance here? I am not sure why it seems to be getting stuck.

library(tidycensus)
library(tidyverse)
library(viridis)
library(dplyr)
census_api_key("ada48681b172e752512698d7c4a80ced317e9f34")

#Loading Variables: 
VarSearch <- load_variables(2017, "acs5", cache = TRUE)

head(VarSearch)

dat <- c(Median_Value = "B25097_001", 
          Median_Income = "B19013_001")
CenDF <- get_acs(geography = "county", 
                 year = 2017, 
                 survey = "acs5", 
                 variables = dat,
                 geometry = T)

head(CenDF)

CenDF <- CenDF %>% 
  mutate(variable=case_when(
    variable == "B25097_001" ~ "HouseValue",
    variable == "B19013_001" ~ "HHIncome")) %>% 
  select(-moe) %>% 
  spread(variable, estimate) %>% 
  mutate(HHInc_HousePrice_Ratio = round(HouseValue/HHIncome, 2))

image

image

image

swest235 commented 5 months ago

I think I figured this part out. I believe it was because I had previously assigned the codes to the dat variable. I removed that assignment and just plugged the codes into the get_acs() and it seems to be working just fine now.
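One possible reading of why dropping the named vector helped, offered as an assumption rather than a diagnosis: when `variables` is a named vector, tidycensus puts the names (here `Median_Value` and `Median_Income`) in the `variable` column instead of the codes, so a `case_when()` that matches on the codes never fires. The sketch below uses a made-up `toy` table in place of a real `get_acs()` call to show the effect.

```r
library(dplyr)

# Hypothetical stand-in for the long table returned when `variables`
# is a named vector: the names appear in place of the codes.
toy <- tibble(
  variable = c("Median_Value", "Median_Income"),
  estimate = c(150000, 55000)   # made-up values
)

# Same recoding as in the original chunk, matching on the raw codes.
recoded <- toy %>%
  mutate(variable = case_when(
    variable == "B25097_001" ~ "HouseValue",
    variable == "B19013_001" ~ "HHIncome"
  ))

# Neither condition matched, so case_when() returned NA for every row.
recoded$variable
```

If that is what happened, naming the vector `c(HouseValue = ..., HHIncome = ...)` up front would make the `case_when()` step unnecessary.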

I do, however, have two additional questions:

1) Why does using rev(order()) show NA for the top counties, but using arrange() doesn't? What is it about arrange() that filters out NAs when order() does not?

order() to get largest ratios: image

arrange() to get largest ratios: image

2) Part of my problem-solving for my original post involved using ChatGPT. It suggested that I use pivot_wider() instead of spread(), claiming that spread() was superseded by pivot_wider(). Are you familiar with why that might be, or what benefit one has over the other? Are there scenarios where one would be more useful? I ended up using pivot_wider() just because it seemed to work, so I didn't bother changing back to spread(). Any thoughts?
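Both questions can be tried out on a small example. The sketch below uses made-up data (the `long` table and its values are assumptions, not ACS output). It reshapes with `pivot_wider()`, which tidyr documents as the replacement for the superseded `spread()`, and then compares the sort orders. `order()` places `NA` last by default, so wrapping it in `rev()` moves `NA` to the front. `arrange()` always places `NA` last, even inside `desc()`, so nothing is filtered out; the missing rows are simply at the bottom.

```r
library(dplyr)
library(tidyr)

# Toy long-format data (made-up values); county "C" has no house value.
long <- tibble(
  NAME     = rep(c("A", "B", "C"), each = 2),
  variable = rep(c("HouseValue", "HHIncome"), times = 3),
  estimate = c(200000, 50000, 90000, 60000, NA, 40000)
)

# pivot_wider() does the job of spread(variable, estimate).
wide <- long %>%
  pivot_wider(names_from = variable, values_from = estimate) %>%
  mutate(ratio = round(HouseValue / HHIncome, 2))

# order() sorts NA last; reversing that index puts the NA row first.
top_rev <- wide$NAME[rev(order(wide$ratio))]

# arrange() keeps NA at the end, even with desc(); no rows are dropped.
top_arrange <- arrange(wide, desc(ratio))$NAME

# Base R equivalent that also keeps NA at the end.
top_order <- wide$NAME[order(wide$ratio, decreasing = TRUE)]

print(top_rev)
print(top_arrange)
print(top_order)
```

Using `order(..., decreasing = TRUE)` instead of `rev(order(...))` gives the same ranking as `arrange(desc(...))` in this example.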

swest235 commented 5 months ago

@AntJam-Howell I've knitted my file and I get these incremental loading images. They are just unsightly; is there a way to remove them? I didn't have these appear in my previous course. image

AntJam-Howell commented 5 months ago

Hi @swest235, glad you were able to resolve some of your earlier troubles. To hide the data-loading messages when knitting, you can turn off warning messages in the R chunk, something like:

```{r, echo=TRUE, eval=TRUE, warning=FALSE}

vars.to.use <- c( HHIncome = "B19049_001",
                  HouseValue = "B25077_001" )

CenDF <- get_acs( geography="county", 
                  year=2017, 
                  survey="acs5",
                  variables=vars.to.use,
                  geometry=TRUE,
                  shift_geo=TRUE )