Closed by huq23 1 month ago
Hi @huq23. I can read them in alright, but they are not formatted to be read by a computer; they are made to be read by a human as a final step. If you are downloading these from Social Explorer, be sure to pick the option that downloads them as a flat CSV file. I can't access Social Explorer right now to walk you through it, but it's the same process we used last term.
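To illustrate what "flat" means here, below is a minimal sketch, not the actual Social Explorer export. The file contents and column names (`state`, `year`, `total_pop`) are made up for illustration. The point is that a flat CSV has exactly one header row followed by one row per observation, with no title, subtotal, or footnote rows, so it reads in with a single call.

```r
library(readr)

# Hypothetical flat file: one header row, then one row per state.
# The values are placeholders, not real ACS figures.
flat_path <- tempfile(fileext = ".csv")
write_lines(
  c("state,year,total_pop",
    "Alabama,2017,100",
    "Alaska,2017,200"),
  flat_path
)

# A flat CSV needs no skipping of title rows or manual renaming of columns
acs <- read_csv(flat_path, show_col_types = FALSE)
print(acs)
```

A report-style spreadsheet, by contrast, usually needs rows skipped and headers rebuilt by hand before it can be used.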
Hello @huq23. I see that you are trying to use the data you already have, but this is leading to some really difficult and awkward code because these files are not meant for machine reading. Please remove these files and get data that is machine readable.
# packages used below; rvest supplies read_html()/html_nodes(), jsonlite supplies fromJSON()
library(tidyverse)
library(glue)
library(rvest)
library(jsonlite)

# replace this with your API key
api_key <- "PUT API KEY HERE"

# choose starting and ending year
start_year <- 2017
end_year <- 2019

# URL template; the {placeholders} are filled in by glue_data() below
url_base <- "https://api.usa.gov/crime/fbi/cde/estimate/state/{state_abbr}/{type}?from={start_year}&to={end_year}&API_KEY={api_key}"

# one row per combination of state (plus DC) and crime type
crime_data <- as_tibble(expand.grid(state_abbr = c(state.abb, "DC"),
                                    type = c("violent-crime", "property-crime")))
urls <- crime_data |> glue_data(url_base)

# request each URL and stack the results, one row per URL
rates <- map_dfr(urls, function(url) {
  # the JSON text is pulled from the <p> node that read_html() produces
  results <- (curl::curl(url) %>%
    read_html() %>%
    html_nodes("p") %>%
    html_text() %>%
    fromJSON())$results[1] %>%
    bind_rows()
  return(results)
})
crime_data <- bind_cols(crime_data, rates)

# now reshape
crime_data <- crime_data |>
  pivot_longer(cols = c(`2017`, `2018`, `2019`),
               names_to = "year", values_to = "rate")

# now reshape again to get violent and property on the same line
crime_data <- crime_data |>
  mutate(type = str_remove(type, "-crime")) |>
  pivot_wider(id_cols = c(state_abbr, year), names_from = type, values_from = rate)
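You will have to install and load the curl, glue, and stringr packages in addition to the usual tidyverse. The code also calls read_html() and html_nodes() from rvest and fromJSON() from jsonlite, so those need to be loaded as well.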
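The two reshaping steps can be checked without an API key. The sketch below runs the same `pivot_longer()` and `pivot_wider()` calls on a small made-up table. The numbers are placeholders, not real FBI rates, and the names `toy`, `toy_long`, and `toy_wide` are only for illustration.

```r
library(dplyr)
library(tidyr)
library(stringr)

# Made-up table in the same shape as the API results after bind_cols():
# one row per state and crime type, one column per year
toy <- tibble(
  state_abbr = c("AK", "AK", "AL", "AL"),
  type       = c("violent-crime", "property-crime",
                 "violent-crime", "property-crime"),
  `2017`     = c(1, 2, 3, 4),
  `2018`     = c(5, 6, 7, 8)
)

# Step 1: year columns become rows (one row per state, type, and year)
toy_long <- toy |>
  pivot_longer(cols = c(`2017`, `2018`),
               names_to = "year", values_to = "rate")

# Step 2: crime types become columns (one row per state and year)
toy_wide <- toy_long |>
  mutate(type = str_remove(type, "-crime")) |>
  pivot_wider(id_cols = c(state_abbr, year),
              names_from = type, values_from = rate)

print(toy_wide)
```

The result has one row per state-year with separate `violent` and `property` columns, which should be a convenient shape for joining to other state-year data.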
@AaronGullickson: Dear Professor, thank you for the code to help modify the crime data. For the Social Explorer data, since I am focusing on the years 2017 to 2019, I have downloaded the data for 2017, 2018, and 2019 individually. I have also simplified my earlier code and joined the data accordingly. Could you please check my code and let me know if I am on the right track?
It looks much better. It looks like you are still using the xlsx results for the ACS rather than the more machine readable CSV results. At this point, I wouldn't worry about it, but you have done more work than you should need to do to make that data format work for you.
@AaronGullickson Professor, I am facing a problem while reading the data in R in the organize_data.qmd file. I do not know why I am unable to read the data using the read_xlsx function. Could you please help me with this?
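The cause is not stated in the thread, so the following is only a sketch of two checks worth trying first, under assumptions. The path `data/acs_2017.xlsx` and the helper `check_xlsx()` are hypothetical. Two common causes are a relative path that does not resolve from the folder containing the .qmd file, and readxl not being loaded (`library(tidyverse)` does not attach it).

```r
# Hypothetical path; replace with the real location of the file
acs_path <- file.path("data", "acs_2017.xlsx")

# Hypothetical helper: reports the first problem it finds, or "OK"
check_xlsx <- function(path) {
  if (!file.exists(path)) {
    return(paste0("File not found: ",
                  normalizePath(path, mustWork = FALSE),
                  " (working directory is ", getwd(), ")"))
  }
  if (!requireNamespace("readxl", quietly = TRUE)) {
    return("The readxl package is not installed")
  }
  "OK"
}

message(check_xlsx(acs_path))

# If the check reports "OK", load readxl explicitly before reading:
# library(readxl)
# acs <- read_xlsx(acs_path)
```

If the message shows an unexpected working directory, the path in the read_xlsx call most likely needs to be written relative to the folder that holds organize_data.qmd.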