geanders / noaastormevents_paper


Consider adding a column for county vs. forecast zone reporting to table with numbers of events per year #7

Open geanders opened 4 years ago

geanders commented 4 years ago

We might want to discuss how the spatial scale of the information about an event varies by hazard type, because some event types are reported by county and others by forecast zone. It sounds like this also affects whether things like fatality counts are reported for the individual event or for the episode as a whole. In that case, it would be helpful to add a "reporting area" column to the table we have with the number of events by type in 2019.

Here's how I would suggest starting on that edit:

  1. Make a simple dataframe in R (by hand, rather than reading data in from somewhere else) that lists each event type and whether it is reported by forecast zone or by county. We have a table in the vignette that covers that, but I think this document says which should be used for each, based on a "C" (county) or "Z" (zone) after the name in the table of contents. I like using the tribble function (from the tibble package, and re-exported by dplyr) for making dataframes like this, because you can enter info by row instead of by column. For example, to do the first three event types, you could run:
library(dplyr)
event_reporting_areas <- tribble(
  ~ event_type, ~ reporting_area,
  "Astronomical Low Tide", "Forecast Zone", 
  "Avalanche", "Forecast Zone", 
  "Blizzard", "Forecast Zone"
)
  2. Once you have this separate dataset, you can use full_join from the dplyr package to merge it with the output of the code that we currently have for the table:
events_2019 %>%
  group_by(EVENT_TYPE) %>%
  summarize(N = n()) %>%
  arrange(desc(N)) %>%
  mutate(N = prettyNum(N, big.mark = ",")) %>%
  knitr::kable(col.names = c("Event type", "Number of events in 2019"))

You'd want to add the join in the pipe before getting to knitr::kable. It might be best to do it as early as right before the line that starts with group_by, because then I think we could also get rows for the event types with none reported in 2019, which would be nice (we may have to play around with the code some to get those "0" rows to show up). Then you'll need to edit the knitr::kable line to add a column name for "Reporting area" (or whatever we want to name that column).
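As a rough sketch of what that could look like (not tested against the real data), here is one way to join before group_by and still end up with "0" rows. It assumes a toy stand-in, events_toy, in place of events_2019, and it uses a helper column named observed that is purely illustrative: it marks rows that are real events so that event types brought in only by the join count as zero. The by argument is needed because the key column is named EVENT_TYPE in one dataframe and event_type in the other.

```r
library(dplyr)

# Hypothetical stand-in for events_2019: one row per reported event (toy data)
events_toy <- tribble(
  ~ EVENT_TYPE,
  "Avalanche",
  "Blizzard",
  "Blizzard"
)

# First three rows of the hand-made lookup table
event_reporting_areas <- tribble(
  ~ event_type, ~ reporting_area,
  "Astronomical Low Tide", "Forecast Zone",
  "Avalanche", "Forecast Zone",
  "Blizzard", "Forecast Zone"
)

event_counts <- events_toy %>%
  # Illustrative helper: flag real events so join-only rows can count as 0
  mutate(observed = 1L) %>%
  # Key columns have different names, so map one onto the other
  full_join(event_reporting_areas, by = c("EVENT_TYPE" = "event_type")) %>%
  group_by(EVENT_TYPE, reporting_area) %>%
  # Event types with no events have observed = NA, which sums to 0
  summarize(N = sum(observed, na.rm = TRUE), .groups = "drop") %>%
  arrange(desc(N))

# Format the counts and name all three columns in the printed table
event_counts %>%
  mutate(N = prettyNum(N, big.mark = ",")) %>%
  knitr::kable(col.names = c("Event type", "Reporting area",
                             "Number of events in 2019"))
```

Note that the join takes dataframes on both sides: the knitr::kable call stays at the very end of the pipe, because its output is a formatted table rather than a dataframe.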

theresekon commented 4 years ago

This is the code I have written for this so far:

library(dplyr)

event_reporting_areas <- tribble(
  ~ event_type, ~ reporting_area,
  "Astronomical Low Tide", "Forecast Zone", 
  "Avalanche", "Forecast Zone", 
  "Blizzard", "Forecast Zone",
  "Coastal Flood", "Forecast Zone",
  "Cold/Wind Chill", "Forecast Zone", 
  "Debris Flow", "County",
  "Dense Fog", "Forecast Zone", 
  "Dense Smoke", "Forecast Zone",
  "Drought", "Forecast Zone",
  "Dust Devil", "County",
  "Dust Storm", "Forecast Zone",
  "Excessive Heat", "Forecast Zone",
  "Extreme Cold/Wind Chill", "Forecast Zone",
  "Flash Flood", "County",
  "Flood", "County",
  "Freezing Fog", "Forecast Zone",
  "Frost/Freeze", "Forecast Zone",
  "Funnel Cloud", "County",
  "Hail", "County",
  "Heat", "Forecast Zone",
  "Heavy Rain", "County",
  "Heavy Snow", "Forecast Zone",
  "High Surf", "Forecast Zone",
  "High Wind", "Forecast Zone",
  "Hurricane/Typhoon", "Forecast Zone",
  "Ice Storm", "Forecast Zone",
  "Lakeshore Flood", "Forecast Zone",
  "Lake-Effect Snow", "Forecast Zone",
  "Lightning", "County",
  "Marine Dense Fog", "Marine Zone",
  "Marine Hail", "Marine Zone",
  "Marine Heavy Freezing Spray", "Marine Zone",
  "Marine High Wind", "Marine Zone", 
  "Marine Hurricane/Typhoon", "Marine Zone", 
  "Marine Lightning", "Marine Zone", 
  "Marine Strong Wind", "Marine Zone", 
  "Marine Thunderstorm Wind", "Marine Zone", 
  "Marine Tropical Depression", "Marine Zone", 
  "Marine Tropical Storm", "Marine Zone", 
  "Rip Current", "Forecast Zone", 
  "Seiche", "Forecast Zone", 
  "Sleet", "Forecast Zone",
  "Sneaker Wave", "Forecast Zone", 
  "Storm Surge/Tide", "Forecast Zone", 
  "Strong Wind", "Forecast Zone", 
  "Thunderstorm Wind", "County", 
  "Tornado", "County", 
  "Tropical Depression", "Forecast Zone", 
  "Tropical Storm", "Forecast Zone", 
  "Tsunami", "Forecast Zone", 
  "Volcanic Ash", "Forecast Zone", 
  "Waterspout", "Marine Zone", 
  "Wildfire", "Forecast Zone", 
  "Winter Storm", "Forecast Zone", 
  "Winter Weather", "Forecast Zone"
)

event_type_2019 <- events_2019 %>%
  group_by(EVENT_TYPE) %>%
  summarize(N = n()) %>%
  arrange(desc(N)) %>%
  mutate(N = prettyNum(N, big.mark = ",")) %>%
  knitr::kable(col.names = c("Event type", "Number of events in 2019"))

events_2019 %>%
  full_join(event_reporting_areas, event_type_2019, 
            copy = TRUE) %>% 
  group_by(EVENT_TYPE) %>%
  summarize(N = n()) %>%
  arrange(desc(N)) %>%
  mutate(N = prettyNum(N, big.mark = ",")) %>%
  knitr::kable(col.names = c("Event type", "Number of events in 2019"))

I'm getting multiple different error messages when trying to run this. When I run just

full_join(event_reporting_areas, event_type_2019, 
            copy = TRUE)

I get: "Error in as.data.frame.default(y) : cannot coerce class ‘"knitr_kable"’ to a data.frame"

When I run:

events_2019 %>%
  full_join(event_reporting_areas, event_type_2019, 
            copy = TRUE) %>% 
  group_by(EVENT_TYPE) %>%
  summarize(N = n()) %>%
  arrange(desc(N)) %>%
  mutate(N = prettyNum(N, big.mark = ",")) %>%
  knitr::kable(col.names = c("Event type", "Number of events in 2019"))

I get "Error: Join columns must be present in data."