geanders / noaastormevents_paper


Draft Hazards section #6

Open geanders opened 4 years ago

geanders commented 4 years ago

We've got some really nice content in the "hazard bias" section. Here's my advice for how we can rearrange things to give this section a good flow:

  1. Let's start out by defining what a hazard is. I think that there's been some controversy about this definition, so we should definitely cite a paper or book that gives the same definition we're using. We could check whether there's a definition of "hazard", for example, in the Gall et al. paper. Susan Cutter has also done some great research on hazards and may have a paper with a good definition. We could check this paper by her: https://www.pnas.org/content/105/7/2301.short. She has another paper that could also be helpful; it includes a bit of discussion of the controversy around the definition and might give some good leads to other related papers.
  2. This leads us nicely into describing the "event types" that show up in the Storm Events database. I think that these might line up well with different types of hazards. If so, we could explain that, and then also explain that, in a few cases, there are different "event types" for different intensities of the same hazard---for example, "Strong Wind" versus "High Wind" or "Cold/Wind Chill" versus "Extreme Cold/Wind Chill". We should give the total number of event types that can be recorded in NOAA Storm Events (it looks like there are 55). Then, we can have the table where we give the number of events reported by type in 2019. We can add a few sentences saying that, in 2019, [x] total events were reported, of [x] different event types. The most frequently reported event type was Thunderstorm Wind, with over twice as many reports as the second most frequent, Hail, which in turn had almost twice as many reports as the next most common few---Flood, Flash Flood, and Winter Weather. The least commonly reported events included Volcanic Ashfall, Sleet, Dense Smoke, Sneakerwave, Seiche, and Marine Tropical Depression. (As a note, Sleet might be a good example of hazard bias in terms of the number of events reported---it is definitely more common than the database suggests.) See the first sketch after this list for one way to tabulate these counts.
  3. Then, we can move into the idea of "episodes" in the database and have our discussion of that, along with some of the associated plots, like the distribution of events for the "top" episodes in 2019 and the clustering of event types within an episode. We should discuss how, in some cases, the events in an episode will all be the same type of hazard (e.g., cold), but at different intensities. In these cases, the collection of events might be one per county/zone across the affected area, capturing the spatial range and varying intensity of the hazard (this is likely why we see events with different intensities of the same hazard clustered together, like heat/excessive heat and cold/extreme cold). In other cases, a larger synoptic weather system might bring different hazards, so an episode will sometimes include different hazards (e.g., thunderstorm wind and hail, which cluster together). In these cases, we'd probably see many counties/zones in the affected area reporting more than one event for the episode. The second sketch after this list shows one way to quantify this clustering.
  4. We can then talk about the idea of "hazard bias", and how some types of hazards might have different reporting standards and reporting probabilities (i.e., the probability of an event making it into the database if it happened) than others. We might want to distinguish two pathways/implications of hazard bias. First, the probability that an event is recorded in the database at all may vary by event type. This means that, for certain hazard types, the database underreports events, and so you'd undercount them if you relied on the database. Second, the quality and quantity of information provided for a recorded event may vary by event type. For example, you might get much more reliable damage estimates for certain types of events than others, or longer and more helpful descriptions. The rest of the section can move into describing some mechanisms for how this bias can arise, with examples/illustrations from the 2019 data.
  5. One important example of a mechanism for hazard bias is that there's variation in who reports different types of events in this database. We can illustrate that with the table of the number of events reported by source of event report, as well as the figure with the numbers of reported events by source. We could add some discussion of these different sources in terms of calibration, agreement across reports (e.g., across monitoring equipment for an automated weather station system, versus across people for volunteer reports), and the likelihood of "catching" an event by source. For example, an automated system would probably "see" any event that happens at the spot where the monitor is located, while human reporting would miss anything that happens when no one is around or able to see it (e.g., at night, or in a sparsely populated area). We might want to examine how the number of reported events for an event type is associated with the proportion of events reported by an automated system (e.g., Mesonet, ASOS, AWOS, River/Stream Gage) versus by media or a person (e.g., media, 911 call center, Storm Chaser, Amateur Radio, Public); the third sketch after this list outlines one way to do this. It looks like some of our very rare events (Sneakerwave, for example) are only reported by sources like broadcast media, fire department/rescue, and newspaper.
  6. Another important mechanism for hazard bias might be the standards for when different types of hazards are reported. We have the frost examples related to this, although we probably want to save those for the temporal bias discussion and find a different example to use here. If you look up that event type in the manual, it gives more detail on when frost should be reported; I think it's only during the growing season. I think that there are also restrictions for other event types. For example, I believe every tornado is reported, but some event types are only reported if they cause deaths, injuries, or damage. We could look through the manual to find more examples, and perhaps compare the frequencies reported in Storm Events to how common the events are based on climatological-type studies. For example, we could try to find a source that says how often you'd really expect Sneakerwaves and Sleet to happen, and then see whether something in the Storm Events reporting standards explains why we only see a few in the database. By contrast, I'm guessing that the reporting standards might be more open for the events that show up a lot, like Thunderstorm Wind and Hail.
  7. If we can think of other mechanisms, we could add them here. One may be that detection technology differs by hazard---we may have a better process for identifying cases of one hazard/event type than another (this is similar to how changes in technology over time might lead to temporal bias). I wonder if some of our rare events (Sneakerwave, Seiche, Sleet) might be examples where we don't yet have great technology to detect them (or, even if we do have the technology, not a large enough network of monitors).
  8. We could end by discussing the implications that hazard bias might have for analysis. For example, would it only be an issue when comparing frequencies of different types of events? Could it lead to confounding in a single-hazard study of disaster impacts? It would certainly result in undercounts of certain types of hazards if you're trying to characterize how frequently a type of hazard tends to happen. Would there be implications from information bias, particularly measurement error? (We could save this paragraph to draft after we've drafted the rest of the section, and just put in a placeholder for now.)
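For the counts in item 2, here's a minimal sketch of how we could tabulate events by type. It assumes a local copy of the NOAA Storm Events "details" file for 2019 (the file name below is hypothetical; the published files carry a creation-date suffix) and the column names from the file specification, which we should verify against the actual file:

```python
import pandas as pd

# Hypothetical local copy of the NOAA Storm Events "details" file for 2019;
# the published files carry an extra creation-date suffix in the name.
events = pd.read_csv("StormEvents_details-ftp_v1.0_d2019.csv.gz")

# Total events and number of distinct event types reported in 2019
print(f"{len(events)} events of {events['EVENT_TYPE'].nunique()} types")

# Number of events reported by type, most frequent first
counts = events["EVENT_TYPE"].value_counts()
print(counts.head(10))  # e.g., Thunderstorm Wind and Hail at the top
print(counts.tail(10))  # rare types such as Sneakerwave at the bottom
```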
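For the clustering in item 3, one option (just a sketch, under the same file and column-name assumptions, with `EPISODE_ID` identifying episodes) is to build an episode-by-event-type indicator matrix and see which types co-occur within episodes:

```python
import pandas as pd

events = pd.read_csv("StormEvents_details-ftp_v1.0_d2019.csv.gz")

# Episode-by-event-type indicator matrix: True where an event type
# appears at least once within an episode
episode_types = (
    events.dropna(subset=["EPISODE_ID"])
          .groupby(["EPISODE_ID", "EVENT_TYPE"])
          .size()
          .unstack(fill_value=0)
          .gt(0)
)

# Pairwise correlation across episodes; large values flag event types
# that tend to be reported in the same episodes (e.g., Thunderstorm
# Wind with Hail, or Heat with Excessive Heat)
co_occurrence = episode_types.astype(float).corr()
print(co_occurrence["Hail"].sort_values(ascending=False).head())
```

The same matrix could feed the heatmap or hierarchical clustering for the episode plots.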
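For item 5, a sketch of the automated-versus-human comparison; the `SOURCE` labels in the set below are assumptions based on the source names mentioned above, and the exact strings should be checked against the data:

```python
import pandas as pd

events = pd.read_csv("StormEvents_details-ftp_v1.0_d2019.csv.gz")

# Assumed labels for automated sources in the SOURCE column; check the
# exact strings against the data before using this
automated = {"Mesonet", "ASOS", "AWOS", "River/Stream Gage"}

by_type = (
    events.assign(automated=events["SOURCE"].isin(automated))
          .groupby("EVENT_TYPE")
          .agg(n_events=("EVENT_ID", "size"),
               prop_automated=("automated", "mean"))
          .sort_values("n_events", ascending=False)
)

# Is the number of reported events for an event type associated with
# the proportion of its reports that come from automated systems?
print(by_type.head(10))
print(by_type["n_events"].corr(by_type["prop_automated"], method="spearman"))
```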

I recommend you look through the nice paragraphs you've drafted already and move them around to fit this framework, then add in any notes or snippets, and finally we can see where we need more and draft text to fill those spots.

theresekon commented 4 years ago

I have started sorting through these comments and copying in my previously written material on hazard bias. I put this in my draft document because the outline is getting very crowded.

I'll push up the most recent copy of this draft today too.