Update Social Cost of Health Impacts by Pollutant

robbieorvis commented 4 years ago

EPA released an updated set of benefit-per-ton estimates (prior to Trump admin) that updates prior estimates with new health, economic, and energy sector data. New benefits are significantly lower than before. New data structure is better aligned to model structure.

robbieorvis commented 4 years ago

Done in https://github.com/Energy-Innovation/eps-us/commit/2f7207938ccfe5fefec745b21ebd062002b5e19b

jrissman commented 4 years ago

Hi @robbieorvis - Thanks so much for updating this. I just wanted to check to see if you verified that the new EPA source uses the same Value of a Statistical Life (VSL) figure, in inflation-adjusted dollars, as the old benefit-per-ton numbers did. If the EPA chose to use a different VSL value, we have to update add-outputs/VoaSL to keep it in sync with this variable.

Also, I notice the Excel file contains an outdated note on the "About" tab that says:

Although the LULUCF sector doesn't emit the types of pollutants that are assigned health impacts in the source document above, we nonetheless include it here and produce a CSV output file, in case the LULUCF sector is changed in the future to add some of the pollutant types for which health impacts are calculated.

The LULUCF sector has emitted the types of pollutants that affect mortality for some time (e.g. ever since we updated to model things like wildfires and crop residue burning). They were being calculated correctly. The only problem is that the note on the "About" tab should be removed, as that LULUCF output sheet is strictly required, not just hedging against future model capabilities.

jrissman commented 4 years ago

Actually, you might want to use "area sources" instead of "non-road mobile sources" to represent LULUCF emissions.

robbieorvis commented 4 years ago

Hi Jeff, okay I will remove that text. I assigned LULUCF to non-road mobile, assuming most of it would be logging and other clearing equipment (by the way, if in the future we want more detailed mortality mapping, the new data is quite a bit more detailed than what we had before).

Regarding the Value of a Statistical Life, yes I verified it uses the same value. (see https://www.epa.gov/sites/production/files/2018-02/documents/sourceapportionmentbpttsd_2018.pdf for the full updated methodology from EPA)

robbieorvis commented 4 years ago

FYI, I also see the EPA has the full data for 2020 and 2025 in the referenced report, but I just used 2016 to 2030 and interpolated. I think it's okay, but we could include the additional data if desired.
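
For anyone reviewing the interpolation step, here is a minimal sketch of filling in the years between the 2016 and 2030 anchor values. It assumes the interpolation is linear (the comment above does not say which method was used), and the dollar values are placeholders, not EPA figures. The function and variable names are hypothetical.

```python
def interpolate_linear(year, year_start, value_start, year_end, value_end):
    """Linearly interpolate a value for `year` between two anchor years."""
    if not year_start <= year <= year_end:
        raise ValueError("year is outside the anchor range")
    fraction = (year - year_start) / (year_end - year_start)
    return value_start + fraction * (value_end - value_start)


# Placeholder benefit-per-ton values in dollars per ton (NOT EPA data).
BPT_2016 = 100_000.0
BPT_2030 = 170_000.0

# One interpolated value per model year from 2016 through 2030.
bpt_by_year = {
    year: interpolate_linear(year, 2016, BPT_2016, 2030, BPT_2030)
    for year in range(2016, 2031)
}
```

If the 2020 and 2025 values from the report were added later, the same function could be applied piecewise between each pair of adjacent anchor years.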

robbieorvis commented 4 years ago

One last note for now, aside from ironing out the area vs. non-road mobile sources question: we could also eventually add morbidity if we want, since it's included in the new dataset.

jrissman commented 4 years ago

I guess it's really split by policy. For a policy like afforestation/reforestation, it's probably mostly non-road mobile. But for a policy like "avoid peat fires," it's a change in an area source. But we don't break out these health multipliers by policy. So I think it's fine to go with whichever one.

Adding morbidity is definitely interesting. Is there any way to translate it into anything other than dollars, like QALYs or number of people sickened?

robbieorvis commented 4 years ago

FYI, apparently the benefit-per-ton values include morbidity, but mortality accounts for more than 98% of the dollar value. That means we may overstate premature deaths by up to about 2%. EPA notes that we should only be using two significant figures anyway, so this is within the noise.

I have to look more into your question on QALYs. To do that, we need to break out the incidence rate per ton instead of the cost per ton and add a bit more structure in the model I think. But, we should be able to ID some interesting things, avoided hospital visits, savings on healthcare, and perhaps QALYs.

robbieorvis commented 4 years ago

Okay, for morbidity, we have cost functions that are defined in Table 5-12 of this document: https://www3.epa.gov/ttn/ecas/regdata/RIAs/matsriafinal.pdf

I don't think we can use QALYs - we would need a lot more data to do that.

Using the data we now have we could do a few things:

1) Continue our current approach of using benefit-per-ton estimates to estimate social costs and deaths. We know these include about 2% non-mortality (i.e. morbidity) costs. We should probably address the over-precision in the model, since EPA specifically says to only use two significant figures when using this data.

2) Modify the current structure to use incidence-per-ton rates, which we now have, to get more precision on premature deaths.

3) If we go route 2, we could also add incidence rates for morbidity-related outcomes and quantify the impacts (maybe - some are stratified by age group). We could report things like change in hospital stays, cases of asthma, etc. The major downside here is that I think this data will be incredibly hard to find for other countries, so unless we intend to use BenMAP to find these outcomes in other regions, it may not make sense to do this, although some of the outcomes are quite interesting (but could also be done externally).

@jrissman - let me know what you think.

jrissman commented 4 years ago

I'd be very happy to extend the model structure to provide more detail on public health impacts. I think that's an area where the EPS's current outputs are meager and which could be a lot better.

It's true that a lot of that data won't be available for all regions, but since we can simply exclude particular output graphs for regions where the data aren't available, I don't really see that as a problem. It's not like energy use data, which we truly need for all regions. It's just an output metric, so it's easy to omit wherever we don't have it.

On (2), when you say we could get "more precision on premature deaths," do you mean breaking down avoided premature deaths by cause of death, by age group, by race, by income level, or by some other metric? We could display any of these as a stacked area graph instead of (or in addition to) the current line graph, which just shows avoided deaths all lumped together. Since our model is geographically aggregated, I'd feel more comfortable with a breakdown by cause of death (since presumably people everywhere will react roughly similarly to air pollution) than with a breakdown by race or income (because different places have dramatically different percentages of people of different races and income levels, and race/income is correlated with pollutant exposure). My comfort with an age breakout falls somewhere in the middle - I think age distribution varies geographically within the U.S. (for instance, Florida has more older people) but not so much as does race or income, so my comfort with data accuracy would be something like: cause of death > age > race or income.

For premature deaths, I think it makes sense to reduce the benefit-per-ton values by the 2% you found. I know it's within the margin of error, but if we know that they include 2% morbidity, we ought to make the adjustment to fix it (and for anyone who reviews the Excel file, seeing us make the adjustment will preempt a question about whether we adjusted for morbidity impacts).
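
A rough sketch of that adjustment, assuming the current approach of converting monetized benefits to deaths by dividing by the VSL. The 2% share is the approximate figure from this thread; the tonnage, benefit-per-ton, and VSL inputs in the example are placeholders, and the function names are hypothetical.

```python
# Approximate share of benefit-per-ton dollars that is morbidity, per this thread.
MORBIDITY_SHARE = 0.02


def mortality_only_bpt(bpt_total, morbidity_share=MORBIDITY_SHARE):
    """Remove the morbidity share from a total benefit-per-ton value."""
    return bpt_total * (1.0 - morbidity_share)


def avoided_deaths(tons_avoided, bpt_total, vsl):
    """Convert avoided emissions to avoided premature deaths via the VSL."""
    return tons_avoided * mortality_only_bpt(bpt_total) / vsl


# Placeholder inputs (NOT EPA data): 1,000 tons avoided,
# $100,000 per ton, and an $8 million VSL.
example_deaths = avoided_deaths(1_000, 100_000.0, 8_000_000.0)
```

Without the adjustment, the same inputs would give 12.5 deaths instead of roughly 12.25, which is the 2% overstatement being discussed.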

On your (3), if morbidity impacts are in the noise, is a breakout of morbidity impacts by type meaningful? I guess it is meaningful, because the only reason it was a small percentage of overall monetized health impacts was because of the high VSL assigned to premature deaths. It doesn't mean we can't meaningfully link air pollution to other health outcomes. Would we just be providing multipliers that convert emissions of different types to different morbidity outcomes (non-fatal conditions), similarly to how we do today with premature mortality?

The number of significant figures can be controlled in Vensim through the Quantization Size variables. We have a "Quantization Size of Human Lives" variable, which is currently hard-coded in Vensim as 1, since I never anticipated anyone would want to change it. Quantization rounds down to the next multiple of the quantization size, so if we set it to 100 and the model estimates 99 avoided deaths, this will show up as zero. My feeling is that the web app should be a window into the raw calculated output, and reducing the significant figures is something we should do in articles and reports (e.g. we could write "about 20,000 avoided deaths per year by 2050"), but not in the web interface itself. It's not as if mortality impacts are unique in this way - we don't precisely know the uncertainty associated with any of the emissions or financial projections, so we round them all based on the magnitude of the Y axis for nice graph display, rather than reflecting known uncertainties in the underlying data. I think this could be a real can of worms and offers low value for our time, so I'd rather leave the job of deciding how to round the data to folks that are using our data, rather than do it ourselves in the web interface.
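
To make the round-down behavior concrete, here is a small Python sketch of quantization as described above, for non-negative values. It is an illustration only, not Vensim code, and the `quantize` name is hypothetical.

```python
import math


def quantize(value, quantum):
    """Round a non-negative value down to a multiple of `quantum`."""
    if quantum <= 0:
        # Treat a non-positive quantum as "no quantization" in this sketch.
        return value
    return quantum * math.floor(value / quantum)


# With a quantum of 100, an estimate of 99 avoided deaths collapses to zero.
small_estimate = quantize(99, 100)
# With the current quantum of 1, whole-number estimates pass through unchanged.
unchanged = quantize(20_345, 1)
```

This is why a large quantization size is a poor substitute for rounding to significant figures: it truncates rather than rounds, and it hides small but real results.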

robbieorvis commented 4 years ago

Regarding more precision, I'm mainly referring to the fact that the new dataset has much finer detail on risk-per-ton estimates in terms of the sources of emissions:

Locomotives and marine vessels
Area sources
Cement kilns
Coke ovens
Electric arc furnaces
Electricity Generating Units
Ferroalloy facilities
Industrial point sources
Integrated iron and steel facilities
Iron and Steel
Non-road mobile sources
Ocean-going vessels
On-road mobile sources
Pulp and paper facilities
Refineries
Residential wood combustion
Taconite mines

We could map these more finely onto emissions sources in the EPS as a starting point, which would improve precision.

Separately, we now have the separate incidence rate by pollutant by sector (aside from just the monetary damages) so we could use the incidence rates directly and calculate the economic impact directly downstream from that. That would also, for example, let us look specifically at reduced mortality from changes in the technology/energy mix in certain sectors (maybe unnecessary, but possible). This would also let us separate out mortality from morbidity, which would make me slightly more comfortable about how we present that data.

Then, we also have incidence rates by sector and pollutant for morbidity, broken out many ways:

Respiratory emergency room visits
Acute bronchitis
Lower respiratory symptoms
Upper respiratory symptoms
Minor Restricted Activity Days
Work loss days
Asthma exacerbation
Cardiovascular hospital admissions
Respiratory hospital admissions
Non-fatal heart attacks (Peters)
Non-fatal heart attacks (All others)

So any of these could now be calculated at the technology/pollutant/sector level and reported as an output. Converting to costs can be done using the data in the previously linked URL (this is what is most likely to vary country to country).

When I say in the noise, what I mean is that the economic impact is in the noise (e.g. 2% of total benefits including premature mortality) but I still think the numbers themselves might be meaningful. Partly this is due to the differences in costs... a hospital visit might be $100 but an avoided death is $8 million, so you need 80,000 avoided hospital visits to arrive at the same savings as one avoided death. But 80,000 avoided hospital visits is obviously very compelling on its own!

Here is EPA's precise text on rounding and sig-figs... I'll leave it to you to decide what makes sense:

When using these benefit per ton estimates in analyses, care should be taken to not overstate the accuracy of the total benefits estimates or estimates of avoided incidence. For this reason, it is EPA practice to round total benefits estimates to two significant digits and to round estimates of avoided incidence to the nearest whole number.
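
As a sketch of what that rounding practice looks like when applied to reported figures (for example in an article or report), the helpers below round a total to two significant digits and an incidence count to the nearest whole number. The function names are hypothetical and the inputs in the example are placeholders.

```python
import math


def round_sig(value, digits=2):
    """Round `value` to the given number of significant digits."""
    if value == 0:
        return 0.0
    exponent = math.floor(math.log10(abs(value)))
    return round(value, digits - 1 - exponent)


def round_incidence(value):
    """Round an incidence estimate to the nearest whole number.

    Note: Python's round() sends exact .5 ties to the nearest even integer.
    """
    return round(value)


# Placeholder figures (NOT model output).
total_benefits = round_sig(1_234_567.0)   # two significant digits
avoided_cases = round_incidence(12.6)     # nearest whole number
```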

After reviewing everything, my proposed updates would be:

1) Switch to using incidences, not $ damages/ton, to estimate impact to premature mortality
2) Increase accuracy of estimates by applying more detailed incidence rates based on data availability:
   a) Transport: different vehicle types
   b) Electricity: same values across plant types
   c) Industry: sector-specific data where possible, otherwise choose between area and industrial point sources
   d) Buildings: area sources (there is a breakout for residential wood burning but I don't think we need to worry about that)
3) Then, calculate monetized benefits, where we can sum avoided premature deaths and multiply by VSL
4) Estimate morbidity outcomes using similar data. We could also do this with the same level of accuracy as with mortality.
5) (optional) Estimate monetized morbidity outcomes. This is a bit trickier because of the age distribution issue. It is also likely to be a tiny fraction of total social benefits (<2% compared to premature mortality, and probably <1% when adding in carbon benefits), so I don't really think it's worth doing. The more impactful numbers are the actual avoided morbidity outcomes.
6) Add new capability to estimate morbidity impacts using incidence rates
7) Use VSL value in model to convert
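
The core of the proposal (incidence first, monetization second) can be sketched roughly as follows. This is an illustration under stated assumptions, not the model's actual structure: the incidence rates, tonnages, and VSL are placeholders rather than EPA values, and all names are hypothetical.

```python
# Hypothetical incidence rates: premature deaths per ton of pollutant,
# keyed by (emissions source, pollutant). Placeholder values, NOT EPA data.
DEATHS_PER_TON = {
    ("On-road mobile sources", "PM2.5"): 0.05,
    ("Electricity Generating Units", "SO2"): 0.005,
}


def avoided_deaths_by_source(tons_avoided):
    """Multiply avoided tons by the matching incidence rate for each key."""
    return {
        key: tons * DEATHS_PER_TON[key]
        for key, tons in tons_avoided.items()
    }


def monetized_benefit(deaths_by_source, vsl):
    """Sum avoided deaths across sources, then apply the VSL."""
    return sum(deaths_by_source.values()) * vsl


# Placeholder scenario: tons of each pollutant avoided, by source.
tons = {
    ("On-road mobile sources", "PM2.5"): 200.0,
    ("Electricity Generating Units", "SO2"): 1_000.0,
}
deaths = avoided_deaths_by_source(tons)
benefit = monetized_benefit(deaths, vsl=8_000_000.0)
```

Keeping the per-source dictionary around before summing is what would make a breakdown of mortality by emitting sector possible downstream.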

jrissman commented 4 years ago

Okay, I see. I do think mapping emissions sources onto mortality/morbidity incidence with greater precision would improve the precision of health impacts data for the U.S., and now I better understand the concern about data availability for foreign countries. Scaled values seem more acceptable when applied to an aggregate figure than they do when applied to many broken-out source types.

Is there no breakdown of mortality by cause? The main causes of mortality from air pollution are heart disease, stroke, chronic obstructive pulmonary disease, lung cancer, and acute respiratory infections. I'm a little surprised because it looks like they break out morbidity into many different diseases / health impacts, so I would have thought they might have some sort of breakout for mortality too. There may be other places that have that sort of breakdown - I recall the Health Effects Institute may have published some data on this, but it's not easy to find on their website. Or if we don't have per-disease incidence rates, but we have the percent of air pollution deaths caused by different diseases, we could just multiply the final avoided deaths count by the percent breakdown by disease.
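
A sketch of that last fallback, splitting a total avoided-deaths figure by fixed cause-of-death shares. The shares below are placeholders chosen only so that they sum to one; they are not taken from any dataset, and the function name is hypothetical.

```python
def split_by_cause(total_deaths, cause_shares):
    """Allocate a total across causes using fixed percentage shares."""
    if abs(sum(cause_shares.values()) - 1.0) > 1e-9:
        raise ValueError("cause shares must sum to 1")
    return {cause: total_deaths * share for cause, share in cause_shares.items()}


# Placeholder shares (NOT real data) for the five causes named above.
CAUSE_SHARES = {
    "Heart disease": 0.4,
    "Stroke": 0.2,
    "Chronic obstructive pulmonary disease": 0.2,
    "Lung cancer": 0.1,
    "Acute respiratory infections": 0.1,
}

deaths_by_cause = split_by_cause(1_000.0, CAUSE_SHARES)
```

This treats the shares as constant regardless of pollutant mix or exposure level, which is a simplifying assumption of the sketch.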

I keep asking about this in part because end users might not notice tweaks to slightly improve the accuracy of the mortality calculations if the final output graph looks pretty much the same (a single line with avoided deaths). A stacked area graph of mortality by disease / cause of death (or age, or something) would make it more obvious to the end user that we've added value to our mortality calculations.

I suppose we could also do a stacked area graph of mortality caused by emissions from each sector, so at least that's something we could add. (We could even add it today, without the other enhancements we're discussing.)

Yes, if they report incidence rates in terms of emissions per incident (death or disease), I agree that we'd rather calculate that directly, then apply the VSL to monetize afterward. We only went from monetized amounts to deaths because that reflects the data we had when we made this feature back in EPS 1.0.1.

I agree with omitting item (5) from your list above.

To me, item (7) in your list above looks like it was already included in item (3).

How about this shortened version of your list?

Switch to incidence rates, not $ damages/ton, to estimate premature mortality. Ideally, we would use incidence of mortality broken out by cause of mortality, or break it down afterward using multipliers. Also add a graph breaking down mortality by sector that caused the emissions.
Similarly, use incidence rates to estimate morbidity, broken out by type of health impact (making some reasonable groupings from your list above, like (1) lost work days, (2) emergency room visits, (3) respiratory problems, and (4) non-respiratory problems).
Use VSL to monetize premature mortality, but not morbidity. Add it to monetized climate damages for a societal damages metric, as we do today.
Increase the accuracy of estimates by applying more detailed incidence rates based on emissions source, per your item (2) above.

jrissman commented 4 years ago

Incidentally, our new comms deputy director, Sarah, used to work on this sort of thing and is really interested in us expanding our health impacts outputs, so she might know of some relevant resources.

robbieorvis commented 4 years ago

All the above looks good. The report does not break out mortality by cause (though in theory one could replicate the study and probably find it that way using BenMAP). The incidence rates are based on the Krewski et al. (2009) study here. You might be able to reasonably break it apart that way (see tables 3 and 6), except that this is a dose-response relationship and I'm not sure if it's totally fair to just apply multipliers. Sarah might know better.

jrissman commented 4 years ago

The dose-response thing is something I did in my grad school thesis (see this paper, section 3.6). I'm a little rusty on it, but I'm sure I could get out the calculations that went into that paper and refresh my memory of how to use those figures. The Krewski et al. study looks like a great resource.

ssonniaa commented 4 years ago

In addition to Sarah, Bruce has also worked extensively on health impacts and we should likely invite his review of this update.

jrissman commented 4 years ago

Commit 9891ab6 implements the input data and structure to calculate the change in incidence of 11 different health outcomes. They are:

Premature mortality
Respiratory emergency room visits
Acute bronchitis
Lower respiratory symptoms
Upper respiratory symptoms
Minor restricted activity days
Work loss days
Asthma exacerbation
Cardiovascular hospital admissions
Respiratory hospital admissions
Non-fatal heart attacks

Premature mortality we had before. The others are new. I haven't added any new output graphs yet - I'm going to do that before I close this issue. I don't want all 11 outcomes to be graphed separately - I think some groupings make sense. I'm thinking about this set of graphs:

Premature mortality (1)
Lost work days (7)
Asthma attacks (8) - this term is better-known than "exacerbation" and Mayo Clinic indicates the terms mean the same thing
Respiratory symptoms (3 + 4 + 5)
Non-fatal heart attacks (11)
ER visits (2)
Hospital admissions (9+10)

This omits "Minor restricted activity days" (the days when the government says sensitive groups should avoid certain activities like exercising outside, multiplied by the number of people to whom the restriction applies), because it's a comparatively minor outcome, it doesn't directly affect health, and a typical EPS user would have no idea what this metric is. All of the other 10 outcomes are used in at least one of the proposed graphs.

jrissman commented 4 years ago

I was wrong about the meaning of "minor restricted activity days" - it refers to actual days when people restricted their activities (typically, rested or did less strenuous things), but not to the point of missing work. It's not just how many people would be affected by a government guideline to restrict outdoor or other activities for sensitive groups. For more on this, see here. Accordingly, I'll include a graph of them as well, so all of the public health metrics will be included in the web app output graphs.
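
For reference, the resulting mapping from the 11 calculated outcomes to output graphs can be sketched as a simple lookup, using the outcome numbering from the list above and giving minor restricted activity days its own graph. The dictionary and function names are hypothetical, and this is not how the graphs are actually defined in the web app.

```python
# Outcome numbers follow the order of the 11 health outcomes listed above.
GRAPH_GROUPS = {
    "Premature mortality": [1],
    "Lost work days": [7],
    "Asthma attacks": [8],
    "Respiratory symptoms": [3, 4, 5],
    "Non-fatal heart attacks": [11],
    "ER visits": [2],
    "Hospital admissions": [9, 10],
    "Minor restricted activity days": [6],
}


def aggregate_for_graphs(incidence_by_outcome):
    """Sum per-outcome incidence changes into the graphed groupings."""
    return {
        graph: sum(incidence_by_outcome[number] for number in numbers)
        for graph, numbers in GRAPH_GROUPS.items()
    }


# Placeholder input: each outcome's value equals its own number.
example = aggregate_for_graphs({n: float(n) for n in range(1, 12)})
```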

jrissman commented 4 years ago

Completed in commits 9891ab6 and ee1c9f1.

EnergyInnovation / eps-us

Update Social Cost of Health Impacts by Pollutant #67