Pooled results contain Delta values that are sometimes too small

dholstius commented 1 year ago

Overview

Thank you for a great product! Our group at BAAQMD is increasingly relying on it in our regulatory work at a regional level.

We have recently been conducting some "unit" runs with Delta set uniformly equal to 0.1 µg/m³ across the domain. When inspecting the exported pooled results, we were surprised to find reported values for Delta that were smaller than 0.1 µg/m³ for some grid cells.

Hypothesis

At first we thought this might be related to spatial slicing of grid cells, since it tended to appear in cells toward the margins of our domain. But it differed by HIF, and did not appear exclusively in edge cells. (See map below.)

Upon closer inspection, our best hypothesis is that Delta is somehow being sliced and re-aggregated according to age brackets relevant to the given HIF, and that in the process, a fraction of Delta is counted as zero if the population for a given bracket is zero. (Credit to @yfang-air.)

Taking lung cancer (Gharibvand et al.) as an illustrative example, there are seven relevant age brackets. If one of them has zero population in a given cell, then we find that the pooled-output Delta for that cell will be 0.08571429, or 6/7 of what it should be. If the population counts for two of the seven brackets are zero, then the output Delta will be 5/7 of what it should be, and so on.

This is the case for other HIFs as well. If there are two relevant age brackets, then most of the time the reported Delta will be as expected, but for some cells it will be 1/2 of that.
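To make the arithmetic above concrete, here is a minimal sketch of the behavior we are hypothesizing. It is not taken from the BenMAP codebase; the function name `hypothesized_pooled_delta` and the population counts are made up for illustration. It assumes the pooled Delta is an unweighted mean over the HIF's age brackets, with Delta counted as zero wherever the bracket population is zero.

```python
def hypothesized_pooled_delta(true_delta, bracket_populations):
    """Unweighted mean of Delta across age brackets (hypothesized behavior).

    Assumption: a bracket with zero population contributes a Delta of 0
    instead of the true Delta, so the mean is scaled by the fraction of
    brackets that are non-empty.
    """
    n_total = len(bracket_populations)
    n_nonempty = sum(1 for pop in bracket_populations if pop > 0)
    return true_delta * n_nonempty / n_total


# Hypothetical population counts for one grid cell (illustrative only).
seven_brackets_one_empty = [12, 0, 5, 8, 3, 9, 4]
seven_brackets_two_empty = [12, 0, 5, 0, 3, 9, 4]
two_brackets_one_empty = [40, 0]

print(hypothesized_pooled_delta(0.1, seven_brackets_one_empty))  # 6/7 of 0.1
print(hypothesized_pooled_delta(0.1, seven_brackets_two_empty))  # 5/7 of 0.1
print(hypothesized_pooled_delta(0.1, two_brackets_one_empty))    # 1/2 of 0.1
```

If the hypothesis holds, the ratio of reported Delta to 0.1 in any affected cell should always be a fraction of the form k/n, where n is the number of age brackets for that HIF.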

In the least-populated cells, the chance of one or more age brackets having zero population will be larger. Conditional on the total relevant population for a cell, when that population is sliced into more and finer age brackets, there will also be, intuitively, a higher probability that at least one of the resulting slices will contain zero. So the likelihood of observing any bias at all should be higher for HIFs with that kind of slicing (like Gharibvand et al.). This is also consistent with our observation that the degree of overall bias varies by HIF.
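The intuition above can be checked with a rough calculation. The sketch below rests on a strong simplifying assumption that is ours alone: each person in a cell falls independently into one of the brackets with equal probability. Real age distributions are not uniform, so the numbers are only indicative. The function name `prob_any_empty_bracket` is hypothetical. It uses inclusion-exclusion to get the probability that every bracket is occupied, then takes the complement.

```python
from math import comb


def prob_any_empty_bracket(total_pop, n_brackets):
    """Probability that at least one age bracket has zero population.

    Assumption: each of `total_pop` people lands independently and with
    equal probability in one of `n_brackets` brackets.
    """
    # Inclusion-exclusion: probability that all brackets are occupied.
    p_all_occupied = sum(
        (-1) ** j * comb(n_brackets, j) * (1 - j / n_brackets) ** total_pop
        for j in range(n_brackets + 1)
    )
    return 1.0 - p_all_occupied


# Same cell population, coarser vs. finer slicing.
for n_brackets in (2, 7):
    for total_pop in (10, 30, 100):
        p = prob_any_empty_bracket(total_pop, n_brackets)
        print(f"brackets={n_brackets} pop={total_pop} P(any empty)={p:.4f}")
```

Under this assumption, the probability falls as the cell population grows and rises as the number of brackets grows, which matches the pattern described above.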

However, because most cells seem to be unaffected, across our entire domain the overall bias in the population-weighted mean Delta is small: about -0.3% for the most-affected HIF, which is for lung cancer (Gharibvand et al.). We are using a 1km grid resolution.
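For reference, this is roughly how a population-weighted mean Delta and its relative bias can be computed. It is a sketch only: the function names are hypothetical, and the four cells below are made-up numbers chosen to illustrate the calculation, not values from our runs.

```python
def population_weighted_mean(values, populations):
    """Mean of `values` weighted by the matching entries of `populations`."""
    total_pop = sum(populations)
    return sum(v * p for v, p in zip(values, populations)) / total_pop


def relative_bias(reported_deltas, populations, true_delta):
    """Relative bias of the population-weighted mean Delta vs. the true Delta."""
    return population_weighted_mean(reported_deltas, populations) / true_delta - 1.0


# Illustrative cells only: two populous cells report the expected Delta,
# two sparsely populated cells report 6/7 and 5/7 of it.
true_delta = 0.1
populations = [1000, 800, 20, 5]
reported = [0.1, 0.1, 0.1 * 6 / 7, 0.1 * 5 / 7]

bias = relative_bias(reported, populations, true_delta)
print(f"relative bias: {bias:.2%}")
```

Because the affected cells are the sparsely populated ones, they carry little weight, which is why the domain-wide bias stays small even when individual cells are off by 1/7 or more.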

Suggested Course of Action

The codebase itself is challenging for us to understand or bisect, so we're hoping that you could:

- weigh in on whether this hypothesis makes sense, given your understanding of the codebase;
- advise whether any other variables in pooled results may be affected (Baseline, Mean, Population, Point Estimate, etc.), or whether we should expect this bug to be confined to Delta; and/or
- (ideally) confirm or disconfirm through your own testing and/or efforts to replicate.

Thank you again for a great product. Please let us know if we can help clarify anything related to this.

haroman commented 1 year ago

Hi David,

We are happy to look into this. Would you mind posting the .cfgrx and .apvrx files you used to generate these results? Thanks! - Henry

yfang-air commented 1 year ago

Hi, Henry:

The original files we have are too large to share because we ran many HIFs for millions of grid cells. However, I ran a sample case that keeps just two health endpoints (lung cancer and mortality; lung cancer is the one that shows the issue described in David's post).

The corresponding BenMAP files were uploaded to Dropbox. Hopefully you can see them here - https://www.dropbox.com/sh/zcp95r05u34ip1u/AAAG-0Ytu5V3ih6HvTMKORKGa?dl=0 . If not, please let us know.

FYI, our runs were built on the PM2.5 configuration recommended by EPA - https://www.epa.gov/sites/default/files/2021-04/u.s._epa_approach_for_quantifying_and_valuing_pm_effects_0.zip

Thank you for assisting us!

Yuanyuan

haroman commented 1 year ago

I am not able to access this link. If you give me your email address, I can set up a OneDrive folder.

yfang-air commented 1 year ago

That is strange. It should be viewable to anyone with the link. And sure, here is my email - yfang@baaqmd.gov . Thank you again:-)

haroman commented 1 year ago

Ok, I shared a OneDrive link with you. Let me know if you don't get it.

yfang-air commented 1 year ago

Thanks Henry! I uploaded the files to the folder you shared.

haroman commented 1 year ago

Hi Yuanyuan,

Is your 1 km grid constrained to the SF Bay Area? If so, it would be good to get a copy of the shapefile for that as well. Thank you!

yfang-air commented 1 year ago

Hi, Henry:

Yes, we are using our CMAQ 1 km grid over the Bay Area. I was just wondering yesterday whether you would need the grid definition as well as the population data files.

I uploaded the CMAQ 1 km grid shapefile (see /Shapefile) as well as the population data (created using PopGrid, see _/Pop_on_CMAQ_1km_grid_fromPopGrid) to OneDrive. Please take a look and let me know if you have any additional questions or requests.

Thank you! Yuanyuan

haroman commented 1 year ago

Hi David and Yuanyuan,

We have confirmed your hypothesis that BenMAP is inappropriately averaging the deltas across age groups when implementing the pooling step. The good news is that we found no evidence that the actual health incidence calculations are affected by this bug; it only affects the delta that gets reported along with pooled results. We are working on a proposed fix that will generate delta estimates that are appropriate for the chosen pooling approach and that work whether the results are processed at the original grid scale or aggregated to larger polygons.

yfang-air commented 1 year ago

Thank you so much Henry! I agree with you that the actual health incidence should be correct. I am less familiar with the other variables in BenMAP's output (e.g., baseline, percentage of baseline), though. What do you think of those variables?

haroman commented 1 year ago

Based on our review of the code, other variables, such as baseline and percentage of baseline, are recalculated using the pooled output, so they should be fine. There is the potential for error in the baseline value if a user were to perform sum dependent pooling of studies whose age groups or endpoints overlap, but if that overlap is substantial, use of sum dependent pooling would be inappropriate anyway.

BenMAPCE / BenMAP-CE