NIEHS / beethoven

BEETHOVEN is: Building an Extensible, rEproducible, Test-driven, Harmonized, Open-source, Versioned, ENsemble model for air quality
https://niehs.github.io/beethoven/

Process ecoregion data #211

Closed sigmafelix closed 9 months ago

sigmafelix commented 9 months ago

This is a part of "Climate/Eco-Regions" entry in #186.


Ecoregion variables are not listed in our covariate table. I suppose they would need to be calculated from the HUC-12 Ecoregion Indicators? The files are available at the link and are divided into separate tables for ten Ecoregions. The data can be reused from the PrestoGP_Pesticides project directory in DDN. The literature suggests that the possible pathway from Ecoregion to PM2.5 is [Ecoregion <-> vegetation types --> wildfire risk --> PM2.5]. Thus, it would make sense to include upstream vegetation-type factors that are plausibly associated with PM2.5 measurements.

This issue is another agendum for next week's meeting. @Spatiotemporal-Exposures-and-Toxicology


kyle-messier commented 9 months ago

@sigmafelix I do think the EPA ecoregions could have some additional value on top of the Koppen climate regions you've calculated. I also think there are shapefiles for Ecoregion levels 3 or 4 that would be easy to spatially join directly with our data, with no need to go through the HUC data. Ecoregion level 2 (about 15 classes) could be sufficient. Adding level 3 may be nice but introduces 80+ additional indicator variables, unless you want to calculate fraction-of-ecoregion variables.

sigmafelix commented 9 months ago

@Spatiotemporal-Exposures-and-Toxicology Thank you for the comment. I will use Ecoregion levels 3 and 4 to perform spatial joins to extract indicator variables for Ecoregions.
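For readers following along, the spatial join described here can be sketched roughly as below. This is only a minimal illustration in plain Python, not the code used in this project: the polygon coordinates, site IDs, region names (`region_a`, `region_b`), and the `eco_` column prefix are all hypothetical, and a real workflow would use a GIS library on the actual Ecoregion shapefiles.

```python
from typing import Dict, List, Optional, Tuple

Point = Tuple[float, float]
Polygon = List[Point]  # ring of (x, y) vertices; the closing edge is implied


def point_in_polygon(pt: Point, poly: Polygon) -> bool:
    """Ray-casting test: toggle on each edge crossed by a ray going in +x."""
    x, y = pt
    inside = False
    n = len(poly)
    for i in range(n):
        x1, y1 = poly[i]
        x2, y2 = poly[(i + 1) % n]
        # Only edges that straddle the horizontal line through pt can cross.
        if (y1 > y) != (y2 > y):
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside


def join_region(pt: Point, regions: Dict[str, Polygon]) -> Optional[str]:
    """Return the name of the first polygon containing pt, or None."""
    for name, poly in regions.items():
        if point_in_polygon(pt, poly):
            return name
    return None


def indicator_row(region: Optional[str], categories: List[str]) -> Dict[str, int]:
    """One 0/1 indicator per category; all zeros if the site was not matched."""
    return {f"eco_{c}": int(c == region) for c in categories}


# Hypothetical toy data: two adjacent unit squares and three sites.
regions = {
    "region_a": [(0.0, 0.0), (1.0, 0.0), (1.0, 1.0), (0.0, 1.0)],
    "region_b": [(1.0, 0.0), (2.0, 0.0), (2.0, 1.0), (1.0, 1.0)],
}
sites = {"site_1": (0.5, 0.5), "site_2": (1.5, 0.5), "site_3": (3.0, 0.5)}

joined = {sid: join_region(xy, regions) for sid, xy in sites.items()}
indicators = {sid: indicator_row(reg, list(regions)) for sid, reg in joined.items()}
```

In this toy setup `site_3` lies outside every polygon, so it is unmatched and gets all-zero indicators; unmatched sites need separate handling.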

sigmafelix commented 9 months ago

A site in Portland, ME is outside the Ecoregion polygons (below). I will snap it to the nearest polygon and then perform the spatial join. Ecoregion binary variables are not listed in the covariate table; I will add them to the table as well.
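One way to think about the snapping step is to assign an unmatched site to the polygon whose boundary is closest. The sketch below illustrates that idea in plain Python under several assumptions: coordinates are planar (already projected), the site is known to lie outside every polygon, and the polygons, names, and site location are made up for illustration. It is not the implementation used for this issue.

```python
import math
from typing import Dict, List, Tuple

Point = Tuple[float, float]
Polygon = List[Point]  # ring of (x, y) vertices; the closing edge is implied


def dist_point_segment(p: Point, a: Point, b: Point) -> float:
    """Euclidean distance from p to the segment a-b."""
    px, py = p
    ax, ay = a
    bx, by = b
    dx, dy = bx - ax, by - ay
    seg_len2 = dx * dx + dy * dy
    if seg_len2 == 0.0:  # degenerate segment: a and b coincide
        return math.hypot(px - ax, py - ay)
    # Project p onto the line through a and b, clamped to the segment.
    t = ((px - ax) * dx + (py - ay) * dy) / seg_len2
    t = max(0.0, min(1.0, t))
    return math.hypot(px - (ax + t * dx), py - (ay + t * dy))


def dist_point_boundary(p: Point, poly: Polygon) -> float:
    """Distance from p to the polygon boundary (minimum over all edges)."""
    n = len(poly)
    return min(dist_point_segment(p, poly[i], poly[(i + 1) % n]) for i in range(n))


def nearest_region(p: Point, regions: Dict[str, Polygon]) -> str:
    """Name of the polygon whose boundary is closest to p.

    Assumes p is outside all polygons; for an interior point the boundary
    distance is not the right criterion.
    """
    return min(regions, key=lambda name: dist_point_boundary(p, regions[name]))


# Hypothetical toy data: two adjacent unit squares and one site just offshore.
regions = {
    "region_a": [(0.0, 0.0), (1.0, 0.0), (1.0, 1.0), (0.0, 1.0)],
    "region_b": [(1.0, 0.0), (2.0, 0.0), (2.0, 1.0), (1.0, 1.0)],
}
outside_site = (2.5, 0.5)
snapped = nearest_region(outside_site, regions)
snap_distance = dist_point_boundary(outside_site, regions[snapped])
```

Recording the snap distance alongside the assigned region makes it easy to check later that only sites very close to a polygon edge were reassigned.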

image

sigmafelix commented 9 months ago

@Spatiotemporal-Exposures-and-Toxicology

No site falls in the level 2 ecoregion "13.1 UPPER GILA MOUNTAINS". I think a category with all-zero values will not contribute to prediction at points in that region, even if variable selection or prescreening is performed before model fitting. Do we keep all-zero fields like this one later?

sigmafelix commented 9 months ago

The calculation is completed. The results do not include all-zero categories at the moment.
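As an illustration of what excluding all-zero categories amounts to, here is a small plain-Python sketch that drops indicator columns in which no site has a 1. The column names and values are hypothetical and only mimic the situation described above, where one level 2 category has no sites; this is not the code that produced the results.

```python
from typing import Dict, List, Tuple

Row = Dict[str, int]


def drop_all_zero(rows: List[Row]) -> Tuple[List[Row], List[str]]:
    """Remove columns that are 0 for every row.

    Returns the filtered rows and the names of the dropped columns, so the
    dropped categories can still be reported or restored later.
    """
    if not rows:
        return [], []
    columns = list(rows[0].keys())
    keep = [c for c in columns if any(r[c] != 0 for r in rows)]
    dropped = [c for c in columns if c not in keep]
    filtered = [{c: r[c] for c in keep} for r in rows]
    return filtered, dropped


# Hypothetical indicator table: three sites, three level 2 categories.
# The third column stands in for a category with no sites.
rows = [
    {"eco2_cat_a": 1, "eco2_cat_b": 0, "eco2_cat_empty": 0},
    {"eco2_cat_a": 0, "eco2_cat_b": 1, "eco2_cat_empty": 0},
    {"eco2_cat_a": 1, "eco2_cat_b": 0, "eco2_cat_empty": 0},
]
filtered, dropped = drop_all_zero(rows)
```

Keeping the list of dropped names is a design choice: if prediction locations later fall inside an excluded category, the column can be added back as zeros for the training sites without recomputing the join.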