tcarleton / stagg

Spatiotemporal Aggregation for Climate Data

Secondary weights not used because of NA values issue when running overlay_weights() #25

Open giovannibrocca opened 11 months ago

giovannibrocca commented 11 months ago

I am trying to aggregate ERA5 surface temperatures for June 2000 to the country level, weighting by population, using the data described in the code below. However, when running overlay_weights(), I get the following warning:

"Warning: Warning: some of the secondary weights are NA, meaning weights cannot be calculated and area-weights will be returned"

This happens whether or not I first reclassify the NA values in the population dataset.

My code is:

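# Packages used below: raster (brick, raster, subset, reclassify), sf (st_read), stagg
library(raster)
library(sf)
library(stagg)

# Loading population count data from https://sedac.ciesin.columbia.edu/data/set/gpw-v4-population-count-rev11/data-download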
population_data = brick(".../1_Data/world_example/2pt5_world_population_count/gpw_v4_population_count_rev11_2pt5_min.nc")

# Creating a subset only with population in 2000
population_2000 <- subset(population_data, "X1")
population_2000
population_2000_brick <- brick(population_2000)

# Loading national boundaries shapefile from https://international.ipums.org/international/gis.shtml
world_countries = st_read(".../1_Data/world_example/IPUMSI_world_release2020/world_countries_2020.shp")

# Loading ERA5 data on surface temperature in June 2000
era_2000_brick = brick(".../1_Data/world_example/era_june_2000_2tm.nc")
era_2000_layer = raster(".../1_Data/world_example/era_june_2000_2tm.nc")

# Computing population weights
population_weights <- secondary_weights(secondary_raster = population_2000, grid = era_2000_layer)

summary(population_weights)
     x                y               weight        
 Min.   :-179.8   Min.   :-89.750   Min.   :     0.0  
 1st Qu.: -90.0   1st Qu.:-44.812   1st Qu.:     0.3  
 Median :   0.0   Median :  0.125   Median :    18.1  
 Mean   :   0.0   Mean   :  0.125   Mean   :   671.7  
 3rd Qu.:  90.0   3rd Qu.: 45.062   3rd Qu.:   248.4  
 Max.   : 179.8   Max.   : 90.000   Max.   :282746.6  
                                    NA's   :766067

population_weights

x       | y  | weight
-179.75 | 90 | NA
-179.50 | 90 | NA
-179.25 | 90 | NA
-179.00 | 90 | NA
-178.75 | 90 | NA
-178.50 | 90 | NA
-178.25 | 90 | NA
-178.00 | 90 | NA
-177.75 | 90 | NA
-177.50 | 90 | NA

countries_weights <-  overlay_weights(
  polygons = world_countries,
  polygon_id_col = "CNTRY_CODE",
  grid = era_2000_layer,
  secondary_weights = population_weights
)
# Which leads to the following warning:
### Warning: Warning: some of the secondary weights are NA, meaning weights cannot be calculated and area-weights will be returned
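
Before calling overlay_weights(), it can help to count the NA weights and check where they sit on the grid. The following is a minimal sketch in base R. It uses a small made-up data frame, `toy_weights`, with the same three columns (x, y, weight) as the `population_weights` table above; the name and the values are assumptions for illustration and are not taken from the GPW data.

```r
# Hypothetical stand-in for population_weights (columns x, y, weight).
# The values are invented for illustration only.
toy_weights <- data.frame(
  x      = c(-179.75, -179.50, 0.00, 0.25, 90.00),
  y      = c(  90.00,   90.00, 45.00, 45.00, -60.00),
  weight = c(     NA,      NA,  18.1, 248.4,   0.0)
)

# Number and share of cells with a missing weight
n_na     <- sum(is.na(toy_weights$weight))
share_na <- mean(is.na(toy_weights$weight))

# Latitude range of the cells with a missing weight
na_lat_range <- range(toy_weights$y[is.na(toy_weights$weight)])

print(n_na)
print(share_na)
print(na_lat_range)
```

Run on the real object (replacing `toy_weights` with `population_weights`), the same three lines would show whether the NAs cluster in particular regions, for example near the poles as in the first rows printed above.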

population_2000_brick has 2.828406e+07 NA values, so I tried reclassifying them to 0 and running the same code again, but I still get a warning.

# Replacing NA values with 0s in population_2000_brick
population_2000_brick <- reclassify(population_2000_brick, cbind(NA, 0))

population_weights <- secondary_weights(secondary_raster = population_2000_brick, grid = era_2000_layer)
summary(population_weights)
       x                y               weight        
 Min.   :-179.8   Min.   :-89.750   Min.   :     0.0  
 1st Qu.: -90.0   1st Qu.:-44.812   1st Qu.:     0.0  
 Median :   0.0   Median :  0.125   Median :     0.0  
 Mean   :   0.0   Mean   :  0.125   Mean   :   161.6  
 3rd Qu.:  90.0   3rd Qu.: 45.062   3rd Qu.:     0.0  
 Max.   : 179.8   Max.   : 90.000   Max.   :183631.0  

countries_weights <-  overlay_weights(
  polygons = world_countries,
  polygon_id_col = "CNTRY_CODE",
  grid = era_2000_layer,
  secondary_weights = population_weights
)

# Which still leads to the following warnings:
### Warning: Warning: weight = 0 for all pixels in some of your polygons; area-weights will be returned
### Warning: Warning: some of the secondary weights are NA, meaning weights cannot be calculated and area-weights will be returned

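
The warning about weight = 0 refers to polygons in which every pixel has a zero weight. One way to find out which polygons are affected is to sum the weights by polygon. The sketch below uses base R and a hypothetical lookup table, `toy_cells`, that assigns each grid cell to a polygon ID. The table and its column names are assumptions for illustration and do not come from stagg.

```r
# Hypothetical cell-to-polygon table; poly_id and weight are invented names/values.
toy_cells <- data.frame(
  poly_id = c("A", "A", "B", "B", "C"),
  weight  = c(10, 0, 0, 0, 5)
)

# Total secondary weight per polygon
weight_by_poly <- tapply(toy_cells$weight, toy_cells$poly_id, sum)

# Polygons whose cells all carry zero weight
zero_weight_polys <- names(weight_by_poly)[weight_by_poly == 0]

print(zero_weight_polys)
```

Small islands or uninhabited territories are plausible candidates for such polygons once NA population values have been set to 0, but that would need checking against the actual data.
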
saraorofino commented 11 months ago

Hi @giovannibrocca, thank you for opening this issue, and apologies for our delay in getting back to you. We just pushed a few changes to the package that we think might help with your first point, where secondary_weights() returns a large number of NA values. Could you try running secondary_weights() again to see whether anything has changed? You will probably need to reinstall the package with devtools::install_github("tcarleton/stagg") to get the updated function.

A few additional thoughts:

We are still working through how we want to treat cells with NAs and 0s in the secondary weights files. We plan to discuss this when we meet next week, so I'll be in touch after that with additional information.
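
To make the question concrete, the small worked example below shows how the treatment of missing and zero secondary weights changes a weighted mean within a single polygon. It is a sketch in base R with made-up numbers and hypothetical variable names, and it is not a description of how stagg computes its weights.

```r
# Made-up values for four grid cells that fall inside one polygon
temperature <- c(10, 20, 30, 40)
area_w      <- c(0.25, 0.25, 0.25, 0.25)  # share of polygon area per cell
pop_w       <- c(300, NA, 100, 0)         # secondary weight with one NA and one 0

# Option 1: area weights only (the fallback named in the warning)
mean_area <- sum(temperature * area_w) / sum(area_w)

# Option 2: treat NA as 0, then combine area and secondary weights
pop_zero        <- ifelse(is.na(pop_w), 0, pop_w)
w_combined      <- area_w * pop_zero
mean_na_as_zero <- sum(temperature * w_combined) / sum(w_combined)

# Edge case: every secondary weight in the polygon is 0, so the
# weighted mean is 0 / 0 and some fallback rule is needed
w_all_zero    <- area_w * c(0, 0, 0, 0)
mean_all_zero <- sum(temperature * w_all_zero) / sum(w_all_zero)

print(mean_area)        # 25
print(mean_na_as_zero)  # 15
print(mean_all_zero)    # NaN
```

With these numbers the area-weighted mean is 25, while setting the NA to 0 shifts the mean to 15 because only the first and third cells keep a positive weight. The all-zero case returns NaN, which illustrates why a polygon with no usable secondary weights needs a fallback such as area weights.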

traceybit commented 9 months ago

hi @giovannibrocca -- we made some updates to the package that might help address your concerns. if you are still experiencing issues, please reach out!