SESYNC-ci / FedData

Functions to Automate Downloading Geospatial Data Available from Several Federated Data Sources
https://docs.ropensci.org/FedData
Other
0 stars 1 forks source link

Problems with web mercator projection in get_nlcd() #1

Open qdread opened 3 years ago

qdread commented 3 years ago

In get_nlcd(), transformation to web mercator projection (EPSG 3857) is hard-coded. It appears to be done so that the request to the web service will work. But it causes problems because area is distorted. We should improve the behavior of the function and make a PR.

@khondula can you add your bug reprex here?

khondula commented 3 years ago

Not very minimal, but hopefully shows what's going on!

Here is the link to the web mapping service info from MRLC: https://www.mrlc.gov/geoserver/ows?service=WCS&version=2.0.1&request=GetCapabilities

library(FedData)
library(sf)
library(dplyr)

nc <- st_read(system.file("shape/nc.shp", package="sf"))
st_crs(nc) # nad27 epsg 4267
nc_ashe <- nc[1,]

# compare area of ashe county in nad27 epsg 4267 vs 3857
nc_ashe %>% st_area() %>% sum() %>% as.numeric()/1e6 # 1137 km2
nc_ashe %>% st_transform(3857) %>% st_area() %>% sum() %>% as.numeric()/1e6 # 1760 km2

# difference between areas for all counties averages 50 percent
nc %>% mutate(area_km2 = st_area(nc)/1e6,
              area_3857_km2 = st_area(st_transform(nc, 3857))/1e6) %>%
  mutate(wm_diff = (area_3857_km2 - area_km2)/area_km2) %>% 
  st_drop_geometry() %>% pull(wm_diff) %>% summary()

# use ashe co as template for get_nlcd
nc_ashe_sp <- as(nc_ashe, "Spatial")
nlcd_ashe <- get_nlcd(nc_ashe_sp, year = 2016, label = "nc1", dataset = "landcover")

# mask raster to county
nc_ashe_sp_prj <- nc_ashe %>% st_transform(proj4string(nlcd_ashe))
nlcd_ashe_mask <- raster::mask(nlcd_ashe, nc_ashe_sp_prj)

# calculate area from number and size of pixels
n_pix <- raster::freq(nlcd_ashe_mask) %>% as.data.frame() %>% 
  dplyr::filter(!is.na(value)) %>% pull(count) %>% sum()
area_m2 <- n_pix*xres(nlcd_ashe_mask)*yres(nlcd_ashe_mask)
area_m2/1e6

# project to AEA then calculate area from number and size of pixels
mrlc_prj <- "+proj=aea +lat_1=29.5 +lat_2=45.5 +lat_0=23 +lon_0=-96 +x_0=0 +y_0=0 +ellps=GRS80 +datum=NAD83 +units=m +no_defs"
nlcd_ashe_prj <- projectRaster(nlcd_ashe, crs = mrlc_prj)
nc_ashe_sp_prj2 <- nc_ashe %>% st_transform(mrlc_prj)

nlcd_ashe_mask2 <- raster::mask(nlcd_ashe_prj, nc_ashe_sp_prj2)
n_pix <- raster::freq(nlcd_ashe_mask2) %>% as.data.frame() %>% 
  dplyr::filter(!is.na(value)) %>% pull(count) %>% sum()
area_m2 <- n_pix*xres(nlcd_ashe_mask2)*yres(nlcd_ashe_mask2)
area_m2/1e6 # 17060 km2
qdread commented 3 years ago

Here is some code that is the result of trial and error, setting up the query inside get_nlcd to specify input and output CRS.

Relevant documentation

nlcd_epsg <- 5070
webmerc_epsg <- 3857

# Test with a bbox
pts <- cbind(c(-76.6, -76.6, -76.4, -76.4, -76.6), c(38.9, 39, 39, 38.9, 38.9))
polyg <- st_sfc(st_polygon(list(pts)), crs = 4326) # WGS84 Longlat polygon.

# Must transform template to Web mercator
template <- polyg %>%
  st_transform(webmerc_epsg) %>% 
  st_bbox

year = '2016'
dataset = 'Impervious'
landmass = 'L48'

coverage <- paste0("NLCD_", year, "_", dataset, "_", landmass)
source <- paste0("https://www.mrlc.gov/geoserver/mrlc_display/", coverage, "/ows")

outfile = '~/Documents/temp/foo.tif'

source %>%
  httr::GET(
    query = list(
      service = "WCS",
      version = "2.0.1",
      request = "GetCoverage",
      coverageid = coverage,
      subset = paste0('X("', template["xmin"], '","', template["xmax"], '")'),
      subset = paste0('Y("', template["ymin"], '","', template["ymax"], '")'),
      subsettingcrs = paste0("epsg:", webmerc_epsg),
      outputcrs = paste0("epsg:", nlcd_epsg)
    ),
    httr::write_disk(
      path = outfile,
      overwrite = TRUE
    )
  )

r <- raster(outfile)
qdread commented 3 years ago

Made a branch and committed 6d68c12 which modifies get_nlcd() in place but changes its name to my_get_nlcd(). I also committed a script to test it. There is now an output CRS argument within the function. But the problem is that I can't figure out how to get the input CRS to be the native CRS of the NLCD raster. It returns an error if you use input CRS other than web Mercator (by adding parameter subsettingcrs to the query list). So it looks like it is still projecting from native projection to web Mercator and back again. Not ideal.