mikejohnson51 / climateR

An R 📦 for getting point and gridded climate data by AOI
https://mikejohnson51.github.io/climateR/
MIT License
165 stars 40 forks source link

Question on using the precip data #83

Open ldecicco-USGS opened 8 months ago

ldecicco-USGS commented 8 months ago

I'm curious if you have any references on how to extract the data from this package for my specific needs (which I think are a little common). I'm hoping it's basically a matter of pointing me towards some other package that I should learn about.

Here's my ideal workflow:

  1. I have a USGS site.
  2. I get the basin shapefile (from NHDplusTools?)
  3. I get daily precipitation from climateR (getGridMET was what I started with)
  4. I run some code to get the sum of the precip within the basin per day (so basically a data frame with date/time, precip).

Any chance you have an example that does something like this? I like all the plotting examples, but I'm not sure how to get the numbers out. My ultimate goal is to look at precip and discharge at particular sites.

Rapsodia86 commented 8 months ago

Will this help?

library(climateR)
library(terra)
library(AOI)
AOI = aoi_get(state = "NY")

test_data <- getGridMET(
  AOI = AOI,
  varname = "pr",
  startDate = "2023-09-29",
  endDate  = "2023-10-29")

aoi_proj <- project(vect(AOI),test_data$precipitation_amount)

pr_crop <-crop(test_data$precipitation_amount,aoi_proj)

pr_df1 <- t(terra::extract(pr_crop,aoi_proj,touches=F,sum, na.rm=TRUE))  #difference in masking approach. See below or in the help of the "terra" package.
pr_df2 <- t(terra::extract(pr_crop,aoi_proj,touches=T,sum, na.rm=TRUE))

#touches logical. If TRUE, values for all cells touched by lines or polygons are extracted, not just those on the line render path, or whose center point is within the polygon. Not relevant for points; and always considered TRUE when weights=TRUE or exact=TRUE
ldecicco-USGS commented 8 months ago

AWESOME, thanks! Here's what I added on in case anyone else is trying to do the same thing:

library(AOI)
library(terra)
library(climateR)
library(dataRetrieval)
library(sf)
library(nhdplusTools)
library(tidyverse)

nldi_nwis <- list(featureSource = "nwissite",
                  featureID = "USGS-04087214")

site <- get_nldi_feature(nldi_nwis)

basin <- get_nldi_basin(nldi_feature = nldi_nwis)

AOI = aoi_get(x = basin)

test_data <- getGridMET(
  AOI = AOI,
  varname = "pr",
  startDate = "2023-09-29",
  endDate  = "2023-10-29")

aoi_proj <- project(vect(AOI),test_data$precipitation_amount)

pr_crop <- crop(test_data$precipitation_amount,aoi_proj)

pr_df1 <- t(terra::extract(pr_crop,aoi_proj,touches=F,sum, na.rm=TRUE))  #difference in masking approach. See below or in the help of the "terra" package.
pr_df1 <- pr_df1[-1, ]
pr_df1 <- data.frame(pr_df1)
pr_df1$Date <- as.Date(gsub("pr_", "", row.names(pr_df1)))
names(pr_df1) <- c("precip", "Date")

discharge <- readNWISdv("04087214", "00060",
                        startDate = "2023-09-29",
                        endDate  = "2023-10-29")

discharge2 <- discharge |> 
  rename(Discharge = X_00060_00003) |> 
  left_join(pr_df1, by = "Date") |> 
  pivot_longer(names_to = "param", 
               cols = c(Discharge, precip))

ggplot(data = discharge2,
       aes(x = Date, y = value)) +
  geom_point() +
  geom_line() +
  facet_grid(param ~ ., scales = "free_y")

image

ldecicco-USGS commented 8 months ago

Is there a favorite sub-daily precip source? I can put together a quick example of sub-daily if that would be helpful (and if such precip data exists which I assume does...)

mikejohnson51 commented 8 months ago

Hi @ldecicco-USGS and @Rapsodia86,

Thanks for the awesome examples! I'll add one using the zonal package we've built for some weather service work. On larger (or incremental) basins, or larger time extracts, its proven to be a bit more performant.

library(climateR)
library(dataRetrieval)
library(zonal) #mikejohnson51/zonal
library(tidyverse)
library(terra)

siteID <- '04087214'
startDate <- "2023-09-29"
endDate  <-"2023-10-29"

# Get basin with dataRetrieval
basin <- findNLDI(nwis = siteID, find = "basin")[['basin']] |>
  mutate(siteID = siteID)

# Get discharge with dataRetrieval
discharge <- readNWISdv(siteID, "00060", startDate = startDate, endDate  = endDate ) |>
  renameNWISColumns() |>
  select(Date, Flow)

# Get rainfall grids with climateR
pr <- getGridMET(AOI = basin, varname = "pr", startDate = startDate, endDate  = endDate) 

# summarize rainfall to basin with zonal
data <-execute_zonal(pr, geom = basin, ID = "siteID", fun = "sum", join = F) |>
  pivot_longer(-siteID) |>
  mutate(Date = as.Date(time(pr[[1]]))) |>
  select(Date, pr = value) |>
  left_join(discharge, by = "Date")

# Build plot
pivot_longer(data, names_to = "param", cols = c(Flow, pr)) |>
 ggplot(aes(x = Date, y = value)) +
  geom_point() +
  geom_line() +
  facet_grid(param ~ ., scales = "free_y")

Created on 2023-11-02 by the reprex package (v2.0.1)

So sub daily, NLDAS is often used and climateR does offer NLDAS support. However, I just found an error was returned from the server and will look into it! More to come on that (hopefully soon).

mikejohnson51 commented 8 months ago

For now - it does appear the NLDAS server is down (https://hydro1.gesdisc.eosdis.nasa.gov/dods/NLDAS_FORA0125_H.002.das) when that is back up I will share an example!

mikejohnson51 commented 8 months ago

Looks like its back up! Using the same basin as above we can get NLDAS forcing data. For this one you will need a NASA EarthData account and to populate climateR::writewriteNetrc() ...

After that, both @Rapsodia86's excellent terra solution, or, the zonal solution would work to summarize the data.

Hope it helps!

# Get rainfall grids with climateR
pr <- getNLDAS(AOI = basin, 
               model = "FORA0125_H.002",
               varname = "apcpsfc",
               startDate = startDate, endDate  = endDate) 

# summarize rainfall to basin with zonal
data <-execute_zonal(pr, geom = basin, ID = "siteID", fun = "sum", join = F) |>
  pivot_longer(-siteID) |>
  mutate(Date = time(pr[[1]]))

ggplot(data, aes(x = Date, y = value)) +
  geom_point() +
  geom_line() 

Created on 2023-11-07 by the reprex package (v2.0.1)

ldecicco-USGS commented 8 months ago

I created an Earthdata account and ran the writeNetrc function. When I then ran:

pr <- getNLDAS(AOI = basin, 
               model = "FORA0125_H.002",
               varname = "apcpsfc",
               startDate = startDate, endDate  = endDate) 
Note:Caching=1
Error:curl error: SSL peer certificate or SSH remote key was not OK
curl error details: 
Warning:oc_open: Could not read url
Error in open.nc(glue("{url}?{T_name}")) : NetCDF: I/O failure

Does that error ring a bell?

mikejohnson51 commented 8 months ago

Hi @ldecicco-USGS! That error seemed familiar and I checked an old chat history with @program-- .

The quick fix should be to set Sys.setenv(CURLOPT_SSL_VERIFYPEER = FALSE) following this.

However turning off SSL verification shouldn't be a permanent solution because of the security risks associated with it (that said I trust the NASA endpoint completely, and their cert seems ok). It might be worth checking with the USGS IT folks (assuming you're on a GFE) about the message.

Hope that helps and maybe @program-- can provide a little more nuance if I have missed any.

program-- commented 8 months ago

@ldecicco-USGS To add on to what @mikejohnson51 said, I'm not sure how USGS handles it, but I know other agencies have proxies that handle SSL termination when on GFE/using an agency VPN. That might also be a cause (again, under the assumption that you're on GFE or using a VPN).

ldecicco-USGS commented 8 months ago

I've tried both on and off our VPN with no luck.

I added:

Sys.setenv(CURLOPT_SSL_VERIFYPEER = FALSE,
           CURLOPT_SSL_VERIFYHOST = FALSE,
           CURLOPT_SSL_VERIFYSTATUS = FALSE,
           CURLOPT_VERBOSE = TRUE)
pr2 <- getNLDAS(AOI = basin, 
               model = "FORA0125_H.002",
               varname = "apcpsfc",
               startDate = startDate, endDate  = endDate) 
Note:Caching=1
*   Trying 198.118.197.99:443...
* Connected to hydro1.gesdisc.eosdis.nasa.gov (198.118.197.99) port 443 (#0)
* ALPN: offers http/1.1
* SSL certificate problem: unable to get local issuer certificate
* Closing connection 0
Error:curl error: SSL peer certificate or SSH remote key was not OK
curl error details: 
Warning:oc_open: Could not read url
Error in open.nc(glue("{url}?{T_name}")) : NetCDF: I/O failure

I think "unable to get local issuer certificate" might mean it's a me problem 🤷‍♀️.

I know our IT department sets this by default:

Initiating curl with CURL_SSL_BACKEND: openssl
curl::curl_version()$ssl_version
[1] "OpenSSL/3.1.2 (Schannel)"

I'll try to figure out if there's a simple setting I can adjust. Or possibly I need to copy our crt file somewhere else.

@dblodgett-usgs are you able to get this to work on a gov computer? And if so, did you do anything special?