mikejohnson51 / nwmTools

Set of tools to work with National Water Model output
https://mikejohnson51.github.io/nwmTools/
Creative Commons Zero v1.0 Universal

curl error with readNWMdata #7

Open psavoy-usgs opened 1 year ago

psavoy-usgs commented 1 year ago

I've previously used the package to download >10,000 reaches of data without issue. However, now readNWMdata gives me the following error.

readNWMdata(comid = 17595383)

Note:Caching=1
Error:curl error: SSL peer certificate or SSH remote key was not OK
curl error details:
Warning:oc_open: Could not read url
Error in open.nc(call.meta$url[1]) : NetCDF: I/O failure

I suspected it was perhaps an issue with my R version so just updated R, Rstudio, and all packages but the issue persists. The only other thing I could think of is that the thredds url has changed again.

System details:
R version: 4.2.3
Rtools version: 4.2
curl version: 5.0.0

mikejohnson51 commented 1 year ago

Hi @psavoy-usgs,

I am not seeing this here:

library(nwmTools)

xx = readNWMdata(comid = 17595383)

plot(xx$dateTime, xx$flow_cms_v2.1, type = "l")

Created on 2023-07-12 by the reprex package (v2.0.1)

Are you still getting this error?

Thanks!

psavoy-usgs commented 1 year ago

@mikejohnson51 Yes, I am still having this issue. I have since updated R, Rtools, and RStudio again, but the issue remains. I am honestly not sure what is causing it, unless there is a versioning issue with some dependencies. If it would be helpful, I could try to replicate the error on my personal computer to isolate what is causing the issue on my work computer. Here is my output from sessionInfo(); to clarify, I am now also running Rtools 4.3.

R version 4.3.0 (2023-04-21 ucrt)
Platform: x86_64-w64-mingw32/x64 (64-bit)
Running under: Windows 10 x64 (build 19045)

Matrix products: default

locale:
[1] LC_COLLATE=English_United States.utf8 LC_CTYPE=English_United States.utf8
[3] LC_MONETARY=English_United States.utf8 LC_NUMERIC=C
[5] LC_TIME=English_United States.utf8

time zone: America/New_York
tzcode source: internal

attached base packages:
[1] stats graphics grDevices utils datasets methods base

other attached packages:
[1] nwmTools_0.0.4

loaded via a namespace (and not attached):
[1] utf8_1.2.3 generics_0.1.3 tidyr_1.3.0 class_7.3-22 xml2_1.3.5
[6] KernSmooth_2.23-22 magrittr_2.0.3 grid_4.3.0 timechange_0.2.0 rprojroot_2.0.3
[11] jsonlite_1.8.7 dataRetrieval_2.7.12 processx_3.8.2 zip_2.3.0 pkgbuild_1.4.2
[16] e1071_1.7-13 DBI_1.1.3 ps_1.7.5 httr_1.4.6 rvest_1.0.3
[21] purrr_1.0.1 fansi_1.0.4 pbapply_1.7-2 codetools_0.2-19 cli_3.6.1
[26] RNetCDF_2.6-2 rlang_1.1.1 crayon_1.5.2 units_0.8-2 remotes_2.4.2
[31] RANN_2.6.1 tools_4.3.0 fst_0.9.8 parallel_4.3.0 fstcore_0.9.14
[36] dplyr_1.1.2 curl_5.0.1 vctrs_0.6.3 nhdplusTools_0.6.2 R6_2.5.1
[41] proxy_0.4-27 lifecycle_1.0.3 lubridate_1.9.2 classInt_0.4-9 pkgconfig_2.0.3
[46] desc_1.4.2 callr_3.7.3 terra_1.7-39 pillar_1.9.0 glue_1.6.2
[51] Rcpp_1.0.11 sf_1.0-14 tibble_3.2.1 tidyselect_1.2.0 rstudioapi_0.15.0
[56] compiler_4.3.0 prettyunits_1.1.1

psavoy-usgs commented 1 year ago

I have asked several colleagues to run the code you provided, and I am fairly certain that the issue originated once I switched to R version 4.2. I am not sure of the root cause, but everyone on prior installations was able to run the code, and everyone on 4.2 or later encountered the same error I did.

mikejohnson51 commented 1 year ago

Interesting! Are they all on Windows systems? I am on 4.2.1 with a Mac and things are working.

psavoy-usgs commented 1 year ago

So I think that may be the issue; I have encountered other problems involving curl and the OS. We were all on Windows machines, and several of us have run into this kind of issue before, where it could not be reproduced on Mac/Linux because of some interaction with how each system uses curl. I can do some more digging to see if I can find similar examples.

psavoy-usgs commented 1 year ago

If it is useful, I just checked my machine from the command line and have curl 8.0.1 and libcurl 8.0.1. I think there are cases showing that curl behaves differently on Windows and macOS, but I am also trying to rule out more obvious things like different system versions of curl.
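
One caveat when comparing versions: the curl found on the command line is not necessarily the libcurl that R uses. Below is a minimal sketch for listing the libcurl builds visible from inside R. The helper name report_libcurl is made up for illustration, and the second part assumes the curl package is installed. RNetCDF may link yet another copy of libcurl through the netCDF library, which this sketch does not show.

```r
# Hypothetical helper: report the libcurl builds that R itself can see.
report_libcurl <- function() {
  out <- list(
    # libcurl that base R was built against (used by download.file / url)
    r_libcurl = as.character(libcurlVersion())
  )
  if (requireNamespace("curl", quietly = TRUE)) {
    v <- curl::curl_version()
    out$pkg_libcurl <- v$version      # libcurl used by the curl package
    out$pkg_ssl     <- v$ssl_version  # TLS backend(s) it was built with
  }
  out
}

str(report_libcurl())
```

If the versions or TLS backends differ between machines that work and machines that fail, that would be a useful detail to add to this thread.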

mikejohnson51 commented 1 year ago

@program--, do you have any thoughts on this Windows/R Version/curl issue?

program-- commented 1 year ago

@psavoy-usgs Are you by chance using a proxy or VPN? The error:

Error:curl error: SSL peer certificate or SSH remote key was not OK

This error can happen if a proxy/VPN is handling SSL/TLS termination. One thing you could try is setting the environment variable CURLOPT_SSL_VERIFYPEER to 0 and then rerunning the code to see if it works then (or at least gives a different error).

In R you can do that like this:

Sys.setenv(CURLOPT_SSL_VERIFYPEER = 0)

Warning: This is not a permanent solution, and it's not advised to use this in any production system due to security issues.

Additionally, R 4.2.2 introduced a bug fix for curl revocation checks that might give some info:

On Windows, environment variable R_LIBCURL_SSL_REVOKE_BEST_EFFORT can be used to switch to only ‘best-effort’ SSL certificate revocation checks with the default "libcurl" download method. This reduces security, but may be needed for downloads to work with MITM proxies (PR#18379)

(from R release notes)
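
For reference, a minimal sketch of turning that setting on for the current R session only. It assumes Windows, R 4.2.2 or later, and the value TRUE as described in the download.file help page. Whether it has any effect on the connection that RNetCDF opens (as opposed to R's own "libcurl" download method) is an open question, and the same security caveat applies.

```r
# Assumption: Windows, R >= 4.2.2, default "libcurl" download method.
# This relaxes certificate revocation checks, so it reduces security.
Sys.setenv(R_LIBCURL_SSL_REVOKE_BEST_EFFORT = "TRUE")

# Confirm the variable is visible to the current R session.
Sys.getenv("R_LIBCURL_SSL_REVOKE_BEST_EFFORT")

# To undo the change in the same session:
# Sys.unsetenv("R_LIBCURL_SSL_REVOKE_BEST_EFFORT")
```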


If that doesn't give an indication to the issue, then could you try enabling curl verbosity with:

Sys.setenv(CURLOPT_VERBOSE = 1)

and appending the output of the code after enabling that to this thread?

psavoy-usgs commented 1 year ago

@program-- Thanks for the useful information. Since I am working from a government computer, I do not want to change anything that results in less secure connections and draw the ire of the IT department. Your point about the 4.2.2 bugfix fits the timing of when this issue arose for me, and it is consistent with which colleagues were or were not able to run the code. I tried running things again with the verbose settings and this is what I have:

I consistently pull a lot of data so I am not sure specifically what the culprit is for this issue with this package. I tried this both on and off a VPN and get the same error regardless.

program-- commented 1 year ago

The verbose message:

SSL certificate problem: self signed certificate in certificate chain

implies (but doesn't necessarily confirm) that something is responding with a certificate that shadows the SSL cert of cida.usgs.gov (a MITM proxy would typically do this).

My best guess is that this is something you'd need to ask your IT department about: if your GFE is configured with a proxy, then IT should have ensured that the certificates it responds with are trusted on the client.

One more test you could do, if you're able to, is try the same code on a non-government computer, since if that works then the issue isn't GFE-specific. Alternatively, reverting to R 4.2.1 might work?


From an IT perspective as well: I think that enabling R_LIBCURL_SSL_REVOKE_BEST_EFFORT should be safe, assuming the returned certificate is in fact from a trusted proxy and not a malicious one. The biggest security concern is when the certificate is being bypassed to access sensitive information. Though, take this with a grain of salt since I don't know how your GFE is governed on your IT dept's side.


EDIT: I do agree that it is weird though that it seems to manifest primarily when using this package. If you try to do a GET request on a different thredds server being hosted somewhere else, I wonder if it would give the same issue.
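
A rough sketch of that comparison, assuming the curl package is installed. The helper name try_get is made up for illustration, and the second URL in the usage comments is a placeholder, not a real server. Note that the curl package may use a different TLS backend or certificate store than the netCDF library behind RNetCDF, so a success here would not prove that readNWMdata would succeed.

```r
# Hypothetical helper: attempt a GET and report whether the request got through.
# Assumes the curl package is installed.
try_get <- function(url, verbose = FALSE) {
  handle <- curl::new_handle(verbose = verbose)
  tryCatch({
    res <- curl::curl_fetch_memory(url, handle = handle)
    list(url = url, ok = TRUE, message = paste("HTTP status", res$status_code))
  }, error = function(e) {
    # Connection and certificate failures surface here as R errors from libcurl.
    list(url = url, ok = FALSE, message = conditionMessage(e))
  })
}

# Example usage (uncomment to run; the second URL is a placeholder):
# try_get("https://cida.usgs.gov/", verbose = TRUE)
# try_get("https://thredds.example.org/thredds/catalog.html", verbose = TRUE)
```

If both hosts fail with the same certificate message, that points toward the machine or network rather than this package or the cida.usgs.gov server.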

csimeone-usgs commented 5 months ago

@psavoy-usgs Did you ever get this issue solved? I'm running into the same issue running this from a USGS machine.