CSTARS / ecosis

Ecological Spectral Information System (EcoSIS)

EcoSIS.org SSL error #50

Closed. serbinsh closed this issue 4 years ago

serbinsh commented 4 years ago

It looks like ecosis.org may have an out-of-date SSL certificate. I am seeing this for both the main site and the data site. It also affects API access from R, etc.

[Screenshot: Screen Shot 2020-08-26 at 1.09.03 PM]

serbinsh commented 4 years ago

https://github.com/TESTgroup-BNL/How_to_PLSR/issues/18

jrmerz commented 4 years ago

Oops. Sorry about that. They have been updated.

serbinsh commented 2 years ago

@jrmerz is this an issue again?

> dat_raw <- spectratrait::get_ecosis_data(ecosis_id = ecosis_id)
[1] "**** Downloading Ecosis data ****"
Downloading data...
Error in open.connection(con, "rb") : 
  SSL certificate problem: certificate has expired

jrmerz commented 2 years ago

Sorry again, my alerts about renewing are clearly failing. Should be good to go.
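
Since the renewal alerts have failed more than once, a scheduled check that runs independently of the renewal system is one way to catch an expiring certificate earlier. The following is a minimal sketch, assuming the `openssl` command-line tool is installed. The function names `cert_valid_for` and `host_cert_valid_for` are hypothetical and are not part of EcoSIS.

```shell
# cert_valid_for FILE DAYS
# Succeeds (exit 0) if the PEM certificate in FILE is still valid DAYS days
# from now. Fails if it expires sooner or if FILE cannot be read.
cert_valid_for() {
  openssl x509 -in "$1" -noout -checkend "$(( $2 * 86400 ))" >/dev/null
}

# host_cert_valid_for HOST DAYS
# Fetches the leaf certificate that HOST serves on port 443 and applies the
# same check. A failed fetch also counts as a failure.
host_cert_valid_for() {
  tmp=$(mktemp) || return 2
  openssl s_client -connect "$1:443" -servername "$1" </dev/null 2>/dev/null \
    | openssl x509 -outform PEM >"$tmp" 2>/dev/null
  cert_valid_for "$tmp" "$2"
  status=$?
  rm -f "$tmp"
  return "$status"
}
```

From cron this could be called as `host_cert_valid_for ecosis.org 30 || echo "ecosis.org certificate expires within 30 days or could not be fetched"`, where the 30-day threshold is an arbitrary example. The check only covers the expiry date of the leaf certificate. It does not tell you whether clients with an old CA bundle will accept the chain.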

serbinsh commented 2 years ago

No problem! But it seems I am still unable to use the API:

> test_check("spectratrait")
[1] "**** Downloading Ecosis data ****"
══ Failed tests ════════════════════════════════════════════════════════════════
── Error (test.get_ecosis_data.R:5:3): Downloading data from EcoSIS doesnt throw an error ──
Error in `open.connection(con, "rb")`: SSL certificate problem: certificate has expired
Backtrace:
    █
 1. └─spectratrait::get_ecosis_data(ecosis_id = ecosis_id) test.get_ecosis_data.R:5:2
 2.   └─readr::read_csv(ecosis_file)
 3.     └─readr:::read_delimited(...)
 4.       └─readr:::datasource_connection(file, skip, skip_empty_rows, comment)
 5.         └─readr:::read_connection(path)
 6.           ├─base::open(con, "rb")
 7.           └─base::open.connection(con, "rb")

[ FAIL 1 | WARN 0 | SKIP 0 | PASS 2 ]
Error: Test failures
Execution halted

Perhaps it will take some time to refresh?

serbinsh commented 2 years ago

In R I am basically trying to run this:

get_ecosis_data <- function(ecosis_id = NULL) {
  if(!is.null(ecosis_id)) {
    print("**** Downloading Ecosis data ****")
    ecosis_id <- ecosis_id
    ecosis_file <- sprintf(
      "https://ecosis.org/api/package/%s/export?metadata=true",
      ecosis_id)
    message("Downloading data...")
    dat_raw <- readr::read_csv(ecosis_file)
    message("Download complete!")
    return(dat_raw)
  } else {
    stop("**** No EcoSIS ID provided.  Please provide a valid ID before proceeding ****")
  }
}

ecosis_id <- "960dbb0c-144e-4563-8117-9e23d14f4aa9"

dat_raw <- spectratrait::get_ecosis_data(ecosis_id = ecosis_id)
head(dat_raw)
names(dat_raw)[1:40]

But I am still getting:

> dat_raw <- spectratrait::get_ecosis_data(ecosis_id = ecosis_id)
[1] "**** Downloading Ecosis data ****"
Downloading data...
Error in open.connection(con, "rb") : 
  SSL certificate problem: certificate has expired

jrmerz commented 2 years ago

Humm, cert looks good on my end. Additionally I have checked with some 3rd party tools. Is there a caching layer somewhere? R or otherwise?
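
A certificate can look valid from one machine and still be rejected on another, because each client verifies the whole chain against its own CA bundle. One way to see what the server actually sends is to list the subject and expiry date of every certificate in the chain. This is a sketch, assuming the `openssl` command-line tool is available. `print_chain_dates` is a hypothetical helper name.

```shell
# print_chain_dates
# Reads PEM text on stdin and prints the subject and notAfter date of every
# certificate block found in it. Text outside the blocks is ignored.
print_chain_dates() {
  block=
  in_cert=0
  while IFS= read -r line; do
    case $line in
      "-----BEGIN CERTIFICATE-----")
        in_cert=1
        block=$line
        ;;
      "-----END CERTIFICATE-----")
        if [ "$in_cert" -eq 1 ]; then
          # Hand the completed block to openssl for decoding.
          printf '%s\n%s\n' "$block" "$line" \
            | openssl x509 -noout -subject -enddate
          in_cert=0
        fi
        ;;
      *)
        if [ "$in_cert" -eq 1 ]; then
          block="$block
$line"
        fi
        ;;
    esac
  done
  return 0
}
```

To inspect the chain that ecosis.org serves, the helper could be fed from `s_client`: `openssl s_client -connect ecosis.org:443 -servername ecosis.org -showcerts </dev/null 2>/dev/null | print_chain_dates`. If every certificate listed is still in date, the remaining suspect is the CA bundle on the client.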

serbinsh commented 2 years ago

Yes, strange. I can open this URL:

https://ecosis.org/api/package/960dbb0c-144e-4563-8117-9e23d14f4aa9/export?metadata=true

and it will download the CSV file. The function I wrote for pulling from EcoSIS uses a URL like that to get the data. So now I need to figure out why I am getting an SSL error when trying to run

get_ecosis_data <- function(ecosis_id = NULL) {
  if(!is.null(ecosis_id)) {
    print("**** Downloading Ecosis data ****")
    ecosis_id <- ecosis_id
    ecosis_file <- sprintf(
      "https://ecosis.org/api/package/%s/export?metadata=true",
      ecosis_id)
    message("Downloading data...")
    dat_raw <- readr::read_csv(ecosis_file)
    message("Download complete!")
    return(dat_raw)
  } else {
    stop("**** No EcoSIS ID provided.  Please provide a valid ID before proceeding ****")
  }
}

basically the

dat_raw <- readr::read_csv(ecosis_file)

part

This is new behavior

> dat_raw <- readr::read_csv("https://ecosis.org/api/package/960dbb0c-144e-4563-8117-9e23d14f4aa9/export?metadata=true")
Error in open.connection(con, "rb") : 
  SSL certificate problem: certificate has expired

serbinsh commented 2 years ago

Wow... sigh. It looks like this could really be related to using an older macOS: the upstream certs changed and are no longer compatible with my computer. I can test this on my laptop to confirm.

https://stackoverflow.com/questions/69441209/r-webscraping-ssl-certificate-problem-certificate-has-expired-but-works-in-br

https://stackoverflow.com/questions/62139904/api-request-and-error-in-curlcurl-fetch-memoryurl-handle-handle-ssl-cer

serbinsh commented 2 years ago

> httr::GET("https://ecosis.org/api/package/960dbb0c-144e-4563-8117-9e23d14f4aa9/export?metadata=true")
Error in curl::curl_fetch_memory(url, handle = handle) : 
  SSL certificate problem: certificate has expired
> curl::curl_fetch_memory("https://ecosis.org/api/package/960dbb0c-144e-4563-8117-9e23d14f4aa9/export?metadata=true")
Error in curl::curl_fetch_memory("https://ecosis.org/api/package/960dbb0c-144e-4563-8117-9e23d14f4aa9/export?metadata=true") : 
  SSL certificate problem: certificate has expired

I think my PC may have expired certs. https://security.stackexchange.com/questions/232445/https-connection-to-specific-sites-fail-with-curl-on-macos

I will see if I can update them, and I will also test on my laptop. It seems strange that it works in the browser but not via curl, so perhaps I need to update curl as well.

Clearly it's finally time to sunset Mojave. It's just a pain to rebuild this computer! Sigh.

serbinsh commented 2 years ago

Also strange: at the command line (terminal) this works

curl --insecure https://ecosis.org/api/package/960dbb0c-144e-4563-8117-9e23d14f4aa9/export?metadata=true

Also

> library(openssl)
Linking to: OpenSSL 1.1.1h  22 Sep 2020
> cert <- download_ssl_cert("www.r-project.org")
> print(cert)
[[1]]
[x509 certificate] *.r-project.org
md5: 
sha1: 

[[2]]
[x509 certificate] Sectigo RSA Domain Validation Secure Server CA
md5: 
sha1: 

[[3]]
[x509 certificate] USERTrust RSA Certification Authority
md5: 
sha1: 

> print(as.list(cert[[1]]))
$subject
[1] "CN=*.r-project.org"

$issuer
[1] "CN=Sectigo RSA Domain Validation Secure Server CA,O=Sectigo Limited,L=Salford,ST=Greater Manchester,C=GB"

$algorithm
[1] "sha256WithRSAEncryption"

$signature

$validity
[1] "Aug  4 00:00:00 2020 GMT" "Nov  2 23:59:59 2022 GMT"

$self_signed
[1] FALSE

$alt_names
[1] "*.r-project.org" "r-project.org"  

$pubkey
[2048-bit rsa public key]
md5: 

> cert_verify(cert, ca_bundle())
[1] TRUE

serbinsh commented 2 years ago

Strange. I am not having this issue with other URLs. Looking at some of the examples here, I am not getting the same error:

https://cran.r-project.org/web/packages/curl/vignettes/intro.html
> library(curl)
Using libcurl 7.54.0 with LibreSSL/2.6.5
> 
> help(curl_fetch_echo)
> res <- curl_fetch_memory("http://httpbin.org/cookies/set?foo=123&bar=ftw")
> res$content
 [1] 7b 0a 20 20 22 63 6f 6f 6b 69 65 73 22 3a 20 7b 0a 20 20 20 20 22 62 61 72 22 3a 20 22 66 74 77 22 2c 20 0a 20 20 20 20 22 66 6f 6f 22 3a
[47] 20 22 31 32 33 22 0a 20 20 7d 0a 7d 0a
> res <- curl_fetch_stream("http://www.httpbin.org/drip?duration=3&numbytes=15&code=200", function(x){
+     cat(rawToChar(x))
+ })
***************
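
Note that both URLs in this test use plain `http://`, so no certificate verification takes place and they cannot reproduce the error. A like-for-like comparison needs `https://` URLs. The small sketch below separates the two cases for the URLs tried in this thread. `uses_tls` is a hypothetical helper, and it only looks at a lowercase scheme prefix.

```shell
# uses_tls URL
# Succeeds if URL would be fetched over TLS (https), fails otherwise.
uses_tls() {
  case $1 in
    https://*) return 0 ;;
    *)         return 1 ;;
  esac
}

# Classify the URLs tried in this thread.
for url in \
  "http://httpbin.org/cookies/set?foo=123&bar=ftw" \
  "http://www.httpbin.org/drip?duration=3&numbytes=15&code=200" \
  "https://ecosis.org/api/package/960dbb0c-144e-4563-8117-9e23d14f4aa9/export?metadata=true"
do
  if uses_tls "$url"; then
    echo "TLS (certificate is verified):  $url"
  else
    echo "plain HTTP (no certificate):    $url"
  fi
done
```

Only the ecosis.org URL is reported as TLS, which is consistent with it being the only one that fails.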

serbinsh commented 2 years ago

Update: if I don't use --insecure

wolfmanodesktop:~ sserbin$ curl https://ecosis.org/api/package/960dbb0c-144e-4563-8117-9e23d14f4aa9/export?metadata=true
curl: (60) SSL certificate problem: certificate has expired
More details here: https://curl.haxx.se/docs/sslcerts.html

curl performs SSL certificate verification by default, using a "bundle"
 of Certificate Authority (CA) public keys (CA certs). If the default
 bundle file isn't adequate, you can specify an alternate file
 using the --cacert option.
If this HTTPS server uses a certificate signed by a CA represented in
 the bundle, the certificate verification probably failed due to a
 problem with the certificate (it might be expired, or the name might
 not match the domain name in the URL).
If you'd like to turn off curl's verification of the certificate, use
 the -k (or --insecure) option.
HTTPS-proxy has similar options --proxy-cacert and --proxy-insecure.

serbinsh commented 2 years ago

Ok fixed!

I replaced cert.pem in /etc/ssl.

I renamed the old one and replaced it with the latest from here: https://curl.se/docs/caextract.html

After doing that, it seems to be working fine in curl and in R. My guess is that this is related to my older OS and that Apple has stopped providing SSL certificate updates for it, so I had to do it manually. But that is just a guess.

> ### Get source dataset from EcoSIS
> dat_raw <- spectratrait::get_ecosis_data(ecosis_id = ecosis_id)
[1] "**** Downloading Ecosis data ****"
Downloading data...

── Column specification ─────────────────────────────────────────────────────────────────────────────────────────────────
cols(
  .default = col_double(),
  BNL_Barcode = col_character(),
  `Common Name` = col_character(),
  `Foreoptic Specifications` = col_character(),
  `Instrument Model` = col_character(),
  `Latin Genus` = col_character(),
  `Latin Species` = col_character(),
  `Measurement Quantity` = col_character(),
  Overlap_Handling = col_character(),
  Overlap_Matching_Type = col_character(),
  Overlap_Removal = col_character(),
  `Processing Interpolated` = col_character(),
  `Processing Resampled` = col_character(),
  Reflectance_Type = col_character(),
  Site = col_character(),
  Spectra_Name = col_character(),
  Spectra_Type = col_character(),
  Spectra_Units = col_character(),
  Spectral_Resolution = col_character(),
  `USDA Symbol` = col_character(),
  White_Reference_Standard = col_character()
)
ℹ Use `spec()` for the full column specifications.

|=============================================================================================================| 100% 3 MB
Download complete!
> head(dat_raw)
# A tibble: 6 × 2,181
  BNL_Barcode CN_Ratio C_area_g_m2 Cmass_g_g `Common Name`  `Foreoptic Specifica… `Instrument Mod… LMA_g_m2 `Latin Genus`
  <chr>          <dbl>       <dbl>     <dbl> <chr>          <chr>                 <chr>               <dbl> <chr>        
1 BNL2181         19.3        48.1      47.8 Siberian alder Fiber_1_LC_RP_Pro     SVC_HR-1024i        101.  Alnus        
2 BNL2194         21.7        43.9      50.8 tealeaf willow Fiber_1_LC_RP_Pro     SVC_HR-1024i         86.4 Salix        
3 BNL2195         31.6        48.3      50.5 tealeaf willow Fiber_1_LC_RP_Pro     SVC_HR-1024i         95.7 Salix        
4 BNL2196         28.0        49.5      48.4 Siberian alder Fiber_1_LC_RP_Pro     SVC_HR-1024i        102.  Alnus        
5 BNL2197         25.1        44.6      49.0 tealeaf willow Fiber_1_LC_RP_Pro     SVC_HR-1024i         91.0 Salix        
6 BNL2198         17.5        51.8      47.3 tealeaf willow Fiber_1_LC_RP_Pro     SVC_HR-1024i        110.  Salix        
# … with 2,172 more variables: Latin Species <chr>, Latitude <dbl>, Longitude <dbl>, Measurement Quantity <chr>,
#   N_area_g_m2 <dbl>, Nmass_g_g <dbl>, Overlap_Handling <chr>, Overlap_Matching_Type <chr>, Overlap_Removal <chr>,
#   Processing Interpolated <chr>, Processing Resampled <chr>, Reflectance_Type <chr>, Sample Collection Date <dbl>,
#   Sample_ID <dbl>, Site <chr>, Spectra_Name <chr>, Spectra_Type <chr>, Spectra_Units <chr>, Spectral_Resolution <chr>,
#   USDA Symbol <chr>, White_Reference_Standard <chr>, 350 <dbl>, 351 <dbl>, 352 <dbl>, 353 <dbl>, 354 <dbl>, 355 <dbl>,
#   356 <dbl>, 357 <dbl>, 358 <dbl>, 359 <dbl>, 360 <dbl>, 361 <dbl>, 362 <dbl>, 363 <dbl>, 364 <dbl>, 365 <dbl>,
#   366 <dbl>, 367 <dbl>, 368 <dbl>, 369 <dbl>, 370 <dbl>, 371 <dbl>, 372 <dbl>, 373 <dbl>, 374 <dbl>, 375 <dbl>, …
> names(dat_raw)[1:40]
 [1] "BNL_Barcode"              "CN_Ratio"                 "C_area_g_m2"              "Cmass_g_g"               
 [5] "Common Name"              "Foreoptic Specifications" "Instrument Model"         "LMA_g_m2"                
 [9] "Latin Genus"              "Latin Species"            "Latitude"                 "Longitude"               
[13] "Measurement Quantity"     "N_area_g_m2"              "Nmass_g_g"                "Overlap_Handling"        
[17] "Overlap_Matching_Type"    "Overlap_Removal"          "Processing Interpolated"  "Processing Resampled"    
[21] "Reflectance_Type"         "Sample Collection Date"   "Sample_ID"                "Site"                    
[25] "Spectra_Name"             "Spectra_Type"             "Spectra_Units"            "Spectral_Resolution"     
[29] "USDA Symbol"              "White_Reference_Standard" "350"                      "351"                     
[33] "352"                      "353"                      "354"                      "355"                     
[37] "356"                      "357"                      "358"                      "359"                     
> #--------------------------------------------------------------------------------------------------#

jrmerz commented 2 years ago

Yes, the CA certs are normally part of your OS updates, which is one of the many reasons why it's good to keep it patched. I would add a word of caution about manually replacing root certs on your machine.
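
In line with that caution, a less invasive option is to leave the system file alone and pass a newer bundle to curl per command with `--cacert`, the option named in curl's own error message earlier in this thread. If the system file is replaced anyway, keeping a backup and sanity-checking the new file first limits the risk. This is a minimal sketch. The function name `install_ca_bundle` and the backup naming scheme are assumptions, not an established procedure.

```shell
# install_ca_bundle NEW TARGET
# Replace TARGET with NEW only if NEW looks like a PEM bundle, keeping a
# timestamped backup of the old TARGET next to it.
install_ca_bundle() {
  new=$1
  target=$2
  # Refuse files that contain no PEM certificate blocks at all.
  if ! grep -q -e '-----BEGIN CERTIFICATE-----' "$new"; then
    echo "install_ca_bundle: $new contains no PEM certificates" >&2
    return 1
  fi
  # Keep the previous bundle so the change can be rolled back.
  if [ -e "$target" ]; then
    cp -p "$target" "$target.bak.$(date +%Y%m%d%H%M%S)" || return 1
  fi
  cp "$new" "$target"
}
```

The per-command alternative would look like `curl --cacert "$HOME/cacert.pem" "https://ecosis.org/api/package/960dbb0c-144e-4563-8117-9e23d14f4aa9/export?metadata=true"`, where `$HOME/cacert.pem` is an example location for the downloaded bundle. The check in `install_ca_bundle` only confirms that the file has the right shape. It does not authenticate the bundle, and writing to /etc/ssl requires root.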

serbinsh commented 2 years ago

Understood. Really this means it's time to bite the bullet and wipe and reinstall on my Mac, since my OS version is no longer getting patched.