ropensci / rebird

Wrapper to the eBird API
https://docs.ropensci.org/rebird

ebird_GET() : Unknown species: Genus%20species = double url encoding ? #62

Closed Alanamosse closed 6 years ago

Alanamosse commented 6 years ago

I couldn't use ebird_GET() without getting a warning like: In ebird_GET(url, args, ...) : Unknown species: Anas%20platyrhynchos

It seems that the space character ' ' is encoded twice: ' ' -> '%20' -> '%2520'

The GET function ends up using a URL with 'Anas%2520platyrhynchos' instead of 'Anas%20platyrhynchos'. See the following example with verbose output enabled:

```
ebirdgeo('Anas platyrhynchos', 39, -121, max=5, config=verbose())
-> GET /ws1.1/data/obs/geo_spp/recent?fmt=json&sci=Anas%20platyrhynchos&lat=39&lng=-121&maxResults=5 HTTP/1.1
-> Host: ebird.org
-> User-Agent: libcurl/7.59.0 r-curl/3.2 httr/1.3.1
-> Accept-Encoding: gzip, deflate
-> Accept: application/json, text/xml, application/xml, */*
->
<- HTTP/1.1 302 Found
<- Date: Tue, 05 Jun 2018 13:39:33 GMT
<- Server: Apache
<- Location: https://ebird.org/ws1.1/data/obs/geo_spp/recent?fmt=json&sci=Anas%2520platyrhynchos&lat=39&lng=-121&maxResults=5
<- Content-Length: 312
<- Content-Type: text/html; charset=iso-8859-1
<- X-OSSProxy: OSSProxy 1.3.337.412 (Build 337.412 Win32 en-us)(Jan 11 2018 14:07:40)
<- Connection: keep-alive
<-
-> GET /ws1.1/data/obs/geo_spp/recent?fmt=json&sci=Anas%2520platyrhynchos&lat=39&lng=-121&maxResults=5 HTTP/1.1
-> Host: ebird.org
-> User-Agent: libcurl/7.59.0 r-curl/3.2 httr/1.3.1
-> Accept-Encoding: gzip, deflate
-> Accept: application/json, text/xml, application/xml, */*
->
<- HTTP/1.1 400 400
<- Date: Tue, 05 Jun 2018 13:39:33 GMT
<- Server: Apache
<- Content-Encoding: gzip
<- Content-Type: application/json;charset=utf-8
<- Content-Length: 96
<- Connection: close
<-
[1] NA
Warning message:
In ebird_GET(url, args, ...) : Unknown species: Anas%20platyrhynchos
```

I tried it with v0.4.0 in RStudio (Windows) and with R scripts on Linux; the same issue occurs when using the spocc package. Do you know whether I'm using the function incorrectly or whether it's a bug in the request? Thanks for your help.

sebpardo commented 6 years ago

Hi @Alanamosse, sorry for the late reply, I've been away at a conference.

The issue seems to be with a redirect from the eBird API (<- HTTP/1.1 302 Found), which is where the second encoding of the URL is happening. It seems this occurs because the base URL that rebird v0.4.0 uses is not secure (http), so the server forces a switch to https and adds an extra encoding step in the process.
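To make the double encoding concrete, here is a minimal base-R sketch that reproduces the string transformation offline. It only mimics what the redirect appears to do; the variable names are illustrative, and the assumption that the server decodes the query just once is inferred from the warning text, not from eBird documentation.

```r
# Base-R sketch of the double encoding seen in the redirect (no network needed).
species <- "Anas platyrhynchos"

# First pass: the space becomes %20, as in the original request.
once <- utils::URLencode(species)

# Second pass: repeated = TRUE re-encodes the "%" itself as %25,
# which mimics the URL found in the Location header of the 302 response.
twice <- utils::URLencode(once, repeated = TRUE)

# Assumption: the server decodes only once, so it sees a literal "%20"
# inside the species name and cannot match it.
seen_by_server <- utils::URLdecode(twice)

print(c(once = once, twice = twice, seen_by_server = seen_by_server))
```

The last value matches the name reported in the warning, which is consistent with the redirect being where the extra encoding is added.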

The development version of rebird automatically uses https, so it should work well if you install from GitHub: devtools::install_github("ropensci/rebird")

Please let me know if this fixes the issue for you.

sebpardo commented 6 years ago

You can see the issue if you manually construct the GET request:


library(httr)

urlhttps <- paste0('https://ebird.org/ws1.1/', 'data/obs/', 'geo_spp/recent')
urlhttp <- paste0('http://ebird.org/ws1.1/', 'data/obs/', 'geo_spp/recent')

ebird_compact <- function(x) Filter(Negate(is.null), x)

args <- ebird_compact(list(fmt='json', sci='Anas platyrhynchos',
                           lat=round(39,2), lng=round(-121,2),
                           maxResults=5,
                           locale=NULL))

GET(urlhttps, query = args, config = verbose())
# -> GET /ws1.1/data/obs/geo_spp/recent?fmt=json&sci=Anas%20platyrhynchos&lat=39&lng=-121&maxResults=5 HTTP/1.1
# -> Host: ebird.org
# -> User-Agent: libcurl/7.58.0 r-curl/3.2 httr/1.3.1
# -> Accept-Encoding: gzip, deflate
# -> Accept: application/json, text/xml, application/xml, */*
#   -> 
#   <- HTTP/1.1 200 200
# <- Date: Tue, 19 Jun 2018 17:37:38 GMT
# <- Server: Apache
# <- Content-Encoding: gzip
# <- Content-Type: application/json;charset=utf-8
# <- Content-Length: 391
# <- 
#   Response [https://ebird.org/ws1.1/data/obs/geo_spp/recent?fmt=json&sci=Anas%20platyrhynchos&lat=39&lng=-121&maxResults=5]
# Date: 2018-06-19 17:37
# Status: 200
# Content-Type: application/json;charset=utf-8
# Size: 1.16 kB

GET(urlhttp, query = args, config = verbose())
# -> GET /ws1.1/data/obs/geo_spp/recent?fmt=json&sci=Anas%20platyrhynchos&lat=39&lng=-121&maxResults=5 HTTP/1.1
# -> Host: ebird.org
# -> User-Agent: libcurl/7.58.0 r-curl/3.2 httr/1.3.1
# -> Accept-Encoding: gzip, deflate
# -> Accept: application/json, text/xml, application/xml, */*
#   -> 
#   <- HTTP/1.1 302 Found
# <- Date: Tue, 19 Jun 2018 17:38:37 GMT
# <- Server: Apache
# <- Location: https://ebird.org/ws1.1/data/obs/geo_spp/recent?fmt=json&sci=Anas%2520platyrhynchos&lat=39&lng=-121&maxResults=5
# <- Content-Length: 312
# <- Content-Type: text/html; charset=iso-8859-1
# <- 
#   -> GET /ws1.1/data/obs/geo_spp/recent?fmt=json&sci=Anas%2520platyrhynchos&lat=39&lng=-121&maxResults=5 HTTP/1.1
# -> Host: ebird.org
# -> User-Agent: libcurl/7.58.0 r-curl/3.2 httr/1.3.1
# -> Accept-Encoding: gzip, deflate
# -> Accept: application/json, text/xml, application/xml, */*
#   -> 
#   <- HTTP/1.1 400 400
# <- Date: Tue, 19 Jun 2018 17:38:38 GMT
# <- Server: Apache
# <- Content-Encoding: gzip
# <- Content-Type: application/json;charset=utf-8
# <- Content-Length: 96
# <- Connection: close
# <- 
#   Response [https://ebird.org/ws1.1/data/obs/geo_spp/recent?fmt=json&sci=Anas%2520platyrhynchos&lat=39&lng=-121&maxResults=5]
# Date: 2018-06-19 17:38
# Status: 400
# Content-Type: application/json;charset=utf-8
# Size: 95 B
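
If you want to check for this kind of redirect without reading the whole trace, something like the following might help. It is only a sketch: is_double_encoded() is a hypothetical helper (not part of rebird or httr) that compares the requested URL with the Location header, both copied from the trace above.

```r
# Hypothetical helper (not part of rebird or httr): TRUE when `location`
# looks like `requested` with its percent signs encoded a second time.
is_double_encoded <- function(requested, location) {
  # Strip the scheme so an http -> https upgrade alone does not count.
  strip_scheme <- function(u) sub("^https?://", "", u)
  req <- strip_scheme(requested)
  loc <- strip_scheme(location)
  # Re-encoding turns every "%" into "%25"; undo exactly that one step.
  loc != req && gsub("%25", "%", loc, fixed = TRUE) == req
}

# URLs copied from the http trace: the request and the 302 Location header.
requested <- "http://ebird.org/ws1.1/data/obs/geo_spp/recent?fmt=json&sci=Anas%20platyrhynchos&lat=39&lng=-121&maxResults=5"
location  <- "https://ebird.org/ws1.1/data/obs/geo_spp/recent?fmt=json&sci=Anas%2520platyrhynchos&lat=39&lng=-121&maxResults=5"

double_encoded <- is_double_encoded(requested, location)
print(double_encoded)
```

A redirect that only upgrades the scheme and leaves the query untouched would return FALSE here, so the check singles out the re-encoding rather than the switch to https itself.
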

Alanamosse commented 6 years ago

Hi @sebpardo, my turn to apologize for a late reply...

Indeed, your solution worked. Thanks for the explanation!