Thanks @boshek
So you want to get the full url with query parameters BEFORE the request is sent? What's the use case? Maybe we could add a fxn to Paginator to give back the full URLs?
The use case, to me, is developing a client to access an API (especially one that might be poorly documented): having a means to check the urls as you iteratively make calls, especially with pagination, is super useful while developing your R package. It more closely connects the code you are writing to the specs of the API.
> full url with query parameters BEFORE the request is sent
To me it doesn't matter if it is before or after, just that I can easily see what is happening.
> Maybe we could add a fxn to Paginator to give back the full URLs
To me the path of least resistance would be to have Paginator behave the same way as HttpClient. So to crib the example above, something as "simple" as this:
```r
cc$url
#> [1] "http://geo.weather.gc.ca/geomet-beta/features/collections/hydrometric-daily-mean/items/?startindex=500"
#> [2] "http://geo.weather.gc.ca/geomet-beta/features/collections/hydrometric-daily-mean/items/?startindex=1000"
#> [3] "http://geo.weather.gc.ca/geomet-beta/features/collections/hydrometric-daily-mean/items/?startindex=1500"
```
That would mimic the original behaviour of HttpClient, which would then make it seamless for a user. Not having experience programming in R6, I can't say for certain how challenging this is.
Thanks.
To be clear, the full URL doesn't come from HttpClient but, after making the request, from the HttpResponse object. So in that case the full url is only available after the request is made.
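For reference, a minimal sketch of what that looks like (httpbin.org is just a stand-in endpoint here): once the verb method has run, the fully constructed URL is available on the HttpResponse object.

```r
library(crul)

cli <- HttpClient$new(url = "https://httpbin.org")
# path and query are supplied at request time; after the request,
# the full URL (path + query string) is on the HttpResponse object
res <- cli$get(path = "get", query = list(startindex = 500))
res$url
#> [1] "https://httpbin.org/get?startindex=500"
```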
Do you know about verbose curl output? Is it too verbose? i.e., you just want the URL / want to be able to control the output? e.g. (just showing the headers):
```r
cc <- HttpClient$new('https://scottchamberlain.info')
cc$get(verbose = TRUE)
> GET / HTTP/1.1
Host: scottchamberlain.info
User-Agent: libcurl/7.54.0 r-curl/3.2 crul/0.6.0
Accept-Encoding: gzip, deflate
Accept: application/json, text/xml, application/xml, */*
< HTTP/1.1 200 OK
< Cache-Control: public, max-age=0, must-revalidate
< Content-Type: text/html; charset=UTF-8
< Date: Mon, 17 Sep 2018 17:09:31 GMT
< Etag: "229c5df55965674706e3ebfbaa3ae0c4-ssl-df"
< Strict-Transport-Security: max-age=31536000
< Content-Encoding: gzip
< Content-Length: 2460
< Age: 100652
< Connection: keep-alive
< Server: Netlify
< Vary: Accept-Encoding
< X-NF-Request-ID: 9749070f-201f-451e-8c75-f394c71a3ea4-12040280
<
* Connection #0 to host scottchamberlain.info left intact
```
So verbose output is definitely full of info, including what I want. That is pretty nice actually. It gets a little out of control with a paginated request in terms of the volume of output, but for this exact use case it is more than sufficient. We can close this unless you intend to implement something a little less verbose.
Thanks!
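As an aside on keeping that output manageable: the verbose flag can also be set once as a curl option on the client (via the `opts` argument to `HttpClient$new()`) rather than per call; a Paginator built on that client makes its requests through it, so the same verbose output should apply to each paginated call. A minimal sketch:

```r
library(crul)

# set curl options once on the client instead of per request;
# a Paginator wrapping this client should pick up the same options
cli <- HttpClient$new(
  url = "https://scottchamberlain.info",
  opts = list(verbose = TRUE)
)
res <- cli$get()
```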
I wanted to see if you were aware of curl options (in particular the verbose option) - BUT, it is very verbose, and is a lot more information than just the URL, so:
I'll try a function to get full URLs before the request is made - however, I just realized an issue with the full url: any additional url paths and the query params are passed in to the HTTP verb function calls (e.g., get), so we don't have all the information needed to construct URLs anyway before the HTTP verb function is called.
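To make that constraint concrete, a small sketch (again with httpbin.org as a stand-in): the client object only knows the base URL; the extra path and the query parameters only arrive as arguments to the verb method.

```r
library(crul)

cli <- HttpClient$new(url = "https://httpbin.org")

# the client alone only knows the base URL...
cli$url
#> [1] "https://httpbin.org"

# ...the path and query params are only supplied when the verb is called,
# which is why the full URL can't be assembled from the client beforehand
res <- cli$get(path = "get", query = list(foo = "bar"))
res$url
#> [1] "https://httpbin.org/get?foo=bar"
```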
Yeah, thank you for pointing out the curl options. That was a 💡 for me.
I think URLs after the call are still useful given it is likely to be challenging to get them before, though I guess if the HTTP verb call fails you won't know what was even tried.
@boshek can you reinstall? see https://github.com/ropensci/crul/blob/master/R/paginator.R#L116-L118
Yep this is exactly it. Works for me for both HttpClient and Paginator. Thanks @sckott
cool, glad it works. need to add some tests and such still to make sure it's working as expected
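For anyone landing on this later, here is a sketch of what checking the URLs up front could look like, assuming the method added in the linked paginator.R change is the `url_fetch()` that appears in later crul documentation; the geomet endpoint and pagination values below are taken from earlier in this thread and are assumptions, not the original code:

```r
library(crul)

cli <- HttpClient$new(url = "http://geo.weather.gc.ca")
cc <- Paginator$new(client = cli, limit_param = "limit",
                    offset_param = "startindex", limit = 1500, chunk = 500)

# construct the URLs that would be requested, without sending anything
cc$url_fetch(path = "geomet-beta/features/collections/hydrometric-daily-mean/items/")
```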
I am trying to see if I can extract the url(s) for the GET request from a paginated request. I think the reprex below illustrates the question:
Created on 2018-09-17 by the reprex package (v0.2.1)
So then I would like to use pagination:
But if I request the link I only get the base url:
The API itself does provide the urls, so this is an example of what I am after, but I would like to get these before they head off via the GET request:
Created on 2018-09-17 by the reprex package (v0.2.1)
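As a rough sketch of the pagination setup being described (the endpoint and the startindex/limit values are inferred from the URLs quoted earlier in the thread, and the Paginator argument names follow the crul docs, so treat this as an approximation rather than the original reprex code):

```r
library(crul)

cli <- HttpClient$new(url = "http://geo.weather.gc.ca")
cc <- Paginator$new(client = cli, limit_param = "limit",
                    offset_param = "startindex", limit = 1500, chunk = 500)

# before any request is made, only the base url is visible on the client;
# the per-page URLs (…?startindex=500, ?startindex=1000, …) are what the
# question is asking to see ahead of the GET requests
cli$url
#> [1] "http://geo.weather.gc.ca"
```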
So my question: is there any way in `crul` to get the urls for a GET request, much like an unpaginated request?

Session Info
```r
Session info ---------------------------------------------------------------
 setting  value
 version  R version 3.5.1 (2018-07-02)
 system   x86_64, mingw32
 ui       RStudio (1.2.992)
 language (EN)
 collate  English_Canada.1252
 tz       America/Los_Angeles
 date     2018-09-17

Packages -------------------------------------------------------------------
 package    * version    date       source
 assertthat   0.2.0      2017-04-11 CRAN (R 3.5.1)
 backports    1.1.2      2017-12-13 CRAN (R 3.5.0)
 base       * 3.5.1      2018-07-02 local
 base64enc    0.1-3      2015-07-28 CRAN (R 3.5.0)
 callr        3.0.0      2018-08-24 CRAN (R 3.5.1)
 clipr        0.4.1      2018-06-23 CRAN (R 3.5.1)
 compiler     3.5.1      2018-07-02 local
 crayon       1.3.4      2017-09-16 CRAN (R 3.5.1)
 crul       * 0.6.0      2018-07-10 CRAN (R 3.5.1)
 curl         3.2        2018-03-28 CRAN (R 3.5.1)
 datasets   * 3.5.1      2018-07-02 local
 devtools   * 1.13.6     2018-06-27 CRAN (R 3.5.1)
 digest       0.6.17     2018-09-12 CRAN (R 3.5.1)
 evaluate     0.11       2018-07-17 CRAN (R 3.5.1)
 fs           1.2.6      2018-08-23 CRAN (R 3.5.1)
 glue         1.3.0      2018-09-04 Github (tidyverse/glue@4e74901)
 graphics   * 3.5.1      2018-07-02 local
 grDevices  * 3.5.1      2018-07-02 local
 htmltools    0.3.6      2017-04-28 CRAN (R 3.5.1)
 httpcode     0.2.0      2016-11-14 CRAN (R 3.5.0)
 jsonlite     1.5        2017-06-01 CRAN (R 3.5.1)
 knitr        1.20       2018-02-20 CRAN (R 3.5.1)
 lobstr     * 0.0.0.9000 2018-07-20 Github (r-lib/lobstr@a80d8f8)
 magrittr     1.5        2014-11-22 CRAN (R 3.5.1)
 memoise      1.1.0      2017-04-21 CRAN (R 3.5.1)
 methods    * 3.5.1      2018-07-02 local
 processx     3.2.0      2018-08-16 CRAN (R 3.5.1)
 ps           1.1.0      2018-08-10 CRAN (R 3.5.1)
 R6           2.2.2      2017-06-17 CRAN (R 3.5.1)
 Rcpp         0.12.18    2018-07-23 CRAN (R 3.5.1)
 reprex       0.2.1      2018-09-16 CRAN (R 3.5.1)
 rlang        0.2.2      2018-08-16 CRAN (R 3.5.1)
 rmarkdown    1.10       2018-06-11 CRAN (R 3.5.1)
 rprojroot    1.3-2      2018-01-03 CRAN (R 3.5.1)
 rstudioapi   0.7        2017-09-07 CRAN (R 3.5.1)
 stats      * 3.5.1      2018-07-02 local
 stringi      1.2.4      2018-07-20 CRAN (R 3.5.1)
 stringr      1.3.1      2018-05-10 CRAN (R 3.5.1)
 testthat   * 2.0.0      2017-12-13 CRAN (R 3.5.1)
 tools        3.5.1      2018-07-02 local
 triebeard    0.3.0      2016-08-04 CRAN (R 3.5.1)
 urltools     1.7.1      2018-08-03 CRAN (R 3.5.1)
 usethis    * 1.4.0      2018-08-14 CRAN (R 3.5.1)
 utils      * 3.5.1      2018-07-02 local
 whisker      0.3-2      2013-04-28 CRAN (R 3.5.1)
 withr        2.1.2      2018-03-15 CRAN (R 3.5.1)
```