r-lib / gh

Minimalistic GitHub API client in R
https://gh.r-lib.org
Other
223 stars 52 forks source link

Hard to work around httr2 bug re: user agent #188

Closed MichaelChirico closed 10 months ago

MichaelChirico commented 10 months ago

As noted in https://github.com/r-lib/httr2/issues/416, system curl may not have R-friendly version numbering, which is implicitly assumed in httr2 v1.0.0 (current CRAN).

That's fixed in dev but not yet released. In the meantime, I'm still hitting that error transiently from a call to gh():

traceback()
12: stop(gettextf("invalid version specification %s", paste(sQuote(unique(x[!ok])), 
        collapse = ", ")), call. = FALSE, domain = NA)
11: .make_numeric_version(x, strict, .standard_regexps()$valid_numeric_version)
10: numeric_version(x)
9: FUN(X[[i]], ...)
8: lapply(list(...), as.numeric_version)
7: c.numeric_version(httr2 = utils::packageVersion("httr2"), `r-curl` = utils::packageVersion("curl"), 
       libcurl = curl::curl_version()$version)
6: c(httr2 = utils::packageVersion("httr2"), `r-curl` = utils::packageVersion("curl"), 
       libcurl = curl::curl_version()$version)
5: req_user_agent(req)
4: req_handle(req)
3: httr2::req_perform(req, path = x$dest)
2: gh_make_request(req)
1: gh(sprintf(issue_query_fmt, user, repo, page))

My debugging skills may be failing me, but AFAICT there's no easy way for me to work around this issue.

Would it be possible to expose a way to set req_user_agent manually in the call to gh(), probably through params=?

AIUI support is currently limited to setting these parameters manually:

https://github.com/r-lib/gh/blob/ee2dbed977d71f2a3bae6d118d6a1a597d9575d2/R/gh_request.R#L38-L43

MichaelChirico commented 10 months ago

For any future readers, here's a workaround (hack the {httr2} environment to overwrite the bad definition of req_user_agent with a good one):

library(httr2)
library(purrr)
library(rlang)

source("https://raw.githubusercontent.com/r-lib/httr2/f7c3fee04c120d0d84c6ad1fd169fa822c6dfdff/R/req.R")
source("https://raw.githubusercontent.com/r-lib/httr2/f7c3fee04c120d0d84c6ad1fd169fa822c6dfdff/R/req-options.R")
source("https://raw.githubusercontent.com/r-lib/httr2/f7c3fee04c120d0d84c6ad1fd169fa822c6dfdff/R/utils.R")

httr2 <- asNamespace("httr2")
unlockBinding("req_user_agent", httr2)
httr2$req_user_agent = req_user_agent
lockBinding("req_user_agent", httr2)
gaborcsardi commented 10 months ago

Can't you specify the User-Agent header with the .send_headers argument?

MichaelChirico commented 10 months ago

Thanks... serves me right for not understanding curl very well :)

https://github.com/r-lib/gh/blob/ee2dbed977d71f2a3bae6d118d6a1a597d9575d2/R/gh.R#L59-L62

This seems to imply gh() is meant to supply its own User-Agent, though, which doesn't seem to be happening ({httr2} is setting the default value which triggers my error), am I mis-reading?

MichaelChirico commented 10 months ago

Actually, I tried using .send_headers to no avail. It looks like {httr2} does things in the wrong order for this to work properly, IIUC.

I tried gh(..., .send_headers = c('User-Agent' = 'my agent')) and got the same error. Here's the issue:

https://github.com/r-lib/httr2/blob/f7c3fee04c120d0d84c6ad1fd169fa822c6dfdff/R/req-perform.R#L275-L280

I.e., {httr2} is running req_user_agent() before handle_setheaders(), because the request doesn't have the useragent option set.

MichaelChirico commented 10 months ago

Glancing around, maybe we need to run httr2::req_options(req, useragent = ) if User-Agent is found in .send_headers()?

https://github.com/r-lib/gh/blob/ee2dbed977d71f2a3bae6d118d6a1a597d9575d2/R/gh.R#L276-L284

Or maybe this is an oversight in {httr2}?

gaborcsardi commented 10 months ago

httr2 sets the user agent properly for me, so I don't really understand why .send_headers would not be working.

library(httr2)
req <- request("https://httpbin.org/headers")
req <- req_headers(req, "User-Agent" = "My user agent")

resp <- req_perform(req)
resp_body_json(resp)
#> $headers
#> $headers$Accept
#> [1] "*/*"
#> 
#> $headers$`Accept-Encoding`
#> [1] "deflate, gzip"
#> 
#> $headers$Host
#> [1] "httpbin.org"
#> 
#> $headers$`User-Agent`
#> [1] "My user agent"
#> 
#> $headers$`X-Amzn-Trace-Id`
#> [1] "Root=1-6583d85b-407ff485372b2a9445fa7b4f"

Created on 2023-12-21 with reprex v2.0.2

MichaelChirico commented 10 months ago

Try that with a trace() on httr2::req_user_agent(). My guess is that req_user_agent() is being run, the default value is returned, but then req_perform() reconciles the manual headers supplied via req_headers() with the default headers supplied by req_user_agent().

Whereas in my case, there's no chance for reconciliation because req_user_agent() fails by default.

Put another way, your example works because your curl system version looks like an R version. You could also try shimming out curl::curl_version() to return something like "test" and see if your example keeps working.

gaborcsardi commented 10 months ago

Whereas in my case, there's no chance for reconciliation because req_user_agent() fails by default.

OK, got it. But if this is already fixed in httr2, then AFAICT gh does not need any changes.

MichaelChirico commented 10 months ago

"need", no. But while {httr2} is not update on CRAN it still requires the ugly workaround in the first comment. But feel free to close if it's not worth accommodating.

gaborcsardi commented 10 months ago

True, but if we work around this in gh that also needs a cran update, no? So you are trading one update for another one, while introducing unnecessary complexity here.

MichaelChirico commented 10 months ago

Fair enough! I think I had in mind future-proofing {gh} against any sort of failure like this going forward. But probably just feature creep for what is likely a really isolated/obscure issue.

Thanks for the consideration.