Closed achafetz closed 11 months ago
I tested off AIDNET (on the guest network) and still got the same "Page Timed Out" error.
Looking into this. I think there have been some changes in the authentication flow.
Ok, found a solution:
Note: this means all pano_* functions will need to use a pregenerated token (instead of the regular user/pass). We also need to account for token expiration.
I tested this out this AM.
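One way to account for token expiration is to keep the expiry time alongside the token and check it before each pano_* call. This is only a rough sketch: `new_pano_token()` and `pano_token_expired()` are hypothetical helpers (not part of grabr), and the token lifetime is a parameter because I don't know how long Panorama tokens are actually valid.

```r
# Hypothetical helpers, not part of grabr.
# Wrap a pregenerated token together with the time it stops being valid.
new_pano_token <- function(token, valid_for_secs, issued_at = Sys.time()) {
  # valid_for_secs is an assumption: the real token lifetime is not known here
  list(token = token, expires_at = issued_at + valid_for_secs)
}

# TRUE once the current time has reached the stored expiry time.
pano_token_expired <- function(tok, now = Sys.time()) {
  now >= tok$expires_at
}

# Example: a token assumed to be valid for one hour
tok <- new_pano_token("my-pregenerated-token", valid_for_secs = 3600)
if (pano_token_expired(tok)) {
  stop("Panorama token has expired; please generate a new one.", call. = FALSE)
}
```

A pano_* function could run this check first and ask the user for a fresh token instead of failing later with a less obvious error.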
#install dev branch
remotes::install_github("USAID-OHA-SI/grabr", ref = "develop")
#load
library(grabr)
url <- "https://pepfar-panorama.org/forms/downloads/"
#test 1 - can successfully create a session? --> SUCCESS
sess <- pano_session()
#test 2 - can successfully see items? --> FAIL
pano_items(url, sess)
#test 3 - can successfully extract most recent period folder? --> FAIL
url %>%
  pano_content(session = sess) %>%
  pano_elements() %>%
  dplyr::filter(stringr::str_detect(item, "^MER")) %>%
  dplyr::pull(item)
#reinstall prod version
rstudioapi::restartSession()
pak::pak("USAID-OHA-SI/grabr")
So while the session is being created, it seems that either the credentials are not being passed successfully or something else is off. This is the error I get for tests 2 and 3.
Error in httr::content("text") : is.response(x) is not TRUE
The issue appears to arise here, where the session info is passed in via cookies: https://github.com/USAID-OHA-SI/grabr/blob/aa53a63297779013d0a6a58bc9aa457247150bcc/R/extract_pano.R#L84-L86
I think the issue is within pano_content().
One of the validations in the if statement is missing x = page:
pano_content <- function(page_url, session) {
  # request the page, passing the session info as a cookie
  page <- httr::GET(page_url, httr::set_cookies("formsSessionState" = session))
  # pass the response explicitly (x = page); use && since both checks are scalar
  if (!base::is.null(page) && !base::is.null(httr::content(x = page, "text"))) {
    page <- page %>%
      httr::content("text") %>%
      rvest::read_html()
  } else {
    base::stop("ERROR - Unable to extract page content")
  }
  return(page)
}
In coRps today, @jess-stephens was running pano_session and getting an error.
Created on 2023-11-29 with reprex v2.0.2
@karishmas26 unpacked this by running through the steps and found the issue was arising from the session info. Note: this is not a credentials issue.
https://github.com/USAID-OHA-SI/grabr/blob/ee3d342072c5ff30cd48c8cfeac5835860ed7a2f/R/extract_pano.R#L36-L43
When running ln36-7, the login_sess value ends up being "Page Timed Out", which is not what the script is then expecting in the if statement in ln39. As a result, you get the error message above.
The larger issue is the page timing out. I was able to run this function on Monday (on AIDNET) to download the new MSDs, but it is resulting in this error for all of us (JS, KS, and AC) today.
The secondary issue is that we need to return a better error message to the user, one that surfaces the error from the site rather than the error in the script.
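For the error message, something along these lines might work. It is a minimal sketch, not the actual grabr code: `check_pano_session()` is a hypothetical helper, and it assumes a failed login leaves login_sess as NULL or as a short message from the site such as "Page Timed Out".

```r
# Hypothetical helper, not part of grabr: validate the value returned by the
# login step before it is used as the session cookie.
check_pano_session <- function(login_sess) {
  # Assumption: a usable session value is a single character string
  if (is.null(login_sess) || !is.character(login_sess) || length(login_sess) != 1) {
    stop("Unable to create a Panorama session: no session value returned.",
         call. = FALSE)
  }
  # Assumption: a timeout shows up as text like "Page Timed Out"
  if (grepl("timed out", login_sess, ignore.case = TRUE)) {
    stop("Panorama returned: \"", login_sess, "\". ",
         "This is an error from the site, not a credentials issue.",
         call. = FALSE)
  }
  # Hand the value back unchanged so the caller can keep using it
  invisible(login_sess)
}
```

Calling this right after login_sess is created would show the user the message from the site instead of the downstream script error.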