ropensci / ruODK

ruODK: An R Client for the ODK Central API
https://docs.ropensci.org/ruODK/
GNU General Public License v3.0

Implicit coercion from integer to character is being deprecated in purrr/tidyverse #144

Closed ikardail closed 1 year ago

ikardail commented 1 year ago

Problem

ruODK function(s) used

Unexpected behaviour

Reproducible example

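# Packages used below (all appear under "other attached packages" in the session info)
library(dplyr)
library(purrr)
library(lubridate)
library(progress)
library(ruODK)

function(forms) {
  nf <- nrow(forms)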
  pb <- progress_bar$new(total = 2 * nf)
  dat <- rowwise(forms) %>%
    mutate(
      # this extracts submission's metadata
      ss = {
        pb$tick()
        list(odata_submission_get(table = 'Submissions', fid = fid) %>% 
               # submissions are imported in reverse chronological order
               mutate(order = n():1L))
      },
      # this extracts botanal repeats
      sb = {
        pb$tick()
        list(odata_submission_get(table = 'Submissions.botanal_repeat', fid = fid) %>%
               group_by(submissions_id) %>% 
               mutate(quad = n():1L))
      }) %>% 
    ungroup()
  dat %>% 
    mutate(dat = map2(ss, sb, 
                      ~ rename(.x, submissions_id = id) %>% 
                        left_join(.y, by = "submissions_id") %>% 
                        mutate(
                          pers = system_submitter_name,
                          today = as_date(today)))) %>% 
    select(name, fid, dat)
  }
Session Info

ODK Central version:

# utils::sessionInfo()
R version 4.2.2 (2022-10-31 ucrt)
Platform: x86_64-w64-mingw32/x64 (64-bit)
Running under: Windows 10 x64 (build 19044)

Matrix products: default

locale:
[1] LC_COLLATE=English_Australia.utf8 LC_CTYPE=English_Australia.utf8 LC_MONETARY=English_Australia.utf8
[4] LC_NUMERIC=C LC_TIME=English_Australia.utf8

attached base packages:
[1] stats graphics grDevices utils datasets methods base

other attached packages:
[1] ssh_0.8.2 progress_1.2.2 xgboost_1.6.0.1 data.table_1.14.6 sf_1.0-9 ruODK_0.10.2.9000
[7] lubridate_1.9.0 timechange_0.1.1 forcats_0.5.2 stringr_1.5.0 dplyr_1.0.10 purrr_1.0.0
[13] readr_2.1.3 tidyr_1.2.1 tibble_3.1.8 ggplot2_3.4.0 tidyverse_1.3.2

loaded via a namespace (and not attached):
[1] Rcpp_1.0.9 lattice_0.20-45 prettyunits_1.1.1 class_7.3-20 assertthat_0.2.1
[6] utf8_1.2.2 R6_2.5.1 cellranger_1.1.0 sys_3.4.1 backports_1.4.1
[11] reprex_2.0.2 e1071_1.7-12 httr_1.4.4 pillar_1.8.1 rlang_1.0.6
[16] curl_4.3.3 googlesheets4_1.0.1 readxl_1.4.1 rstudioapi_0.14 Matrix_1.5-1
[21] googledrive_2.0.0 munsell_0.5.0 proxy_0.4-27 broom_1.0.2 janitor_2.1.0
[26] compiler_4.2.2 modelr_0.1.10 pkgconfig_2.0.3 askpass_1.1 openssl_2.0.5
[31] tidyselect_1.2.0 fansi_1.0.3 crayon_1.5.2 tzdb_0.3.0 dbplyr_2.2.1
[36] withr_2.5.0 grid_4.2.2 jsonlite_1.8.4 gtable_0.3.1 lifecycle_1.0.3
[41] DBI_1.1.3 magrittr_2.0.3 units_0.8-1 credentials_1.3.2 scales_1.2.1
[46] KernSmooth_2.23-20 cli_3.4.1 stringi_1.7.8 fs_1.5.2 snakecase_0.11.0
[51] xml2_1.3.3 ellipsis_0.3.2 generics_0.1.3 vctrs_0.5.1 tools_4.2.2
[56] glue_1.6.2 hms_1.1.2 colorspace_2.0-3 gargle_1.2.1 classInt_0.4-8
[61] rvest_1.0.3 haven_2.5.1

florianm commented 1 year ago

Thanks for the reprex! What line exactly raises the error? This is hard for me to reproduce as I don't have access to your server. Is the problem the variable type between id and submissions_id at left_join(.y, by = "submissions_id")?

As a workaround (also what I do in my downstream pipelines), you could mutate the IDs to character where needed to avoid the warning. I'll check whether ruODK can easily convert both to character.
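
The workaround above could look roughly like the following. This is a minimal sketch, assuming the two key columns arrive with different types; the tables and their values are made up for illustration and do not come from a real ODK Central server.

```r
library(dplyr)

# Hypothetical stand-ins for the main table and one repeat table.
# The key types differ on purpose (integer vs character).
tbl_main <- tibble(id = c(1L, 2L), submitter = c("a", "b"))
tbl_sub <- tibble(submissions_id = c("1", "1", "2"), quad = c(2L, 1L, 1L))

# Coerce the key explicitly before joining, so both sides are character.
tbl_both <- tbl_main %>%
  mutate(id = as.character(id)) %>%
  left_join(tbl_sub, by = c("id" = "submissions_id"))
```

Coercing on the caller's side keeps the join independent of whatever type the client happens to return for the ID columns.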

ikardail commented 1 year ago

Thanks for the reply. AFAIK, my arguments are already of type character. The problem is downstream, within the ruODK::odata_submission_get function. But neither my function nor ruODK::odata_submission_get uses the map_chr that is complaining, so it must be in one of the lower-level functions it calls. I haven't drilled that far into the package code.

It is going to be hard to re-create a reprex even for me. Guess we'll have to wait till tidyverse does the lifecycle, and the problem errors out, rather than giving a warning every 8 hrs...

ikardail commented 1 year ago

Sorry I've messed up my latex formatting when creating this issue, you have to read it in plain text

florianm commented 1 year ago

It would be great to drill down a bit further to troubleshoot this!

Could you do something like this for just one form (dropping the map2):

# Set defaults to one specific form:
ruODK::ru_setup(pid=x, fid=y)

# Get the main submissions plus one repeat
tbl_main <- ruODK::odata_submission_get(table = 'Submissions')
tbl_sub <- ruODK::odata_submission_get(table = 'Submissions.botanal_repeat')

# Attempt to join by shared ID
tbl_both <- tbl_main |>
  dplyr::left_join(tbl_sub, by=c("id"="submissions_id")) # this line could throw the error

# Are both ID columns of the same class? char vs integer?
class(tbl_main$id) == class(tbl_sub$submissions_id)

The goal is to isolate the one line that throws the error for you, and capture the error message (paste here).
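
Because the message is a lifecycle deprecation warning rather than an error, one way to isolate it might be to promote deprecation warnings to errors via the `lifecycle_verbosity` option. The sketch below assumes purrr 1.0.0 or later; the `map_chr()` call is only a stand-in for the real call (for example `ruODK::form_list()`), which needs a server.

```r
library(purrr)

# Turn lifecycle deprecation warnings into errors for this session.
old_opts <- options(lifecycle_verbosity = "error")

# Stand-in for the real call: integer results passed through map_chr()
# rely on the implicit coercion that purrr 1.0.0 deprecates.
res <- tryCatch(
  map_chr(list(1L, 2L), identity),
  error = function(e) e
)

# `res` is now an error condition rather than a character vector.
# Without the tryCatch(), rlang::last_trace() could be used interactively
# to view the call stack that led to the deprecated call.
print(class(res))

# Restore the previous setting.
options(old_opts)
```

Running the real pipeline with this option set should stop at the first deprecated call, which makes the offending function easier to find than a warning that appears only occasionally.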

ikardail commented 1 year ago

Florian

Thanks for the tip, and apologies for the delay; I only just got time to revisit.

Unfortunately, the problem isn't there. The join works without a warning, and both values are of class "character".


florianm commented 1 year ago

Awesome, thanks for confirming that. I'll close this issue, seeing that the problem seems to be fixed.

ikardail commented 1 year ago

Florian, you may have misunderstood: the warning persists, but not in the code we tested. While playing with breakpoints in my function, I figured out that the problem is in form_list(), which I call with defaults after ru_setup() is run. Here's that function:

form_list <- function (pid = get_default_pid(), url = get_default_url(), un = get_default_un(), 
    pw = get_default_pw(), retries = get_retries()) 
{
    yell_if_missing(url, un, pw, pid = pid)
    httr::RETRY("GET", httr::modify_url(url, path = glue::glue("v1/projects/{pid}/forms")), 
        httr::add_headers(Accept = "application/xml", `X-Extended-Metadata` = "true"), 
        httr::authenticate(un, pw), times = retries) %>% yell_if_error(., 
        url, un, pw) %>% httr::content(.) %>% {
        tibble::tibble(name = purrr::map_chr(., "name"), fid = purrr::map_chr(., 
            "xmlFormId"), version = purrr::map_chr(., "version", 
            .default = NA), state = purrr::map_chr(., "state"), 
            submissions = purrr::map_chr(., "submissions"), created_at = purrr::map_chr(., 
                "createdAt", .default = NA) %>% isodt_to_local(), 
            created_by_id = purrr::map_int(., c("createdBy", 
                "id")), created_by = purrr::map_chr(., c("createdBy", 
                "displayName")), updated_at = purrr::map_chr(., 
                "updatedAt", .default = NA) %>% isodt_to_local(), 
            published_at = purrr::map_chr(., "publishedAt", .default = NA) %>% 
                isodt_to_local(), last_submission = purrr::map_chr(., 
                "lastSubmission", .default = NA) %>% isodt_to_local(), 
            hash = purrr::map_chr(., "hash", .default = NA))
    }
}
<bytecode: 0x00000246b0597828>
<environment: namespace:ruODK>

One of those default args ends up being an integer by the time it makes it to a map_chr.

I suggest we reopen and investigate the issue further.
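
For illustration, here is a minimal sketch of how an integer reaching `map_chr()` produces the purrr 1.0.0 deprecation warning, and two ways to avoid it. The list below is a made-up stand-in for the parsed API response. That the `submissions` field is the integer in question is an assumption based on the printed source, not something confirmed in this thread.

```r
library(purrr)

# Hypothetical parsed response: `submissions` is an integer count (assumed).
forms <- list(
  list(name = "Form A", xmlFormId = "form_a", submissions = 12L),
  list(name = "Form B", xmlFormId = "form_b", submissions = 0L)
)

# Extracting an integer field with map_chr() relies on implicit coercion,
# which purrr 1.0.0 deprecates:
# map_chr(forms, "submissions")

# Option 1: keep the count as an integer.
submissions_int <- map_int(forms, "submissions")

# Option 2: keep character output, but coerce explicitly.
submissions_chr <- map_chr(forms, ~ as.character(.x$submissions))
```

Option 2 keeps the column type callers already receive, while option 1 changes it to integer, so which one fits depends on how much downstream code expects a character column.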