ropensci / qualtRics

Download ⬇️ Qualtrics survey data directly into R!
https://docs.ropensci.org/qualtRics

Problems with type of `unsubscribed` (logical vs. character) #304

Closed saudiwin closed 1 year ago

saudiwin commented 1 year ago

An unfortunate new change as of purrr v1.0:

In map_chr(), automatic conversion from logical, integer, and double to character is now deprecated. Use an explicit as.character() if needed (#904).

This is causing an error in the fetch_mailinglist function:

<error/purrr_error_indexed>
Error in `purrr::map_lgl()`:
ℹ In index: 1.
Caused by error:
! Can't coerce from a character vector to a logical vector.
---
Backtrace:
 1. qualtRics::fetch_mailinglist(mailId)
 5. purrr::map_lgl(elements, "unsubscribed", .default = NA)
 6. purrr:::map_("logical", .x, .f, ..., .progress = .progress)
 9. purrr:::call_with_cleanup(...)
saudiwin commented 1 year ago

purrr v0.3.4 and v0.3.5 seem fine; this came in with v1.0.0.

jmobrien commented 1 year ago

Thanks for the input! Yes, this is something we'll need to deal with, likely in the pending functions in #275.

jmobrien commented 1 year ago

After looking into this, this appears to be a different issue?

The problem here is with map_lgl() rather than map_chr(), and map_lgl() doesn't seem to have had the same changes to coercion behavior. So, the "unsubscribed" column is somehow receiving a character vector.

And I'm not sure why. "unsubscribed" is a built-in variable from Qualtrics that should always download as TRUE/FALSE. I tried to get it to do otherwise but was unsuccessful. I can generate that error by manually intervening mid-function to change an "unsubscribed" element to, say, "Y", but I haven't seen it natively. Is there anything about your lists that might be relevant to this?
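For reference, the failure can be reproduced outside the package with a hand-built list. This is only a minimal sketch: the `elements` list below is made up to imitate a single mailing-list entry and is not real API output.

```r
library(purrr)

# Hypothetical stand-in for one parsed mailing-list entry, with
# "unsubscribed" arriving as the string "false" instead of FALSE.
elements <- list(
  list(id = "MLRP_example", unsubscribed = "false")
)

# map_lgl() does not coerce a character value to logical, so the
# extraction fails; tryCatch() captures the error instead of stopping.
result <- tryCatch(
  map_lgl(elements, "unsubscribed", .default = NA),
  error = function(e) e
)

# TRUE when the call above raised an error
inherits(result, "error")
```

Replacing "false" with an actual FALSE in the list should make the same call return normally.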

Still, what you're describing seems identical to what @benhmin mentioned in #275, so it appears to be possible. @benhmin did note that those other in-development functions (using the newer API endpoint) seemed to be working.

saudiwin commented 1 year ago

Yes, sorry, I was wrong on the error diagnosis (I reverted to an older version of purrr and got the same issue). Inspecting the function, yes, for some reason the vector is coming through as character:

$ unsubscribed         : chr "false"

This is inside the fetch_mailinglist function. I don't get this error on my Mac laptop; it's in a Digital Ocean droplet I set up. Here is the sessionInfo:

R version 4.2.2 Patched (2022-11-10 r83330)
Platform: x86_64-pc-linux-gnu (64-bit)
Running under: Ubuntu 20.04.5 LTS

Matrix products: default
BLAS:   /usr/lib/x86_64-linux-gnu/blas/libblas.so.3.9.0
LAPACK: /usr/lib/x86_64-linux-gnu/lapack/liblapack.so.3.9.0

locale:
 [1] LC_CTYPE=C.UTF-8       LC_NUMERIC=C           LC_TIME=C.UTF-8       
 [4] LC_COLLATE=C.UTF-8     LC_MONETARY=C.UTF-8    LC_MESSAGES=C.UTF-8   
 [7] LC_PAPER=C.UTF-8       LC_NAME=C              LC_ADDRESS=C          
[10] LC_TELEPHONE=C         LC_MEASUREMENT=C.UTF-8 LC_IDENTIFICATION=C   

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base     

other attached packages:
[1] qualtRics_3.1.7

loaded via a namespace (and not attached):
 [1] magrittr_2.0.3   hms_1.1.2        tidyselect_1.2.0 insight_0.18.8  
 [5] timechange_0.1.1 R6_2.5.1         rlang_1.0.6      fansi_1.0.3     
 [9] httr_1.4.4       stringr_1.5.0    dplyr_1.0.10     tools_4.2.2     
[13] utf8_1.2.2       cli_3.5.0        DBI_1.1.3        ellipsis_0.3.2  
[17] assertthat_0.2.1 tibble_3.1.8     lifecycle_1.0.3  tidyr_1.2.1     
[21] purrr_0.3.4      readr_2.1.3      tzdb_0.3.0       vctrs_0.5.1     
[25] curl_4.3.3       sjlabelled_1.2.0 glue_1.6.2       stringi_1.7.8   
[29] compiler_4.2.2   pillar_1.8.1     generics_0.1.3   jsonlite_1.8.4  
[33] lubridate_1.9.0  pkgconfig_2.0.3 

Could also be something related to curl?

saudiwin commented 1 year ago

I'm probably going to fix this for myself for the time being by forking the project and adding an as.logical coercion, but I'll let you know if I can identify the root cause.

saudiwin commented 1 year ago

Specifically, I changed the code to the following, which makes the error go away:

unsubscribed = as.logical(purrr::map_chr(elements, "unsubscribed", .default = NA_character_))

However, I don't know why the vector came through as character originally, so obviously not a good long-term fix.

jmobrien commented 1 year ago

That would work for your case, except (kind of amusingly) it now makes the purrr 1.0 change relevant again--it will throw a warning (for now) in normal cases because map_chr() will be trying to convert a logical.

As an alternative, I added this line just before the call to tibble():

  elements <-
    purrr::map(elements, ~purrr::modify_in(.x, "unsubscribed", as.logical))

which takes advantage of as.logical being more permissive than map_lgl about the meaning of lowercase "true" or "false", converting them in advance.
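A small self-contained sketch of that approach, assuming a hand-made `elements` list (hypothetical data, not real API output) that mixes a string flag with a proper logical:

```r
library(purrr)

# Base R: as.logical() recognises several spellings of true/false
# and returns NA for anything else (here "Y").
spellings <- as.logical(c("true", "false", "TRUE", "T", "Y"))

# Hypothetical entries: one flag is a string, the other a real logical.
elements <- list(
  list(id = "a", unsubscribed = "false"),
  list(id = "b", unsubscribed = TRUE)
)

# Normalise the flag in every entry before extracting it.
elements <- map(elements, ~ modify_in(.x, "unsubscribed", as.logical))

# With the flags converted, map_lgl() can extract them safely.
unsubscribed <- map_lgl(elements, "unsubscribed", .default = NA)
```

Entries that already hold TRUE/FALSE pass through as.logical() unchanged, so the normal case is unaffected. This sketch does not cover entries where "unsubscribed" is missing altogether.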

I think that should work while preserving standard function, so went ahead and made a PR for it. It's a minor kluge, though, so let me run it by @juliasilge first.

jmobrien commented 1 year ago

As for why you're getting that, I don't know. The two relevant functions (deep inside the API request) seem to be httr:::parse_text() and jsonlite::fromJSON(), which, respectively, convert the API result to a string (from binary) and then parse the string (as JSON).

Looking at my test mailing list, I got this from debugonce(httr:::parse_text):

"{\"result\":{\"elements\":[{\"id\":\"MLRP_71HUvIwwYAFyJDg\",\"firstName\":\"test1\",\"lastName\":\"test1\",\"email\":\"test1@test.com\",\"externalDataReference\":\"1\",\"embeddedData\":null,\"language\":\"null\",\"unsubscribed\":false,\"responseHistory\":[],\"emailHistory\":[]},{\"id\":\"MLRP_3g95Ibu9YWUTurI\",\"firstName\":\"test2\",\"lastName\":\"test2\",\"email\":\"test2@test.com\",\"externalDataReference\":\"2\",\"embeddedData\":null,\"language\":\"null\",\"unsubscribed\":false,\"responseHistory\":[],\"emailHistory\":[]},{\"id\":\"MLRP_blxRKsifuS2gYL4\",\"firstName\":\"test3\",\"lastName\":\"test3\",\"email\":\"test3@test.com\",\"externalDataReference\":\"3\",\"embeddedData\":null,\"language\":\"null\",\"unsubscribed\":true,\"responseHistory\":[],\"emailHistory\":[]},{\"id\":\"MLRP_6ssleCfefL6DcRE\",\"firstName\":\"test4\",\"lastName\":\"test4\",\"email\":\"test4@test.com\",\"externalDataReference\":\"4\",\"embeddedData\":null,\"language\":\"null\",\"unsubscribed\":true,\"responseHistory\":[],\"emailHistory\":[]},{\"id\":\"MLRP_3F9eSZcDSemfaRg\",\"firstName\":\"test5\",\"lastName\":\"test5\",\"email\":\"test5@test.com\",\"externalDataReference\":\"5\",\"embeddedData\":null,\"language\":\"null\",\"unsubscribed\":true,\"responseHistory\":[],\"emailHistory\":[]}],\"nextPage\":null},\"meta\":{\"httpStatus\":\"200 - OK\",\"requestId\":\"08b774ef-9c12-45d4-bfc5-30f743ca6273\"}}"

The first element looks like this, with "unsubscribed": false unquoted:

{
  "result": {
    "elements": [
      {
        "id": "MLRP_71HUvIwwYAFyJDg",
        "firstName": "test1",
        "lastName": "test1",
        "email": "test1@test.com",
        "externalDataReference": "1",
        "embeddedData": null,
        "language": "null",
        "unsubscribed": false,
        "responseHistory": [],
        "emailHistory": []
      }
    ]
  }
}

If you're getting something different at that point, then the problem is either in that step or upstream of it (e.g., in curl); if not, it's likely something in jsonlite.
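One way to check the jsonlite side in isolation is to parse two tiny payloads by hand and compare the resulting types. This is a rough sketch: both JSON strings are made up, `flag_class()` is a hypothetical helper rather than anything in qualtRics, and `simplifyVector = FALSE` is an assumption about how the response is parsed.

```r
library(jsonlite)

# Two hypothetical one-entry payloads: one with a JSON boolean,
# one with the same value sent as a JSON string.
json_bool   <- '{"result":{"elements":[{"id":"x","unsubscribed":false}]}}'
json_string <- '{"result":{"elements":[{"id":"x","unsubscribed":"false"}]}}'

# Hypothetical helper: report the R class of the first entry's flag.
flag_class <- function(txt) {
  parsed <- fromJSON(txt, simplifyVector = FALSE)
  class(parsed$result$elements[[1]]$unsubscribed)
}

class_bool   <- flag_class(json_bool)    # unquoted false
class_string <- flag_class(json_string)  # quoted "false"
```

If the text captured from parse_text() shows an unquoted false but the parsed value is still character, that would point at the parsing step; if the text itself already has "false" in quotes, the difference comes from further upstream.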

Might not matter if this fix works, but bringing it up anyway in case you're curious, plus it might be good to know whether we'll need to anticipate this occasional behavior for the planned update that will replace this function.

juliasilge commented 1 year ago

I believe this is fixed now by #306. Thanks @jmobrien! 🙌

You can install to get this fix via devtools::install_github("ropensci/qualtRics"). If you can try this out and see if you have further problems, that would be very helpful!