ropensci / qualtRics

Download ⬇️ Qualtrics survey data directly into R!
https://docs.ropensci.org/qualtRics

Error only when using include_questions: "Qualtrics API raised a bad request (400) error" #256

Closed asadow closed 2 years ago

asadow commented 2 years ago

I am using what I think are correct names in the vector for the include_questions argument, but I still get the "Qualtrics API raised a bad request (400)" error. The error does not occur when I omit the include_questions argument.

library(tidyverse)
library(qualtRics)
surveys <- all_surveys()
df <- surveys %>%
  mutate(
    data = map(
      id,
      ~ fetch_survey(
        .x,
        col_types = cols(.default = "c"),
        convert = FALSE,
        include_questions = c("StartDate"),
        force_request = TRUE
      ) 
    )
  )
#> Error: Problem with `mutate()` column `data`.
#> ℹ `data = map(...)`.
#> x Qualtrics API raised a bad request (400) error - Please report this on
#> https://github.com/ropensci/qualtRics/issues

df <- surveys %>%
  mutate(
    data = map(
      id,
      ~ fetch_survey(
        .x,
        col_types = cols(.default = "c"),
        convert = FALSE,
        #include_questions = c("StartDate")
        force_request = TRUE
      ) 
    )
  )

df$data[[1]]$StartDate

 [1] "2021-03-14 15:29:11" "2021-03-21 21:23:20" "2021-03-22 01:08:03"
 [4] "2021-04-05 17:43:16" "2021-05-05 02:19:47" "2021-05-07 19:03:34"
 [7] "2021-05-25 19:50:06" "2021-06-20 14:11:44" "2021-09-17 17:15:27"
[10] "2021-09-23 14:03:42" "2021-10-12 15:08:16" "2021-10-12 19:51:40"
[13] "2021-10-12 23:40:09" "2021-10-13 01:26:37"
attr(,"label")
   StartDate 
"Start Date"

Created on 2022-04-13 by the reprex package (v2.0.1)

juliasilge commented 2 years ago

The problem here is that StartDate isn't a question, as Qualtrics views it. You can find all the questions in a survey via column_map(), and any of the QIDs it lists can be passed to include_questions:

library(qualtRics)

column_map("SV_3gbwq8aJgqPwQDP")
#> # A tibble: 6 × 4
#>   qname qid   choice textEntry
#>   <chr> <chr> <chr>  <chr>    
#> 1 Q63   QID63 <NA>   <NA>     
#> 2 Q16   QID16 <NA>   <NA>     
#> 3 Q17   QID17 <NA>   <NA>     
#> 4 Q18   QID18 <NA>   <NA>     
#> 5 Q19   QID19 <NA>   <NA>     
#> 6 Q22   QID22 <NA>   <NA>

fetch_survey("SV_3gbwq8aJgqPwQDP", include_questions = c("QID63"), force_request = TRUE)
#>   |======================================================================| 100%
#> 
#> ── Column specification ────────────────────────────────────────────────────────
#> cols(
#>   StartDate = col_datetime(format = ""),
#>   EndDate = col_datetime(format = ""),
#>   Status = col_character(),
#>   IPAddress = col_character(),
#>   Progress = col_double(),
#>   `Duration (in seconds)` = col_double(),
#>   Finished = col_logical(),
#>   RecordedDate = col_datetime(format = ""),
#>   ResponseId = col_character(),
#>   RecipientLastName = col_logical(),
#>   RecipientFirstName = col_logical(),
#>   RecipientEmail = col_logical(),
#>   ExternalReference = col_logical(),
#>   LocationLatitude = col_double(),
#>   LocationLongitude = col_double(),
#>   DistributionChannel = col_character(),
#>   UserLanguage = col_character(),
#>   Q63 = col_character(),
#>   SolutionRevision = col_double()
#> )
#> # A tibble: 26 × 19
#>    StartDate           EndDate             Status         IPAddress     Progress
#>    <dttm>              <dttm>              <chr>          <chr>            <dbl>
#>  1 2020-02-20 01:16:42 2020-02-20 01:17:19 Survey Preview <NA>               100
#>  2 2020-02-20 01:30:55 2020-02-20 01:34:37 IP Address     98.14.36.23        100
#>  3 2020-02-20 01:49:48 2020-02-20 01:50:23 IP Address     75.82.50.44        100
#>  4 2020-02-20 02:46:41 2020-02-20 02:47:01 IP Address     75.172.11.120      100
#>  5 2020-02-20 02:56:28 2020-02-20 02:59:15 IP Address     194.59.251.2…      100
#>  6 2020-02-20 12:22:10 2020-02-20 12:22:58 IP Address     66.168.182.1…      100
#>  7 2020-02-20 12:31:28 2020-02-20 12:32:06 IP Address     35.138.90.71       100
#>  8 2020-02-20 12:52:59 2020-02-20 12:53:26 IP Address     24.254.16.188      100
#>  9 2020-02-20 17:09:48 2020-02-20 17:10:12 IP Address     99.88.198.1        100
#> 10 2020-02-21 02:52:01 2020-02-21 02:52:02 Survey Test    <NA>               100
#> # … with 16 more rows, and 14 more variables: `Duration (in seconds)` <dbl>,
#> #   Finished <lgl>, RecordedDate <dttm>, ResponseId <chr>,
#> #   RecipientLastName <lgl>, RecipientFirstName <lgl>, RecipientEmail <lgl>,
#> #   ExternalReference <lgl>, LocationLatitude <dbl>, LocationLongitude <dbl>,
#> #   DistributionChannel <chr>, UserLanguage <chr>, Q63 <ord>,
#> #   SolutionRevision <dbl>

Created on 2022-04-14 by the reprex package (v2.0.1)

Notice that many variables, such as the start and end dates and the user language, are returned alongside my single requested question, QID63.
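To see which returned columns are not questions, one option is to compare the fetched column names against the qname column of column_map(). The sketch below uses base R only. The helper name non_question_columns and the stand-in data are hypothetical, and it assumes column names match qname exactly, which may not hold for multi-part questions.

```r
# Hypothetical helper: names in the fetched data that column_map() does not
# list as question names, i.e. columns returned in addition to the questions.
non_question_columns <- function(data_names, col_map) {
  setdiff(data_names, col_map$qname)
}

# Stand-ins for names(fetch_survey(...)) and column_map(...), abbreviated
# from the output shown above (assumption: real calls need API credentials).
fetched_names <- c("StartDate", "EndDate", "ResponseId", "UserLanguage", "Q63")
example_map <- data.frame(
  qname = c("Q63", "Q16"),
  qid = c("QID63", "QID16"),
  stringsAsFactors = FALSE
)

non_question_columns(fetched_names, example_map)
```

With real data, the two stand-ins would be replaced by the names of the fetched tibble and the column_map() result for the same survey.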

asadow commented 2 years ago

Thanks, Julia. But how do I determine, before reading in the data, which QID corresponds to each of those variables (start date, end date, etc.)? I am trying to read in exactly the variables from your example, but for many surveys, and it looks like these variables might fall under different QIDs in each survey.

Edit: per #223, passing include_questions = character() seems to work!
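A minimal sketch of that approach applied to many surveys, assuming include_questions = character() behaves the same for every survey (this thread only reports that it "seems to work"). The helper names fetch_metadata_only and fetch_metadata_for_all are hypothetical, and the fetch argument is an illustrative hook so the logic can be tried without API credentials.

```r
# Hypothetical helper: request one survey with an empty question list, so only
# the non-question columns (StartDate, EndDate, ...) should come back.
# `fetch` defaults to qualtRics::fetch_survey but can be swapped for a stub.
fetch_metadata_only <- function(survey_id, fetch = qualtRics::fetch_survey) {
  fetch(
    survey_id,
    include_questions = character()
  )
}

# Hypothetical helper: apply the above to a vector of survey IDs and return
# a list of results named by survey ID.
fetch_metadata_for_all <- function(survey_ids, fetch = qualtRics::fetch_survey) {
  out <- lapply(survey_ids, fetch_metadata_only, fetch = fetch)
  names(out) <- survey_ids
  out
}
```

With credentials registered, fetch_metadata_for_all(all_surveys()$id) would be the intended call. Other fetch_survey() arguments from the original reprex (col_types, convert, force_request) could be added inside fetch_metadata_only if the installed qualtRics version supports them.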

juliasilge commented 2 years ago

Sounds like you may have figured out what you need! 🙌

To clarify, variables like the start and end dates don't have QIDs because they are not questions. You can use column_map() to get the survey's column mapping (question names, QIDs, etc.).
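One way to avoid the 400 error up front is to check requested names against the qid column of column_map() before calling fetch_survey(). This is a sketch under assumptions: the helper name split_by_qid is hypothetical, and the small data frame stands in for real column_map() output.

```r
# Hypothetical helper: split requested names into those that appear as QIDs
# in the column map and those that do not (e.g. StartDate).
split_by_qid <- function(requested, col_map) {
  known <- unique(col_map$qid)
  list(
    valid = intersect(requested, known),
    invalid = setdiff(requested, known)
  )
}

# Stand-in for column_map("SV_...") output, abbreviated from this thread
example_map <- data.frame(
  qname = c("Q63", "Q16"),
  qid = c("QID63", "QID16"),
  stringsAsFactors = FALSE
)

checked <- split_by_qid(c("QID63", "StartDate"), example_map)
checked$valid    # "QID63"
checked$invalid  # "StartDate"
```

Only checked$valid would then be passed to include_questions. Anything in checked$invalid is not a question, and in the output earlier in this thread such columns were returned without being requested.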