Feature: Include Embedded Data in Survey Questions

drsjp31 commented 3 years ago

I am looking to fetch just embedded data from a survey. fetch_survey does include the embedded data when the survey is pulled as a whole, but it doesn't recognize embedded data column names in the include_questions argument. I tried the code below (where survey_ID is a string) and received this error: "Error in qualtrics_response_codes(res) : Qualtrics API raised a bad request (400) error." If I remove the include_questions argument, it runs properly and EventCode is included as a column in the data output.

fetch_survey(surveyID = survey_ID, force_request = TRUE, include_display_order = FALSE, label = FALSE, convert = FALSE, include_questions = c("EventCode"))

Using the metadata function, I can see EventCode as an embedded data variable, but when I run survey_questions(survey_ID) it only returns the actual questions in the survey, which I assume is linked to the include_questions argument. I'm looking for a feature that would allow include_questions to also refer to the embedded data columns so I can extract what I want versus having to pull the entire data set first.

juliasilge commented 3 years ago

I just spent a little bit of time on this and it looks to me on an initial glance that the Qualtrics API won't send back just embedded data at the endpoint we are using; I believe you'll need to query the API (via this R package) with at least one survey question.

I'll keep this open in case we find a different way to approach the problem. Thanks for the feedback! 🙌

jmobrien commented 3 years ago

@juliasilge fiddling around, I stumbled on using include_questions = character() to leave out all questions. So, that works, but I thought passing NA made more sense for the UI, so I did that real quick in draft PR #226.

Hope that helps, but it still needs tests, and I'm about to dive back into some other work for a while (days, maybe). So, if anyone wants to add the tests to finish that PR I'd be delighted! Otherwise I'll come back to it when I can.

@drsjp31, for now, I think you can do the below to leave off all questions, while just keeping (all) embedded data vars:

fetch_survey(
   surveyID = survey_ID,
   force_request = TRUE,
   include_display_order = FALSE,
   label = FALSE,
   convert = FALSE,
   # A zero-length character vector requests no question columns
   include_questions = character()
)
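
Since the call above keeps all embedded data variables, narrowing the result to a single field such as EventCode would still be a local step. A minimal sketch of that step in base R follows, assuming the returned object behaves like an ordinary data frame. The mock data frame and every value in it are hypothetical stand-ins for a real fetch_survey() result, and only the EventCode name comes from this thread:

```r
# Hypothetical stand-in for the data frame returned by
# fetch_survey(..., include_questions = character()).
# All values are invented for illustration.
responses <- data.frame(
  ResponseId = c("R_001", "R_002"),
  Finished = c(TRUE, FALSE),
  EventCode = c("A12", "B07"),
  stringsAsFactors = FALSE
)

# Embedded data fields to keep (EventCode is the field named in this issue).
wanted <- c("EventCode")

# Fail early if a requested field is absent from the result.
missing_cols <- setdiff(wanted, names(responses))
if (length(missing_cols) > 0) {
  stop("Columns not found: ", paste(missing_cols, collapse = ", "))
}

# Keep ResponseId as a row key plus the requested embedded data columns.
embedded <- responses[, c("ResponseId", wanted), drop = FALSE]
print(embedded)
```

Keeping ResponseId alongside the embedded fields is just one design choice, so the subset can be joined back to other response data later.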

juliasilge commented 3 years ago

Ah nice, that does work for me! Only non-question columns:

library(qualtRics)
rs <- fetch_survey("SV_56icaa9YAafpAqx", include_questions = character())
#>   |======================================================================| 100%
#> 
#> ── Column specification ────────────────────────────────────────────────────────
#> cols(
#>   StartDate = col_datetime(format = ""),
#>   EndDate = col_datetime(format = ""),
#>   Status = col_character(),
#>   Progress = col_double(),
#>   `Duration (in seconds)` = col_double(),
#>   Finished = col_logical(),
#>   RecordedDate = col_datetime(format = ""),
#>   ResponseId = col_character(),
#>   DistributionChannel = col_character(),
#>   UserLanguage = col_character(),
#>   SolutionRevision = col_double(),
#>   `Q3.8 - Parent Topics` = col_logical(),
#>   `Q3.8 - Sentiment Polarity` = col_double(),
#>   `Q3.8 - Sentiment Score` = col_double(),
#>   `Q3.8 - Sentiment` = col_character(),
#>   `Q3.8 - Topic Sentiment Label` = col_logical(),
#>   `Q3.8 - Topic Sentiment Score` = col_logical(),
#>   `Q3.8 - Topics` = col_character()
#> )
#> 'StartDate', 'EndDate', and 'RecordedDate' were converted without a specific timezone
#> * To set a timezone, visit https://www.qualtrics.com/support/survey-platform/managing-your-account/
#> * Timezone information is under 'User Settings'
#> * See https://api.qualtrics.com/instructions/docs/Instructions/dates-and-times.md for more
names(rs)
#>  [1] "StartDate"                    "EndDate"                     
#>  [3] "Status"                       "Progress"                    
#>  [5] "Duration (in seconds)"        "Finished"                    
#>  [7] "RecordedDate"                 "ResponseId"                  
#>  [9] "DistributionChannel"          "UserLanguage"                
#> [11] "SolutionRevision"             "Q3.8 - Parent Topics"        
#> [13] "Q3.8 - Sentiment Polarity"    "Q3.8 - Sentiment Score"      
#> [15] "Q3.8 - Sentiment"             "Q3.8 - Topic Sentiment Label"
#> [17] "Q3.8 - Topic Sentiment Score" "Q3.8 - Topics"

Created on 2021-08-22 by the reprex package (v2.0.1)

juliasilge commented 2 years ago

Closed in #263 thanks to @jmobrien 🚀

ropensci / qualtRics

Feature: Include Embedded Data in Survey Questions #223