Watts-Lab / surveys

Library of surveys for deliberation experiments
MIT License

Design a standard format to aggregate survey responses with a custom function #28

JamesPHoughton closed this issue 2 years ago

JamesPHoughton commented 2 years ago

In addition to the survey questions, most survey instruments also include an aggregation procedure that turns the results of a survey into a score. For example, the Cognitive Reflection Test (https://github.com/Watts-Lab/surveys/issues/15) poses a number of problems and then reports a single metric for how well the subject performed.

If this repository is to serve as a canonical implementation of a variety of different survey instruments, it also needs to include those aggregation functions, so that everyone who uses the surveys will assess them in the same way.

Implementation

@markwhiting, what are your thoughts?

markwhiting commented 2 years ago

I'm not sure of the best implementation given SurveyJS either, but I do think it's worth setting a standard for this that is reasonably general.

The key idea I would lean on is that the way we use surveys should let us store each survey response variable plus an aggregated score when such a thing exists. And that aggregation should be calculated following the standard of the literature in which the survey is used.

I personally think that dealing with scaling is also useful. For example, the Viability scale runs 14–70; should we adjust that to 0–1? Further, some CRT variants are scored 0–3, some 0–6, and some 0–7. Should we adjust those all to 0–1? Or should we store both a normalized and a non-normalized version?

I don't have a strong preference here; however, in our panel we have chosen to store normalized variables wherever possible because there are so many different instruments and scales, so I have a mild preference for that approach. Notably, papers sometimes marginally prefer reporting the raw scores, e.g., 14–70 instead of 0–1.
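A minimal sketch of storing both versions, using the Viability scale's 14–70 range as the example (names here are illustrative):

// sketch: keep both the raw and the 0–1 normalized score
// (minScore/maxScore would come from each instrument's definition)
function withNormalized(rawScore, minScore, maxScore) {
  const normScore = (rawScore - minScore) / (maxScore - minScore);
  return { rawScore, normScore };
}

// e.g., Viability: withNormalized(42, 14, 70) -> { rawScore: 42, normScore: 0.5 }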

JamesPHoughton commented 2 years ago

What about putting a single .js file in each survey directory with a single default export function? The function would take the direct output of SurveyJS (a dictionary, I believe, @Alan-Qiao?) and add a raw and normalized score to it. Then, when we use the SurveyWrapper component, part of the callback would be to load this function and run it on the raw output of SurveyJS before returning the full dictionary.

// score.js
// sketch of the aggregation function; assumes `result.items`
// is an array of numeric item scores
export default function score(result) {
  const maxScore = 70;
  const minScore = 14;

  const rawScore = result.items.reduce((acc, v) => acc + v, 0);
  const normScore = (rawScore - minScore) / (maxScore - minScore);

  return { ...result, rawScore, normScore };
}
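On the SurveyWrapper side, the callback could then look roughly like this (a sketch; `surveyName`, the directory layout, and the async handler are assumptions, not existing code):

// sketch of the SurveyWrapper callback (hypothetical names)
const onComplete = async (sender) => {
  // load the survey's score function and run it on the raw SurveyJS output
  const { default: score } = await import(`./surveys/${surveyName}/score.js`);
  return score(sender.data);
};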
Alan-Qiao commented 2 years ago

Yes, it is a dictionary (a plain JavaScript object). You get the survey results inside a callback hook that you attach to the onComplete event of the survey model. You can take a look at SurveyWrapper.js for reference.
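For anyone following along, attaching that hook looks roughly like this (a sketch; the exact import depends on which SurveyJS package SurveyWrapper.js uses, and `handleResults` is a placeholder):

import { Model } from "survey-core";

const survey = new Model(surveyJson); // `surveyJson` is the survey definition
// the hook receives the completed model; the response dictionary is `sender.data`
survey.onComplete.add((sender) => {
  handleResults(sender.data);
});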

markwhiting commented 2 years ago

I think that works well. Notably, some surveys do not produce one score; e.g., the Big Five produces scores for five different psychological attributes. So a design that passes out named variables at the top level might be better than one that passes out just score variables.

I could imagine an object like:

{
  survey_blob: "www.blob.com",
  responses: { q1: "Not at all confident", q2: "Very confident", ... },
  result: { agreeableness: 0, ... }
}

(updated to include a blob URL in the structure)
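Under that design, a multi-score instrument's score file could look something like this (a sketch only; the trait and item grouping is made up for illustration, not a real Big Five scoring key):

// bigfive.score.js (sketch; real item-to-trait mappings differ)
export default function scoreFunc(responses) {
  // hypothetical: each `*Items` field is an array of numeric item scores
  const mean = (xs) => xs.reduce((a, b) => a + b, 0) / xs.length;
  return {
    agreeableness: mean(responses.agreeablenessItems),
    conscientiousness: mean(responses.conscientiousnessItems),
    // ...one named entry per trait
  };
}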

markwhiting commented 2 years ago

Writing up https://github.com/Watts-Lab/individual-mapping/issues/121 I realized that another dimension here is the content of the codebook, and I wonder if it's worth including that somewhere in the survey folder too. You can see an example of what this looks like for all our current surveyor surveys here → https://github.com/Watts-Lab/individual-mapping/blob/main/internal_data/survey_responses/internal_codebook.csv

I could imagine one way to do this would just be to have a CSV in the survey folder that stores that information, or to include it in the response output object we specified above as a codebook object of some sort. I think having a codebook helps a lot, and generating it programmatically from here would mean that we don't need to keep ours in sync with yours in the future.
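If it went into the output object, it might look something like this (a sketch; the codebook fields here are illustrative and not necessarily the columns of the linked CSV):

{
  survey_blob: "www.blob.com",
  responses: { q1: "Not at all confident", ... },
  result: { rawScore: 42, ... },
  codebook: {
    q1: { text: "How confident are you that ...?", scale: "5-point Likert" },
    // ...one entry per question
  }
}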

JamesPHoughton commented 2 years ago

I like the idea that the final output of the survey would look like Mark's solution in https://github.com/Watts-Lab/surveys/issues/28#issuecomment-1144061278. I propose that when we have a survey titled example.json with an associated documentation file example.stories.mdx, we include an example.score.js file in the same directory that takes the raw responses and computes a result.

// example.score.js
// aggregates the results of `example.json`
// sketch: assumes `responses.items` is an array of numeric item scores

export default function scoreFunc(responses) {
  const maxScore = 70;
  const minScore = 14;

  const rawScore = responses.items.reduce((acc, v) => acc + v, 0);
  const normScore = (rawScore - minScore) / (maxScore - minScore);

  return {
    rawScore: rawScore,
    normScore: normScore,
  };
}

Then the survey tool itself would create the final record object, and append any other metadata that it finds interesting:

import scoreFunc from "./example.score.js";

function surveyTool(survey_blob, n_people = 10) {
  // you know, something like surveyjs with a wrapper, or the surveyor tool
  const responses = conduct_survey(survey_blob, n_people);
  const result = scoreFunc(responses);

  const record = {
    survey_blob: survey_blob,
    responses: responses,
    result: result,
    playerSource: "MTurk",
  };
  return record;
}