As a user, I would like to optimise the data collection process

The goal of this ticket is to develop the back-office functions and logic behind the data collection step.

The module script is here: https://github.com/unhcr-americas/surveyDesigner/blob/main/R/mod_collection.R

In the previous stage, we used filters based on indicator selection and language (the language label used for the country) to subset a list of questions and their potential answers (for select_one or select_multiple questions).

In the collection stage, we need to assess whether the full questionnaire should be split into several parts, aka data collection waves, using:

interview duration (_each question within the form can be assessed with the interview duration function_),
question groups (aka a module of questions grouped between 'begin_group' and 'end_group'),
indicator requirements (_aka, based on the mapping, multiple questions potentially spread over multiple modules, together with the linked questions required for indicator disaggregation_),
data collection mode, as it impacts the sequence of the questions (in CAPI, sensitive questions tend to be kept towards the end, while it is the contrary for CATI), and
an estimation of the response rate based on average interview duration. The longer a survey is, the higher the risk of dropout, so the impact of designing a long survey can be estimated by the cost of reaching out to people whose information will not be recorded (basically, total cost per interview would be a function of response rate). See this publication: Optimizing Data Collection Interventions to Balance Cost and Quality in a Sequential Multimode Survey
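
As a starting point for discussion, the splitting logic could be sketched as a greedy packing of whole question groups into waves under a maximum interview duration. This is only an illustration in Python, not the code of mod_collection.R: `QuestionGroup`, `split_into_waves`, `max_minutes` and the durations are all hypothetical, and the sketch assumes that groups linked by the same indicator have already been merged into one unit and that the data collection mode ordering is applied beforehand.

```python
from dataclasses import dataclass


@dataclass
class QuestionGroup:
    name: str       # hypothetical label of a begin_group ... end_group block
    minutes: float  # estimated interview duration for the whole group


def split_into_waves(groups, max_minutes):
    """Greedily pack whole groups, in form order, into waves of at most max_minutes."""
    waves, current, used = [], [], 0.0
    for group in groups:
        if group.minutes > max_minutes:
            raise ValueError(f"group {group.name!r} alone exceeds the wave limit")
        # Close the current wave when the next group would not fit.
        if current and used + group.minutes > max_minutes:
            waves.append(current)
            current, used = [], 0.0
        current.append(group)
        used += group.minutes
    if current:
        waves.append(current)
    return waves


# Illustrative durations only, not measured values.
groups = [
    QuestionGroup("household", 8),
    QuestionGroup("education", 6),
    QuestionGroup("health", 10),
    QuestionGroup("protection", 7),
]
waves = split_into_waves(groups, max_minutes=15)
wave_names = [[g.name for g in wave] for wave in waves]
print(wave_names)
```

Keeping groups intact is a design assumption here; a group that is longer than the limit on its own raises an error rather than being cut silently.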

We should also estimate an operational budget, based on costing inputs (aka enumeration capacity and total cost per interview) and various respondent sample size thresholds (500, 1000, 5000).
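
A minimal sketch of such a budget estimate, assuming that every contact attempt is paid for whether or not the interview is completed, so that the cost per completed interview grows as the response rate falls. The function `estimate_budget`, its parameter names and all figures below are hypothetical placeholders in Python, not values or code from the project.

```python
import math


def estimate_budget(sample_size, cost_per_attempt, response_rate,
                    enumerators, interviews_per_enumerator_per_day):
    """Rough operational budget for one wave (all parameters hypothetical)."""
    # People to contact so that `sample_size` interviews are completed.
    attempts = math.ceil(sample_size / response_rate)
    # Assumption: every attempt is paid for, completed or not.
    total_cost = attempts * cost_per_attempt
    # Enumeration capacity: attempts the whole team can handle per day.
    daily_capacity = enumerators * interviews_per_enumerator_per_day
    field_days = math.ceil(attempts / daily_capacity)
    return {
        "sample_size": sample_size,
        "attempts": attempts,
        "total_cost": total_cost,
        "cost_per_completed_interview": total_cost / sample_size,
        "field_days": field_days,
    }


# Sample size thresholds from the ticket; the other inputs are made up.
budgets = [
    estimate_budget(n, cost_per_attempt=10, response_rate=0.5,
                    enumerators=10, interviews_per_enumerator_per_day=8)
    for n in (500, 1000, 5000)
]
for budget in budgets:
    print(budget)
```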

This should be done by assessing the current decision input and simulating the results of other alternatives.
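
One way to picture the simulation of alternatives is to compare scenarios with a different number of waves. The Python sketch below is purely illustrative: the linear response rate model in `assumed_response_rate` is a placeholder assumption (the ticket does not specify one), the sketch ignores attrition between waves, and all names and numbers are hypothetical.

```python
import math


def assumed_response_rate(minutes, base=0.9, drop_per_minute=0.01, floor=0.1):
    """Placeholder model: response rate falls linearly with interview length."""
    return max(floor, base - drop_per_minute * minutes)


def simulate_scenario(n_waves, total_minutes, sample_size, cost_per_attempt):
    """Project attempts and cost when the questionnaire is split evenly into n_waves."""
    minutes_per_wave = total_minutes / n_waves
    rate = assumed_response_rate(minutes_per_wave)
    # Contacts needed in each wave to complete `sample_size` interviews.
    attempts_per_wave = math.ceil(sample_size / rate)
    return {
        "n_waves": n_waves,
        "minutes_per_wave": minutes_per_wave,
        "response_rate": rate,
        "attempts_per_wave": attempts_per_wave,
        "total_cost": n_waves * attempts_per_wave * cost_per_attempt,
    }


# Current decision (1 wave) against two alternatives, with made-up inputs.
scenarios = [
    simulate_scenario(n, total_minutes=75, sample_size=1000, cost_per_attempt=10)
    for n in (1, 2, 3)
]
cheapest = min(scenarios, key=lambda s: s["total_cost"])
for scenario in scenarios:
    print(scenario)
print("lowest projected cost:", cheapest["n_waves"], "wave(s)")
```

With these made-up inputs, shorter waves raise the response rate but each additional wave multiplies the fieldwork, which is the kind of trade-off the recommendations would need to surface.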

Client - Validation

[ ] Specification of the input data required to smartly split a too-lengthy questionnaire...
[ ] Output one or more split questionnaires (one per wave and data collection mode..)
[ ] Output a summary of what has been done, and of what could be done and for what advantage

Dev - Tech

[ ] Might need to rework the current input data in order to build the use case
[ ] The output should suggest some adjustment parameters (e.g. increase the number of data collection waves), with recommendations and projected budgets per scenario
[ ] Technical validation (tests, checks, etc.)
