ait-cs-IaaS / koord2ool

Koord2ool is an extension of LimeSurvey that visualizes responses to surveys over time.
GNU General Public License v3.0

Data validity: counting and expiration #23

Closed (otmarlendl closed this issue 1 year ago)

otmarlendl commented 1 year ago

In a test survey, I have answers from 9 access tokens. A graph like this:

[Image: Screenshot 2023-02-27 133642]

doesn't make sense, as the ring diagram needs to display the state of the world at the end of the time-slider.

Before starting to display data, a pass must be made over all survey results in order to get a timeline of the state of any answer.

What's needed is the following:

In SQL terms (illustrative only, this should all be in memory client-side), this could result in:

    -- for which time-values do we need to know the state of the world?
    table timestamps (
        ts timestamp,
        num_valid_responses int
    )

    -- a row for every single ts / responder combination
    table answers_question_N (
        when ts,
        responder string,
        answer [depends on Q type]
    )

    table state_question_N (
        when ts,
        [data aggregation from answers_question_N depending on Q type]
    )

The last row in state_question_N that is still inside the time-slider is used to generate the graph on the left for this question.

The full table state_question_N is what directly drives the graph on the right.

More on data aggregation in another Issue.
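As a rough illustration of the pass described above, here is a minimal sketch in Python (the real thing would run client-side; Python is used only to show the logic). It assumes a single choice-type question, ignores expiration, and uses made-up rows; the names `build_state` and `state_at` are hypothetical and not part of Koord2ool.

```python
from bisect import bisect_right
from collections import Counter

# Hypothetical rows of answers_question_N: (when, responder, answer).
answers = [
    (1, "tokA", "yes"),
    (2, "tokB", "no"),
    (3, "tokA", "no"),   # tokA changes its answer
    (5, "tokC", "yes"),
]

def build_state(answers):
    """Build state_question_N: one (when, counts) row per distinct timestamp.

    counts aggregates the latest answer of every responder seen so far.
    """
    latest = {}   # responder -> most recent answer
    states = []
    for when, responder, answer in sorted(answers, key=lambda row: row[0]):
        latest[responder] = answer
        row = (when, Counter(latest.values()))
        if states and states[-1][0] == when:
            states[-1] = row      # several answers share this timestamp
        else:
            states.append(row)
    return states

def state_at(states, slider_end):
    """Return the last row still inside the time-slider, or None if there is none."""
    idx = bisect_right([when for when, _ in states], slider_end)
    return states[idx - 1] if idx else None

states = build_state(answers)
print(state_at(states, 4))   # row for when=3: both tokA and tokB currently say "no"
```

The full `states` list would drive the graph over time, while `state_at(states, slider_end)` would feed the ring diagram; `sum(counts.values())` of a row gives the number of valid responses at that timestamp.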

otmarlendl commented 1 year ago

This of course only works if users are identified (closed survey). Thus it's necessary to set the right survey options in LimeSurvey.

b3n4kh commented 1 year ago

To solve this, the data for the line chart has to be changed as follows:

For each response to each question, there have to be "artificial" points for every other possible user.

Example:

The survey has 10 questions and, over the span of one month, 10 different users answered these questions 10 times each.

  1. Build the 100 responses from the RPC endpoints, filtered by the time slider.
  2. Enrich with up to 100 additional data points, one for every "expiration point" of every response, unless the same user already has an answer inside that range. These enriched points always have the value "N/A".
  3. Enrich again: for each of these (now 200) data points, add the current state of every other user, giving 2000 data points in this case. Each added value is that user's "last" value, no matter whether it comes from an "enriched" point or a normal response.
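The three steps can be sketched roughly like this (Python, purely to illustrate the logic for one question). Several things here are assumptions rather than anything decided in this issue: a fixed validity period `ttl`, the `(when, user, value)` tuple layout, "N/A" as the state of a user who has not answered yet, and the function names.

```python
NA = "N/A"

def add_expirations(responses, ttl):
    """Step 2: add an "N/A" point at when + ttl for every response,
    unless the same user answered again inside (when, when + ttl]."""
    times_by_user = {}
    for when, user, _ in responses:
        times_by_user.setdefault(user, []).append(when)
    points = list(responses)
    for when, user, _ in responses:
        expires = when + ttl
        if not any(when < t <= expires for t in times_by_user[user]):
            points.append((expires, user, NA))
    return sorted(points, key=lambda p: p[0])

def fill_all_users(points, users):
    """Step 3: at every point, emit the last known value of every user."""
    last = {user: NA for user in users}   # assumption: no answer yet means "N/A"
    rows = []
    for when, user, value in points:
        last[user] = value
        rows.extend((when, u, last[u]) for u in users)
    return rows

# Step 1 stand-in: 10 users answering 10 times each (synthetic timestamps).
users = [f"user{u}" for u in range(10)]
responses = [(20 * k + u, f"user{u}", "yes") for k in range(10) for u in range(10)]

points = add_expirations(responses, ttl=10)   # every answer expires before the next one
rows = fill_all_users(points, users)
print(len(responses), len(points), len(rows))   # 100 200 2000
```

With `ttl=30` only each user's final answer expires, so the same input yields 110 points and 1100 rows, which shows how the middle number depends on the data.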

A test case validating at least these "numbers" should be written, since this feature will be very complex and prone to errors. Keep in mind that the final 10-fold increase is static, whereas the increase from expiration points is dynamic.
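One way to state the numbers such a test would check, using symbols that are not from the discussion and are introduced here only for illustration: U for the number of users, R for the number of responses inside the time slider, and E for the number of expiration points that actually get added.

```latex
N_{\text{points}} = U \cdot (R + E), \qquad 0 \le E \le R

% Example from above: U = 10, R = 100
% E = 100 (every response expires):  N_points = 10 * (100 + 100) = 2000
% E = 0   (no response expires):     N_points = 10 * (100 + 0)   = 1000
```

The factor U is the static part, while E depends on how the answers are spaced relative to the expiration period.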