Plotting Duration & Completion

SachaG commented 1 year ago

Charts by @ShaineRosewel

[Three chart images]

SachaG commented 1 year ago

Things to look into:

Is duration missing from the dataset?
Should documents with ~0 min duration be removed from the dataset altogether?
What algorithm to use to remove outliers?
How can there be responses with 100% completion but very low duration?

SachaG commented 1 year ago

I found 131 responses with 100% completion but <5 min duration, among which 9 had a 1 min duration. So I do think there is a problem with the way completion is calculated, but it doesn't seem to be too widespread.
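A check like the one described above could be sketched as follows. This is a minimal illustration, not the query actually used: the record shape and the field names `completion` (percent) and `duration` (minutes) are assumptions.

```python
# Hypothetical record shape: one dict per response, with "completion"
# (percent, 0-100) and "duration" (minutes). Field names are assumptions.
def suspicious_responses(responses, max_minutes=5):
    """Return responses marked 100% complete that took under max_minutes."""
    return [
        r for r in responses
        if r["completion"] == 100 and r["duration"] < max_minutes
    ]


sample = [
    {"completion": 100, "duration": 1},
    {"completion": 100, "duration": 4.5},
    {"completion": 100, "duration": 18},
    {"completion": 60, "duration": 2},
]
flagged = suspicious_responses(sample)
```

On the made-up sample, only the first two records are flagged: the third took long enough and the fourth is not marked complete.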

ShaineRosewel commented 1 year ago

I will try to recalculate completion based on the entries.

ShaineRosewel commented 1 year ago

@LeaVerou, may I know what question we specifically would like to address when checking the duration data?

SachaG commented 1 year ago

By the way this chart is wrong, the percentages don't add up to 100%. I fixed it locally and will redeploy a new version soon.

Actually, in the new version it's even more imbalanced with >70% of respondents in the 90-100% bracket. Which made me wonder if maybe we should use a different data visualization, like maybe not grouping the items and showing all 100 bars? But then the axis labels would get very messy…
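For reference, here is a small sketch of bracketed shares that always sum to 100%. The thread does not say what caused the original mismatch; one common cause is a boundary value (such as exactly 100) falling outside every bin, so the sketch folds it into the top bracket. The function name and bracket labels are hypothetical.

```python
from collections import Counter


def completion_brackets(completions, width=10):
    """Share of respondents (in percent) per completion bracket.

    A completion of exactly 100 is folded into the top bracket, so each
    respondent is counted exactly once and the shares sum to 100.
    Assumes a non-empty list of values between 0 and 100.
    """
    top = 100 // width - 1  # index of the last bracket, e.g. 9 for 90-100%
    counts = Counter(min(int(c // width), top) for c in completions)
    total = len(completions)
    return {
        f"{b * width}-{(b + 1) * width}%": 100 * counts.get(b, 0) / total
        for b in range(top + 1)
    }


shares = completion_brackets([100, 100, 95, 92, 91, 99, 97, 100, 40, 5])
```

Calling it with `width=1` gives the ungrouped "all 100 bars" variant mentioned above, at the cost of the crowded axis labels.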

ShaineRosewel commented 1 year ago

@SachaG There must be a problem with the completion variable for CSS 2023. I filtered the data to include only responses with a completion value of 100, then counted the NAs in every column starting with 'features'. Below is what I got. This needs to be checked. (I can try to recalculate it for the info Lea requested.)

This is the reason for one of your listed questions:

How can there be responses with 100% completion but very low duration?
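The check described in this comment (keep only rows marked 100% complete, then count missing values in the 'features' columns) could look roughly like this. It is a sketch only: the column names are invented, and `None` stands in for NA.

```python
def missing_feature_answers(rows):
    """For rows marked 100% complete, count missing values per 'features' column."""
    complete = [r for r in rows if r.get("completion") == 100]
    counts = {}
    for row in complete:
        for col, value in row.items():
            if col.startswith("features"):
                # True counts as 1, so this adds one per missing answer
                counts[col] = counts.get(col, 0) + (value is None)
    return counts


# Hypothetical column names, for illustration only
rows = [
    {"completion": 100, "features__grid": "used", "features__subgrid": None},
    {"completion": 100, "features__grid": None, "features__subgrid": None},
    {"completion": 50, "features__grid": None, "features__subgrid": None},
]
na_counts = missing_feature_answers(rows)
```

Any non-zero count here means a response is flagged as fully complete while still missing answers, which is the inconsistency being reported.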

ShaineRosewel commented 1 year ago

Is duration missing from the dataset?

It is, across CSS 2022, CSS 2023, and JS 2022. But as you mentioned, it can simply be calculated as (updatedAt - createdAt).
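A minimal sketch of that calculation, assuming the two fields are stored as ISO 8601 strings (the storage format is not stated in this thread):

```python
from datetime import datetime


def duration_minutes(created_at, updated_at):
    """Minutes elapsed between two ISO 8601 timestamps."""
    start = datetime.fromisoformat(created_at)
    end = datetime.fromisoformat(updated_at)
    return (end - start).total_seconds() / 60


# Made-up timestamps, for illustration only
minutes = duration_minutes("2023-06-01T10:00:00", "2023-06-01T10:14:30")
```

Note that a duration derived this way includes any idle time between the first and last save, which is one reason some outlier handling is needed.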

ShaineRosewel commented 1 year ago

@LeaVerou

I recalculated completion for each respondent by taking the percentage of NAs on the important questions (here, I excluded questions prefixed with user_info, IDs, surveySlug, etc.), then subtracting it from 100.
I also excluded durations flagged as outliers, for each facet individually. To do this, for each survey: 1) take the log of the duration, which prevents a negative lower fence in step 2; 2) use the IQR to flag outliers: any value below the lower fence or above the upper fence is an outlier; 3) exponentiate the remaining values to undo the log. (@SachaG, this is the answer to "What algorithm to use to remove outliers?")
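The two steps above can be sketched as follows. This is an illustration under stated assumptions, not the code that produced the charts: the excluded prefixes, the column names, the fence multiplier `k=1.5` (the usual Tukey value, not stated in the thread), and the quartile method are all guesses.

```python
import math
import statistics

# Assumed prefixes for non-question columns; the real list is longer ("etc.")
EXCLUDED_PREFIXES = ("user_info", "_id", "surveySlug")


def recalculated_completion(row):
    """100 minus the percentage of missing answers on question columns."""
    cols = [c for c in row if not c.startswith(EXCLUDED_PREFIXES)]
    missing = sum(row[c] is None for c in cols)
    return 100 - 100 * missing / len(cols)


def log_iqr_filter(durations, k=1.5):
    """Keep durations whose log lies within the IQR fences.

    Run this once per survey. Returning the original values is equivalent
    to exponentiating the logs that were kept (step 3).
    """
    positive = [d for d in durations if d > 0]  # log is undefined at 0
    logs = [math.log(d) for d in positive]
    q1, _, q3 = statistics.quantiles(logs, n=4)
    iqr = q3 - q1
    lower, upper = q1 - k * iqr, q3 + k * iqr
    return [d for d, x in zip(positive, logs) if lower <= x <= upper]


row = {
    "user_info__country": "PH",
    "surveySlug": "css2023",
    "features__grid": "used",
    "features__subgrid": None,
    "tools__sass": "used",
    "tools__less": None,
}
completion = recalculated_completion(row)

kept = log_iqr_filter([10, 0.01, 12, 14, 15, 16, 18, 20, 22, 25, 5000])
```

Working on the log scale keeps both fences positive once mapped back to minutes, so very short durations can be flagged as well as very long ones.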

Here, we can see that JS seems to take less time than both CSS surveys. Yet if the number of columns is a fair proxy for survey length, JS is the longest of the three (351 columns vs. 270 and 260 for CSS 22 and CSS 23, respectively). @SachaG Let me know if I missed something.

LeaVerou commented 1 year ago

Thanks @ShaineRosewel, these are very comprehensive for glancing at, but could I please also have medians, means, and stdev as I want to calculate something?

ShaineRosewel commented 1 year ago

survey	mean	median	sd
cs22	19.8	14.5	19.0
cs23	23.7	16.4	25.0
js22	19.2	15.7	13.3

The values are this spread out because respondents have varying completion percentages. Let me know if you are after records with a high completion percentage.
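For anyone wanting to reproduce a table like this, a per-survey summary could be computed along these lines. The input shape and the sample numbers are made up, and the sample standard deviation (n - 1) is an assumption about how sd was computed.

```python
import statistics
from collections import defaultdict


def summarize(records):
    """Mean, median and sample sd of duration (minutes) per survey.

    `records` is an iterable of (survey, minutes) pairs; each survey
    needs at least two durations for the sd to be defined.
    """
    by_survey = defaultdict(list)
    for survey, minutes in records:
        by_survey[survey].append(minutes)
    return {
        survey: {
            "mean": statistics.mean(values),
            "median": statistics.median(values),
            "sd": statistics.stdev(values),
        }
        for survey, values in by_survey.items()
    }


# Made-up durations, not the survey data
summary = summarize([
    ("css2022", 10), ("css2022", 14), ("css2022", 30),
    ("js2022", 12), ("js2022", 16),
])
```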

LeaVerou commented 1 year ago

Thanks for the fast response! Yes, I think if we narrow it down to respondents with a high completion percentage it may be better. Would that be too much hassle to calculate?

ShaineRosewel commented 1 year ago

The sd is still large, even though this includes only responses with at least 80% completion. A large sd is expected since we are dealing with response times. If this isn't okay for your purpose, we can use the log scale instead, so that the data first resembles a normal curve; that way, sd becomes a better measure of dispersion.

survey	mean	median	sd
cs22	22.3	16.2	18.5
cs23	26.9	18.3	24.6
js22	20.7	16.7	12.8
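The log-scale option mentioned above could be sketched like this. Reporting the result as a geometric mean and geometric sd is one common way to present it, not necessarily what was intended here, and the function name is hypothetical.

```python
import math
import statistics


def log_scale_summary(durations):
    """Summarize durations on the log scale.

    Takes the mean and sample sd of log(duration), then maps them back:
    the geometric mean is in minutes, the geometric sd is a unitless
    multiplicative factor.
    """
    logs = [math.log(d) for d in durations if d > 0]
    mu = statistics.mean(logs)
    sigma = statistics.stdev(logs)
    return {"geometric_mean": math.exp(mu), "geometric_sd": math.exp(sigma)}


# Made-up durations, not the survey data
result = log_scale_summary([5, 10, 20])
```

A geometric sd of 2 reads as "typical values fall within a factor of 2 of the geometric mean", which is often easier to interpret for right-skewed response times than an sd in minutes.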

ShaineRosewel commented 1 year ago

Let me know @LeaVerou if I can help with what you are trying to compute!

LeaVerou commented 1 year ago

I was trying to compute a very rough measure of the additional time needed to answer 5-answer questions compared to the 3-answer feature questions, but the data is too noisy for that, and there are way too many confounds and factors that differ even across different years of the same survey.

State of CSS 2023 had 56 feature questions and 26 5-answer questions.
State of CSS 2022 had 53 feature questions and 26 5-answer questions.
State of JS 2022 had 34 feature questions and 64 5-answer questions.

If we assume that the time to fill in the rest of the questions was roughly the same across both years of State of CSS, that would give us that the additional 3 feature questions cost 2-4 extra minutes (depending on whether we use medians or means), which makes no sense at all, so the rest of the survey must have been substantially different. And it's impossible to compare with State of JS, since the rest of the survey is so different. If you can think of any way to calculate this, great; personally I'm out of ideas (which doesn't happen often 😅).
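The arithmetic behind the "2-4 extra minutes" figure can be reproduced as below. It assumes the figure comes from the first summary table in this thread (all responses, durations in minutes); the per-question numbers are only there to show why the result looks implausible.

```python
# Values copied from the first summary table and the question counts above
css22 = {"feature_questions": 53, "mean": 19.8, "median": 14.5}
css23 = {"feature_questions": 56, "mean": 23.7, "median": 16.4}

extra_questions = css23["feature_questions"] - css22["feature_questions"]
extra_median = round(css23["median"] - css22["median"], 1)  # minutes
extra_mean = round(css23["mean"] - css22["mean"], 1)        # minutes

# Implied cost of a single feature question, in minutes
per_question = (
    round(extra_median / extra_questions, 2),
    round(extra_mean / extra_questions, 2),
)
```

That works out to roughly 0.6 to 1.3 minutes per 3-answer question, which supports the conclusion that something other than the 3 extra questions changed between the two years.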

ShaineRosewel commented 1 year ago

I'll try to think of a way and let you know if I have an idea. Our duration covers the entire questionnaire. I think this would be a lot easier to do if we had a per-item response time.

Devographics / Monorepo

Plotting Duration & Completion #260