Devographics / Monorepo

Monorepo containing the State of JS apps
surveyform-sigma.vercel.app

Plotting Duration & Completion #260

Open SachaG opened 1 year ago

SachaG commented 1 year ago

Charts by @ShaineRosewel

[Three chart images]

SachaG commented 1 year ago

Things to look into:

SachaG commented 1 year ago

I found 131 responses with 100% completion but <5 min duration, among which 9 had a 1 min duration. So I do think there is a problem with the way completion is calculated, but it doesn't seem to be too widespread.
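A minimal sketch of the kind of filter described above, assuming each response is a record with a completion percentage and a duration in minutes. The field names `completion` and `duration_min` and the sample rows are hypothetical, not the real export schema.

```python
# Hypothetical records: `completion` (percent) and `duration_min` (minutes)
# are assumed field names, and the rows are made up for illustration.
responses = [
    {"id": "a", "completion": 100, "duration_min": 1.0},
    {"id": "b", "completion": 100, "duration_min": 4.5},
    {"id": "c", "completion": 100, "duration_min": 18.0},
    {"id": "d", "completion": 60, "duration_min": 3.0},
]


def suspicious(rows, max_minutes=5.0):
    """Return fully completed responses that finished in under max_minutes."""
    return [
        r for r in rows
        if r["completion"] == 100 and r["duration_min"] < max_minutes
    ]


flagged = suspicious(responses)
print(len(flagged), [r["id"] for r in flagged])
```

Lowering `max_minutes` to 1.5 would isolate the shortest cases, similar to the 1 min subset mentioned above.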

ShaineRosewel commented 1 year ago

I will try to recalculate completion based on the entries.

ShaineRosewel commented 1 year ago

@LeaVerou, may I know what question we specifically would like to address when checking the duration data?

SachaG commented 1 year ago

Screenshot 2023-08-15 at 15 38 27

By the way, this chart is wrong: the percentages don't add up to 100%. I fixed it locally and will redeploy a new version soon.

Actually, in the new version it's even more imbalanced with >70% of respondents in the 90-100% bracket. Which made me wonder if maybe we should use a different data visualization, like maybe not grouping the items and showing all 100 bars? But then the axis labels would get very messy…

ShaineRosewel commented 1 year ago

@SachaG There must be a problem with the completion variable for CSS 2023. I filtered the data so that it only includes responses with a completion value of 100, then counted the NAs in every column whose name starts with 'features'. Below is what I got. This needs to be checked. (I can try to recalculate it for the info Lea requested.)

This is the reason for one of your listed questions:

How can there be responses with 100% completion but very low duration?

Screenshot 2023-08-15 at 4 59 19 PM
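
The check described above (keep only responses marked 100% complete, then count missing values per 'features' column) could be sketched roughly like this. The column names and rows are hypothetical, and `None` stands in for NA.

```python
# Hypothetical rows: column names are made up, `None` plays the role of NA.
rows = [
    {"completion": 100, "features__grid": "used", "features__subgrid": None, "user_info__age": "25"},
    {"completion": 100, "features__grid": None, "features__subgrid": None, "user_info__age": None},
    {"completion": 40, "features__grid": None, "features__subgrid": None, "user_info__age": "31"},
]


def missing_feature_counts(rows, prefix="features"):
    """Count missing values per `prefix` column among 100%-completion rows."""
    complete = [r for r in rows if r["completion"] == 100]
    counts = {}
    for r in complete:
        for key, value in r.items():
            if key.startswith(prefix):
                # Add 1 when the value is missing, 0 otherwise.
                counts[key] = counts.get(key, 0) + (1 if value is None else 0)
    return counts


na_counts = missing_feature_counts(rows)
print(na_counts)
```

If completion were computed correctly, one would expect few or no missing feature answers among rows marked as 100% complete.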

ShaineRosewel commented 1 year ago

  • Is duration missing from the dataset?

It is, across CSS 2022, CSS 2023, and JS 2022. But as you mentioned, it can simply be calculated using (updatedAt - createdAt).
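
A small sketch of that calculation, assuming `createdAt` and `updatedAt` are available as ISO 8601 timestamp strings (the actual storage format in the dataset is an assumption here):

```python
from datetime import datetime


def duration_minutes(created_at: str, updated_at: str) -> float:
    """Duration in minutes between two ISO 8601 timestamps."""
    created = datetime.fromisoformat(created_at)
    updated = datetime.fromisoformat(updated_at)
    return (updated - created).total_seconds() / 60


# Made-up timestamps for illustration.
example = duration_minutes("2023-08-15T15:00:00", "2023-08-15T15:16:30")
print(example)
```

Note that this measures the time between the first and last save, so a respondent who leaves the survey open and comes back later will show an inflated duration.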

ShaineRosewel commented 1 year ago

@LeaVerou

Here, we can see that JS seems to take less time than both CSS surveys. If survey length can be judged by the number of columns in each dataset, JS is the longest of the three (351 columns vs. 270 and 260 for CSS 2022 and CSS 2023, respectively). @SachaG Let me know if I missed something.

Screenshot 2023-08-15 at 9 13 29 PM

LeaVerou commented 1 year ago

Thanks @ShaineRosewel, these are very comprehensive for glancing at, but could I please also have medians, means, and stdev as I want to calculate something?

ShaineRosewel commented 1 year ago

survey  mean (min)  median (min)  sd (min)
cs22    19.8        14.5          19.0
cs23    23.7        16.4          25.0
js22    19.2        15.7          13.3

The values are so spread out because respondents have varying completion percentages. Let me know if you are after records that have a high completion percentage.
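
For reference, summary statistics like these can be computed along the following lines. This is only a sketch with made-up durations and survey names; `stdev` is the sample standard deviation, and whether the figures above use the sample or population version is not stated.

```python
from statistics import mean, median, stdev

# Made-up durations in minutes, keyed by hypothetical survey names.
durations = {
    "survey_a": [10.0, 12.0, 14.0, 40.0],
    "survey_b": [8.0, 16.0, 24.0],
}


def summarize(values):
    """Mean, median, and sample standard deviation of a list of durations."""
    return {"mean": mean(values), "median": median(values), "sd": stdev(values)}


summary = {name: summarize(values) for name, values in durations.items()}
for name, stats in summary.items():
    print(name, stats)
```

In `survey_a` a single 40-minute response pulls the mean well above the median, which is the same pattern visible in the real figures.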

LeaVerou commented 1 year ago

Thanks for the fast response! Yes, I think if we narrow it down to respondents with a high completion percentage it may be better. Would that be too much hassle to calculate?

ShaineRosewel commented 1 year ago

The sd is still large. This includes only responses with at least 80% completion. A large sd is expected since we are dealing with response times. If this isn't okay for your purpose, we can use the log scale: we first make sure the log-transformed data resembles a normal curve, and that way sd becomes a better measure of dispersion.

survey  mean (min)  median (min)  sd (min)
cs22    22.3        16.2          18.5
cs23    26.9        18.3          24.6
js22    20.7        16.7          12.8
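
The log-scale idea mentioned above could look roughly like this: take the log of each duration, compute mean and sd there, and exponentiate back to get a geometric mean (in minutes) and a multiplicative spread factor. This is a sketch with made-up numbers, not a computation on the actual data.

```python
import math
from statistics import mean, stdev


def log_summary(durations_min):
    """Summarize durations on the log scale (non-positive values are skipped)."""
    logs = [math.log(d) for d in durations_min if d > 0]
    mu, sigma = mean(logs), stdev(logs)
    return {
        "geometric_mean": math.exp(mu),   # typical duration, in minutes
        "geometric_sd": math.exp(sigma),  # unitless multiplicative spread
    }


# Made-up durations in minutes.
result = log_summary([5.0, 10.0, 20.0])
print(result)
```

A geometric sd of 2 would mean that typical responses fall within a factor of 2 of the geometric mean, which is often easier to interpret for right-skewed response times than a raw sd.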

ShaineRosewel commented 1 year ago

Let me know @LeaVerou if I can help with what you are trying to compute!

LeaVerou commented 1 year ago

I was trying to compute a very rough measure of the additional time needed to answer 5-answer questions compared to the 3-answer feature questions, but the data is too noisy for that, and there are way too many confounds and factors that differ even across different years of the same survey.

If we assume that the time to fill in the rest of the questions was roughly the same across both years of State of CSS, that would give us that the additional 3 feature questions cost 2-4 extra minutes (depending on whether we use medians or means), which makes no sense at all, so the rest of the survey must have been substantially different. And it's impossible to compare with State of JS, since the rest of the survey is so different. If you can think of any way to calculate this, great; personally I'm out of ideas (which doesn't happen often 😅).
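
The 2-4 minute range is consistent with simply subtracting the cs22 figures from the cs23 figures posted earlier in the thread. Which of the two tables was actually used is an assumption, so this sketch shows both:

```python
# Figures copied from the two summary tables in this thread (minutes).
all_responses = {
    "cs22": {"mean": 19.8, "median": 14.5},
    "cs23": {"mean": 23.7, "median": 16.4},
}
at_least_80_pct = {
    "cs22": {"mean": 22.3, "median": 16.2},
    "cs23": {"mean": 26.9, "median": 18.3},
}


def year_over_year(table):
    """Difference cs23 - cs22 for each statistic, rounded to 0.1 min."""
    return {
        stat: round(table["cs23"][stat] - table["cs22"][stat], 1)
        for stat in ("median", "mean")
    }


diff_all = year_over_year(all_responses)
diff_80 = year_over_year(at_least_80_pct)
print(diff_all, diff_80)
```

Both versions attribute the entire year-over-year difference to the changed questions, which is exactly the assumption questioned above.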

ShaineRosewel commented 1 year ago

I'll try to think of a way and let you know if I have an idea. Our duration is for the entire questionnaire. I think this would be a lot easier to do if we had per-item response times.