ProjectSidewalk / sidewalk-quality-analysis

An analysis of Project Sidewalk user quality based on interaction logs

Need more validations for some users #57

Open jonfroehlich opened 4 years ago

jonfroehlich commented 4 years ago

I just posted this to the main Sidewalk repo (https://github.com/ProjectSidewalk/SidewalkWebpage/issues/2139). In short, a significant number of users have completed one mission but have yet to receive 10 validations.

Hmm, I need to check how many of those users actually supplied 10 labels... let me check on that.

UPDATE: OK, this isn't as bad as I thought: 151 of these 194 (78%) never even supplied 10 labels, so they couldn't reach the minimum validation threshold (there aren't even 10 labels to validate!). Still, this means that 41 users have yet to receive the minimum number of validations... not sure how many of these are 'no sidewalk.'

178/898 users (19.8%) never completed even one mission. Removing them from analysis.
151/627 users (24.1%) have not provided the minimum number of labels of 10. Removing them from analysis.
41/428 users (9.6%) have not received the min num of validations of 10. Removing them from analysis.
We filtered out 525/898 users (58.46%)
373 total users remain across the 4 cities
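
For reference, a staged filter that prints log lines like the ones above could be sketched roughly as follows. This is only an illustration, not the notebook's actual code: the `UserStats` fields and the `filter_stage` / `filter_users` helpers are hypothetical names, and only the two thresholds of 10 come from the output above.

```python
from dataclasses import dataclass

MIN_LABELS = 10        # threshold quoted in the log output
MIN_VALIDATIONS = 10   # threshold quoted in the log output


@dataclass
class UserStats:
    # Hypothetical per-user summary; the real data dump's columns may differ.
    user_id: str
    missions_completed: int
    label_count: int
    validation_count: int


def filter_stage(users, keep, description):
    """Drop users that fail `keep`, print one log line, return the survivors."""
    kept = [u for u in users if keep(u)]
    removed = len(users) - len(kept)
    pct = 100.0 * removed / len(users) if users else 0.0
    print(f"{removed}/{len(users)} users ({pct:.1f}%) {description}. "
          "Removing them from analysis.")
    return kept


def filter_users(users):
    """Apply the three filters in order and report the overall reduction."""
    total = len(users)
    users = filter_stage(users, lambda u: u.missions_completed >= 1,
                         "never completed even one mission")
    users = filter_stage(users, lambda u: u.label_count >= MIN_LABELS,
                         f"have not provided the minimum number of labels of {MIN_LABELS}")
    users = filter_stage(users, lambda u: u.validation_count >= MIN_VALIDATIONS,
                         f"have not received the min num of validations of {MIN_VALIDATIONS}")
    removed = total - len(users)
    pct = 100.0 * removed / total if total else 0.0
    print(f"We filtered out {removed}/{total} users ({pct:.2f}%)")
    print(f"{len(users)} total users remain")
    return users
```

Because each stage takes the survivors of the previous one, the denominator of each log line should equal the previous denominator minus the users removed so far, which is a quick sanity check on the printed numbers.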

jonfroehlich commented 2 years ago

The updated stats on this are:

0/1512 users (0.0%) have not provided the min num of labels of 10. Removing them from analysis.
0/1512 users (0.0%) have not received the min num of validations of 10. Removing them from analysis.

@misaugstad, do you prefilter users out who have not yet received the minimum number of 10 validations in your data dumps to me? I think I'd prefer to do that filtering on my end so we can see how many more validations we need...
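
Something like the sketch below is what I have in mind for counting the missing validations on the Python side. It assumes the dump includes every user along with a per-user validation count; the `validation_shortfall` helper and the sample counts are made up for illustration.

```python
MIN_VALIDATIONS = 10  # minimum number of validations discussed in this issue


def validation_shortfall(validation_counts, min_validations=MIN_VALIDATIONS):
    """Map user_id -> validations still needed, for users below the threshold.

    `validation_counts` is a hypothetical dict of user_id -> validations
    received so far; the real data dump may be structured differently.
    """
    return {
        user_id: min_validations - count
        for user_id, count in validation_counts.items()
        if count < min_validations
    }


# Made-up counts, for illustration only.
counts = {"user_a": 12, "user_b": 7, "user_c": 0}
shortfall = validation_shortfall(counts)
total_needed = sum(shortfall.values())
print(f"{len(shortfall)} users need {total_needed} more validations in total")
```

This only works if the users below the threshold are still in the data, which is why the filtering would need to move out of the SQL step.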

misaugstad commented 2 years ago

Agreed. There's too much code being written in SQL that should be happening in Python. I can remove the filtering from my end and generate new datasets for you today if you'd like.

misaugstad commented 2 years ago

I'm about to upload new CSVs with the min validations requirement removed. I'll let you close this issue once you've made any changes that you want to make on the Python notebook side.