Split out accuracies per label type

jonfroehlich commented 5 years ago

Perhaps there is signal in how users perform on different label types. For example, someone who does poorly on curb ramps is probably not a very accurate labeler overall. We should explore this.

daotyl000 commented 5 years ago

Do I need a new CSV of label correctness? The current CSV contains each label's validation, label id, and user id, but I can't see the label type unless I look it up manually.
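One way to avoid the manual lookup, sketched under assumptions: if a second export mapping label ids to label types is available, the two files can be joined on label id. The column names (`user_id`, `label_id`, `correct`, `label_type`), the sample rows, and the helper `attach_label_types` are all hypothetical and not taken from the actual CSV.

```python
import csv
import io


def attach_label_types(correctness_rows, label_type_rows):
    """Add a label_type field to each correctness row by joining on label_id.

    Rows whose label_id has no match get label_type = None.
    """
    type_by_label = {row["label_id"]: row["label_type"] for row in label_type_rows}
    joined = []
    for row in correctness_rows:
        merged = dict(row)
        merged["label_type"] = type_by_label.get(row["label_id"])
        joined.append(merged)
    return joined


# Hypothetical stand-ins for the two CSV files (column names are assumptions).
correctness_csv = io.StringIO(
    "user_id,label_id,correct\n"
    "u1,101,1\n"
    "u1,102,0\n"
    "u2,103,1\n"
)
label_types_csv = io.StringIO(
    "label_id,label_type\n"
    "101,CurbRamp\n"
    "102,Obstacle\n"
    "103,SurfaceProblem\n"
)

joined_rows = attach_label_types(
    list(csv.DictReader(correctness_csv)),
    list(csv.DictReader(label_types_csv)),
)
```

With real files, the `io.StringIO` objects would be replaced by opened file handles.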

jonfroehlich commented 5 years ago

Sounds like you do then. Ping Mikey. :)

-- Jon Froehlich Associate Professor Paul G. Allen School of Computer Science & Engineering University of Washington http://makeabilitylab.io @jonfroehlich https://twitter.com/jonfroehlich - Twitter Help make sidewalks more accessible: http://projectsidewalk.io

daotyl000 commented 5 years ago

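Blue = Good user
Red = Overall bad user
Yellow = Bad user in neighborhoods w/o sidewalks

[image: Screen Shot 2019-07-18 at 3 49 40 PM] https://user-images.githubusercontent.com/28814007/61497285-d1573f80-a973-11e9-9c6c-04945e1f8fa7.png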

jonfroehlich commented 5 years ago

Can you summarize your findings here please? What is the y-axis?


daotyl000 commented 5 years ago

The y-axis is the user's overall accuracy across all labels. The x-axis is the user's accuracy for the specific label type, with 1.0 being 100% accuracy.

Almost all users have a high curb ramp accuracy: only one person was below 75% on curb ramps, and they have about a 55% overall accuracy.

I'm not sure how to interpret the no curb ramp label, because users at every level of overall accuracy are spread across the range of accuracies for that label type.

A user's overall accuracy is heavily correlated with their obstacle accuracy and their surface problem accuracy. This is what we expected: these are harder labels to get right, so a user's ability to identify them properly appears to be a good indicator of their overall skill.

jonfroehlich commented 5 years ago

@daotyl000 and I talked about this in person. We cannot do a correlative analysis where we reuse data for both the x- and y-axes (as we do here). In other words, we cannot graph a subset of the y-axis data on the x-axis. Instead, you should exclude the label type plotted on the x-axis from the y-axis 'overall' calculation.

Does this make sense @daotyl000?
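A minimal sketch of the suggested fix, assuming each validated label is available as a `(user_id, label_type, is_correct)` tuple: for a chosen label type, the x value is the user's accuracy on that type and the y value is their accuracy on all other label types, so no label is counted on both axes. The function name `leave_one_type_out`, the tuple layout, and the sample rows are hypothetical.

```python
from collections import defaultdict


def leave_one_type_out(rows, label_type):
    """Return {user_id: (accuracy on label_type, accuracy on all other types)}.

    rows is an iterable of (user_id, label_type, is_correct) tuples.
    Users with no labels on either side of the split are skipped.
    """
    # Per user: [type_correct, type_total, rest_correct, rest_total]
    counts = defaultdict(lambda: [0, 0, 0, 0])
    for user_id, row_type, is_correct in rows:
        user_counts = counts[user_id]
        if row_type == label_type:
            user_counts[0] += int(is_correct)
            user_counts[1] += 1
        else:
            user_counts[2] += int(is_correct)
            user_counts[3] += 1

    points = {}
    for user_id, (type_ok, type_n, rest_ok, rest_n) in counts.items():
        if type_n and rest_n:
            points[user_id] = (type_ok / type_n, rest_ok / rest_n)
    return points


# Hypothetical sample data, not real validation results.
rows = [
    ("u1", "CurbRamp", True),
    ("u1", "CurbRamp", True),
    ("u1", "Obstacle", False),
    ("u1", "SurfaceProblem", True),
    ("u2", "CurbRamp", False),
    ("u2", "Obstacle", True),
]

points = leave_one_type_out(rows, "CurbRamp")
```

Each resulting pair can then be plotted as one point per user, with the per-type accuracy on the x-axis and the remaining-types accuracy on the y-axis.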

ProjectSidewalk / sidewalk-quality-analysis

Split out accuracies per label type #26