ProjectSidewalk / sidewalk-quality-analysis

An analysis of Project Sidewalk user quality based on interaction logs

Analyze user accuracy vs frequency of going in reverse #35

Closed: daotyl000 closed this issue 4 years ago

daotyl000 commented 5 years ago

Is there a correlation between accuracy and a user's tendency to go in reverse? Do users who go back and review their previous panos do better at catching mistakes or missed labels? This could be tracked using the PanoId_Changed action, by counting how often a pano id shows up more than once.
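A minimal sketch of that counting idea, assuming each log entry can be reduced to an `(action, pano_id)` pair in log order. The tuple layout, the `count_revisits` helper, and the `Some_Other_Action` name are illustrative assumptions, not the project's actual log schema.

```python
from collections import Counter

def count_revisits(events):
    """Count pano revisits in one user's interaction log.

    `events` is a list of (action, pano_id) tuples in log order.
    This layout is assumed for illustration only.
    """
    # Keep only the pano ids attached to PanoId_Changed actions.
    pano_ids = [pano for action, pano in events if action == "PanoId_Changed"]
    visits = Counter(pano_ids)
    # Every appearance of a pano id after its first counts as one revisit,
    # which equals (number of pano changes) - (number of unique pano ids).
    return sum(n - 1 for n in visits.values())

log = [
    ("PanoId_Changed", "a"),
    ("PanoId_Changed", "b"),
    ("Some_Other_Action", "b"),  # hypothetical non-navigation action
    ("PanoId_Changed", "a"),
    ("PanoId_Changed", "c"),
    ("PanoId_Changed", "a"),
]
print(count_revisits(log))  # 2
```

Note that this counts any repeat of a pano id, whatever caused it, so it cannot tell a deliberate reversal from any other reason for passing through the same pano again.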

jonfroehlich commented 5 years ago

Great idea!

daotyl000 commented 5 years ago

I quantified revisited panos as the difference between the number of pano changes and the number of unique pano ids recorded. Every user but one who had revisited at least 500 panos was a good user. The good users split almost in half (9 low revisit, 10 high revisit), while the bad users were mostly low revisit (9 low revisit, 1 high revisit). This appears to support the idea that users who go back and revisit panos more often have higher accuracy, because they fix mistakes or label things they missed.

Screen Shot 2019-08-01 at 3 38 10 PM

In the plot, 0.12 is the correlation coefficient and 0.53 is the significance value (p-value).
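For reference, a correlation coefficient and a significance value for two per-user series can be sketched with the standard library alone. This is not necessarily how the numbers above were produced: the thread does not say which test was used, so the permutation test here is a swapped-in stand-in, and the data are made up.

```python
import math
import random

def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)

def permutation_p_value(xs, ys, trials=10_000, seed=0):
    """Two-sided p-value: share of shuffles with |r| at least as large."""
    rng = random.Random(seed)
    observed = abs(pearson_r(xs, ys))
    shuffled = list(ys)
    hits = 0
    for _ in range(trials):
        rng.shuffle(shuffled)  # break any real pairing between xs and ys
        if abs(pearson_r(xs, shuffled)) >= observed:
            hits += 1
    return hits / trials

# Made-up numbers for illustration only, not data from this analysis.
revisits = [12, 40, 7, 150, 88, 23, 510, 64]
accuracy = [0.61, 0.70, 0.55, 0.66, 0.72, 0.58, 0.81, 0.69]
r = pearson_r(revisits, accuracy)
p = permutation_p_value(revisits, accuracy)
print(f"r = {r:.2f}, p = {p:.2f}")
```

A coefficient near 0 paired with a large p-value means the data give little evidence of a linear relationship, which is worth keeping in mind when reading the good/bad user split.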

jonfroehlich commented 5 years ago

Can you add in the correlation coefficient on the graph and in text for any and all correlation analyses?

jonfroehlich commented 5 years ago

Also, sometimes panos are revisited because of our routing algorithm (e.g., you cross through an intersection twice, once going North-South and once going East-West) rather than because the user reversed direction. So, you probably want to differentiate those cases.

daotyl000 commented 5 years ago

Would checking the timestamps and seeing whether panos were revisited within a certain amount of time (e.g., 30 seconds) be a better way of detecting pano revisits?
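A rough sketch of the time-window idea, assuming the log can be reduced to `(timestamp_seconds, pano_id)` pairs sorted by time. The pair layout, the helper name, and the 30-second default are illustrative assumptions.

```python
def count_quick_revisits(visits, window_s=30.0):
    """Count returns to a pano within `window_s` seconds of last arriving at it.

    `visits` is a list of (timestamp_seconds, pano_id) pairs sorted by time.
    This shape is an assumption, not the real log format.
    """
    last_seen = {}  # pano_id -> timestamp of the most recent arrival
    revisits = 0
    for ts, pano in visits:
        if pano in last_seen and ts - last_seen[pano] <= window_s:
            revisits += 1
        last_seen[pano] = ts
    return revisits

visits = [(0, "a"), (10, "b"), (20, "a"), (200, "b"), (205, "c"), (210, "b")]
print(count_quick_revisits(visits))  # 2
```

Here the return to `b` at 200 seconds is ignored because the previous visit was 190 seconds earlier, while the returns to `a` at 20 seconds and to `b` at 210 seconds both fall inside the window.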

jonfroehlich commented 5 years ago

I think revisiting panos within the same mission would be best.

daotyl000 commented 5 years ago

I changed how I calculate revisited panos: a revisit is now a pano id reappearing within the same mission. The points shifted toward lower x values, but the shape largely stayed the same. The correlation coefficient decreased from 0.12 to 0.07.

Screen Shot 2019-08-02 at 9 40 20 AM
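The same-mission definition can be sketched as follows, assuming each pano change can be tagged with the mission it occurred in. The `(mission_id, pano_id)` pairs and the helper name are hypothetical, not the project's actual schema.

```python
from collections import Counter

def revisits_within_missions(events):
    """Count pano ids that reappear inside the same mission.

    `events` is a list of (mission_id, pano_id) pairs, one per pano change.
    This layout is assumed for illustration only.
    """
    per_mission = Counter(events)  # (mission_id, pano_id) -> visit count
    # A pano seen again in a different mission is not counted as a revisit.
    return sum(n - 1 for n in per_mission.values())

events = [(1, "a"), (1, "b"), (1, "a"), (2, "a"), (2, "c"), (2, "c")]
print(revisits_within_missions(events))  # 2
```

Pano `a` appears three times overall, but only its second appearance in mission 1 counts. That kind of exclusion would be consistent with the x values dropping under this definition.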