ProjectSidewalk / sidewalk-quality-analysis

An analysis of Project Sidewalk user quality based on interaction logs

Analyze how often users look on both sides of the road as an indicator of user accuracy #36

Open daotyl000 opened 5 years ago

daotyl000 commented 5 years ago

How much does a user's accuracy change with how much they look around horizontally, and with whether or not they look at both sides of the road? Heading records the horizontal direction the user is looking in. The value itself isn't helpful, but we can use its range. Would we want the full 360 degrees, or would a smaller range be sufficient?

daotyl000 commented 5 years ago

This graph shows the average heading range of each user. Blue = good users, black = bad users, red = users in neighborhoods without sidewalks.

Screen Shot 2019-08-01 at 10 56 57 AM

All users (except for one bad user) who on average looked at more than about 140 degrees of the pano were good. This suggests that users who generally look at more of the pano have higher accuracy. However, I am not sure what average range would be sufficient; only one user had an average over 225 degrees. While a lot of good users also had low ranges, the split is that almost no bad users had a heading range over about 120 degrees.

daotyl000 commented 5 years ago

When looking at the number of panos where a user viewed at least 350 degrees of the pano, we see a similar trend: almost all users who had more than 300 such panos were good users. I used 350 degrees because it covers most of the pano; when I used 360 degrees, all users combined totaled only about 6 panos.

Screen Shot 2019-08-01 at 11 18 11 AM

jonfroehlich commented 5 years ago

Can you clearly define your x-axis? In other words: What is 'average heading' and how is it calculated? Similarly, what is 'number of 350+ degree views' and how is it calculated?

Also, can you start to apply a best fit linear regression line and report the fit value?

daotyl000 commented 5 years ago

Average heading is, on average, how many degrees a user looks around a pano. Heading is a value that is attached to each action; I believe 0 is directly north.

Number of 350+ degree views means the number of panos where the user panned around to view at least 350 degrees of the pano. This is calculated by subtracting the lowest heading value from the highest heading value.
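The two metrics described above could be computed roughly as follows. This is only a sketch under assumptions: the `(user_id, pano_id, heading)` record layout, the function names, and the sample values are hypothetical and are not taken from the actual interaction logs or analysis code.

```python
from collections import defaultdict


def heading_ranges(actions):
    """Return {(user_id, pano_id): highest heading - lowest heading}, in degrees."""
    headings = defaultdict(list)
    for user_id, pano_id, heading in actions:
        headings[(user_id, pano_id)].append(heading)
    # Caveat: a plain max - min ignores wraparound at 0/360 degrees,
    # so a pan from 350 to 10 degrees is counted as 340, not 20.
    return {key: max(vals) - min(vals) for key, vals in headings.items()}


def summarize_users(actions, wide_view_threshold=350.0):
    """Per user: average heading range per pano, and the number of panos
    whose heading range is at least wide_view_threshold degrees."""
    per_user = defaultdict(list)
    for (user_id, _pano_id), heading_range in heading_ranges(actions).items():
        per_user[user_id].append(heading_range)
    return {
        user_id: {
            "avg_heading_range": sum(ranges) / len(ranges),
            "wide_view_panos": sum(1 for r in ranges if r >= wide_view_threshold),
        }
        for user_id, ranges in per_user.items()
    }


# Hypothetical sample records: (user_id, pano_id, heading in degrees).
actions = [
    ("u1", "p1", 5.0), ("u1", "p1", 180.0), ("u1", "p1", 358.0),
    ("u1", "p2", 90.0), ("u1", "p2", 120.0),
    ("u2", "p1", 40.0), ("u2", "p1", 60.0),
]
summary = summarize_users(actions)
print(summary)
```

The 350 degree threshold is kept as a parameter so that other cutoffs (for example 360) can be compared without changing the code.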

What do you mean by the fit value? Does that mean the slope?

jonfroehlich commented 5 years ago

Need correlation coefficient labels on regression line and in text.

-- Jon Froehlich Associate Professor Paul G. Allen School of Computer Science & Engineering University of Washington http://makeabilitylab.io @jonfroehlich https://twitter.com/jonfroehlich - Twitter Help make sidewalks more accessible: http://projectsidewalk.io

daotyl000 commented 5 years ago

Here is the graph calculating the average range of degrees that the user looks at per pano:

Screen Shot 2019-08-01 at 3 45 57 PM

The correlation coefficient and significance are both 0.23

Here is the graph counting the number of times a user looks at the majority of a pano:

Screen Shot 2019-08-01 at 3 44 32 PM

The correlation coefficient is 0.4 while the significance value is 0.03
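For reference, a correlation coefficient, its p-value, and a best-fit line can be obtained with scipy along these lines. This is a sketch, not the code used for the graphs above: the variable names and the per-user values are made up for illustration, and using accuracy as the y-axis is an assumption.

```python
from scipy import stats

# Hypothetical per-user values; NOT data from this issue.
avg_heading_range = [35.0, 60.0, 82.0, 110.0, 145.0, 170.0, 226.0]
accuracy = [0.52, 0.61, 0.48, 0.70, 0.66, 0.81, 0.77]

# pearsonr returns the correlation coefficient and a two-sided p-value.
r, p_value = stats.pearsonr(avg_heading_range, accuracy)

# linregress gives the slope and intercept of the least-squares line;
# its rvalue is the same Pearson correlation coefficient.
fit = stats.linregress(avg_heading_range, accuracy)

# Label text for the regression line.
label = f"R={r:.2f}"
print(label, f"p={p_value:.3f}", f"y = {fit.slope:.4f}x + {fit.intercept:.3f}")
```

With only a handful of users the p-value is rough at best, so it is reported here next to R rather than in place of it.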

jonfroehlich commented 5 years ago

What is the significance value? What does it mean? How is it calculated?

daotyl000 commented 5 years ago

According to the scipy website (p-value = significance): "The p-value roughly indicates the probability of an uncorrelated system producing datasets that have a Pearson correlation at least as extreme as the one computed from these datasets. The p-values are not entirely reliable but are probably reasonable for datasets larger than 500 or so." I wasn't sure if this is what you wanted.
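One way to make that definition concrete is a permutation test: shuffle one variable to break any real relationship, and count how often the shuffled data gives a correlation at least as extreme as the observed one. This is a different technique from the analytical p-value that scipy reports, shown only to illustrate the idea; the function names and sample values are hypothetical.

```python
import math
import random


def pearson_r(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (sd_x * sd_y)


def permutation_p_value(x, y, n_permutations=10_000, seed=0):
    """Estimate a two-sided p-value: the share of shuffles of y whose
    |r| is at least as large as the observed |r|."""
    rng = random.Random(seed)
    observed = abs(pearson_r(x, y))
    shuffled = list(y)
    extreme = 0
    for _ in range(n_permutations):
        rng.shuffle(shuffled)  # simulates an uncorrelated system
        if abs(pearson_r(x, shuffled)) >= observed:
            extreme += 1
    # The +1 terms count the observed ordering itself, so p is never 0.
    return (extreme + 1) / (n_permutations + 1)


# Hypothetical values with a perfect linear relationship.
x = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0]
y = [2.0, 4.0, 6.0, 8.0, 10.0, 12.0, 14.0, 16.0]
p = permutation_p_value(x, y)
print(pearson_r(x, y), p)
```

A small p here means that shuffled (uncorrelated) data rarely reproduces a correlation as strong as the observed one.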

jonfroehlich commented 5 years ago

Do we have a dataset of 500 or larger? Do you think this significance value holds relevance to us?

daotyl000 commented 5 years ago

No, so it can be ignored. The command that displays the correlation value also displayed that value, so I left it in.

jonfroehlich commented 5 years ago

Ok... can you format the label R=value (I think this is the standard way of doing it).

daotyl000 commented 5 years ago

Like this?

Screen Shot 2019-08-02 at 9 46 32 AM Screen Shot 2019-08-02 at 10 01 41 AM

jonfroehlich commented 5 years ago

yep!
