Open jonfroehlich opened 4 years ago
Here is the ArcGIS map of the labels from all users with an accuracy lower than 65%. The "population" scale on the side is really population density, measured in people per square mile. The Industrial District has no population density data, which I assume means no one lives in that part of Seattle. In general, it appears that the lower the population density, the lower the accuracy, which I interpret as there being fewer sidewalks because fewer people live there.
Can you do a quantitative analysis rather than a qualitative analysis?
On Thu, Aug 1, 2019 at 3:14 PM daotyl000 notifications@github.com wrote:
[image: Screen Shot 2019-08-01 at 2 55 00 PM] https://user-images.githubusercontent.com/28814007/62330575-e781f700-b46d-11e9-949b-0b207297e329.png [image: Screen Shot 2019-08-01 at 2 55 12 PM] https://user-images.githubusercontent.com/28814007/62330576-e81a8d80-b46d-11e9-8ca6-29ce6ca06345.png
-- Jon Froehlich Associate Professor Paul G. Allen School of Computer Science & Engineering University of Washington http://makeabilitylab.io @jonfroehlich https://twitter.com/jonfroehlich - Twitter Help make sidewalks more accessible: http://projectsidewalk.io
Here is my analysis: I estimated the percentages by zooming into each individual neighborhood. I changed the color scale to four distinct color groups instead of a gradient. The bottom three groups differ by about 5,000 people per square mile.
My analysis shows that as the population density increases, so does user accuracy.
The white region consists of only the Industrial District, and its labels are mainly incorrect.
Tannish Region:
Rainier Beach: about 80% correct
Harbor Island: 50% correct
Fauntleroy: 50% correct
South Park: 50% correct
Briar Cliff: 50% correct
Mid-Beacon Hill: 20% correct
South Delridge: 10% correct
Neighborhoods: 7
Average: 44.29%

Light Blue Region:
Pinehurst: 90% correct
Gatewood: 75% correct
Sunset Hill: 75% correct
West Queen Anne: 75% correct
Harrison/Denny-Blaine: 70% correct
North Beach/Blue Ridge: 50% correct
Wedgewood: 30% correct
Alki: 10% correct
Neighborhoods: 8
Average: 59.375%

Blue Region:
Loyal Heights: 90% correct
Roxhill: 90% correct
East Queen Anne: 90% correct
University District: 70% correct
North Queen Anne: 70% correct
South Lake Union: 60% correct
Maple Leaf: 50% correct
Fremont: 50% correct
Olympic Hills: 20% correct
Neighborhoods: 9
Average: 65.56%

Dark Blue Region:
Whittier Heights: 90% correct
Mann: 90% correct
Broadway: 80% correct
Neighborhoods: 3
Average: 86.67%
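As a rough check on the numbers above, here is a minimal Python sketch that recomputes the group averages from the per-neighborhood estimates and measures how strongly accuracy rises across the groups. The `GROUPS` dictionary and the `pearson` helper are names made up for this sketch, and using the color-group rank (1 to 4) as a stand-in for population density is an assumption, since the thread does not list actual density values per neighborhood.

```python
from statistics import mean

# Eyeballed accuracy estimates (percent) copied from the lists above.
# Groups are ordered from lowest to highest population-density color bin.
GROUPS = {
    "Tannish": [80, 50, 50, 50, 50, 20, 10],
    "Light Blue": [90, 75, 75, 75, 70, 50, 30, 10],
    "Blue": [90, 90, 90, 70, 70, 60, 50, 50, 20],
    "Dark Blue": [90, 90, 80],
}


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5


# Unweighted mean of the neighborhood estimates in each group.
group_means = {name: mean(values) for name, values in GROUPS.items()}

# ASSUMPTION: the group rank (1 = Tannish ... 4 = Dark Blue) is only an
# ordinal proxy for density; real densities would replace it.
ranks, accuracies = [], []
for rank, values in enumerate(GROUPS.values(), start=1):
    for value in values:
        ranks.append(rank)
        accuracies.append(value)

r = pearson(ranks, accuracies)

for name, avg in group_means.items():
    print(f"{name}: {avg:.2f}%")
print(f"Correlation between density bin and accuracy: {r:.2f}")
```

The correlation comes out positive, but with only 27 hand-estimated points and a four-level proxy it should be read as a sanity check rather than a result.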
Thanks. I originally intended for you to do this programmatically. If we think that there is some merit to this, then you could programmatically calculate population density as a predictor of user accuracy. Before doing so, perhaps you should think about the best way to do this and then propose a plan.
On Thu, Aug 1, 2019 at 4:39 PM daotyl000 notifications@github.com wrote:
[image: Screen Shot 2019-08-01 at 4 32 04 PM] https://user-images.githubusercontent.com/28814007/62333812-ece53e80-b479-11e9-95c1-71a00a9ef5c8.png
Here are some plots of population density vs. label accuracy. In general, the higher the population density, the higher the accuracy.
I'm not sure how viable the obstacle vs population graph is because it looks like that one outlier is the only reason the trend line isn't straight.
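One way to test whether a single outlier is driving a trend line is to refit the line with each point left out in turn and compare the slopes. The sketch below is only an illustration of that idea: `fit_line` and `leave_one_out_slopes` are hypothetical helpers, and the `density` and `accuracy` values are made-up placeholders, not the data behind the plots.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit; returns (slope, intercept)."""
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, my - slope * mx


def leave_one_out_slopes(xs, ys):
    """Slope of the fitted line with each point removed in turn."""
    slopes = []
    for i in range(len(xs)):
        slope, _ = fit_line(xs[:i] + xs[i + 1:], ys[:i] + ys[i + 1:])
        slopes.append(slope)
    return slopes


# PLACEHOLDER DATA (not from the real plots): four flat points plus one
# far-away point that creates the entire trend.
density = [1.0, 2.0, 3.0, 4.0, 10.0]
accuracy = [50.0, 50.0, 50.0, 50.0, 90.0]

full_slope, _ = fit_line(density, accuracy)
loo_slopes = leave_one_out_slopes(density, accuracy)

print(f"Slope with all points: {full_slope:.2f}")
for i, slope in enumerate(loo_slopes):
    print(f"Slope without point {i}: {slope:.2f}")
```

If dropping one point changes the slope far more than dropping any other, the trend depends mostly on that point. Running the same check on the actual obstacle data would show whether that is the case here.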
We can get population density information for Seattle from Raymond Fok (or, at least, he can point us in the right direction).
This relates to https://github.com/ProjectSidewalk/sidewalk-quality-analysis/issues/22