ProjectSidewalk / SidewalkWebpage

Project Sidewalk web page
http://projectsidewalk.org
MIT License

Something Going on With Anonymous Users? #791

Closed jonfroehlich closed 7 years ago

jonfroehlich commented 7 years ago

@misaugstad pointed out to me tonight that there is something strange going on with the anonymous user contributions. Notice, for example, that the top ip address audited 15 street segments but contributed 4,381 labels. This is insane!

image

So, we need to investigate. Could it be:

Who wants to investigate? @r-holland @sbower213 @adash12?

manaswisaha commented 7 years ago

@r-holland @sbower213 @adash12 Is anyone looking into this?

Since @sbower213 is now busy on a different issue, can one of you look into it?

chishankar commented 7 years ago

Yeah I can look into it

chishankar commented 7 years ago

From looking at it, my initial hypothesis is that the label count is just continuously adding each user's label count to the overall total.

For example: 73.163.171.105 did 137 Audits and created 651 labels, but it added it to the previous total of 2020 and outputted 2671 as the total Label Count.

I will look further into this to confirm my hypothesis and fix this.

jonfroehlich commented 7 years ago

@chishankar your explanation doesn't make sense to me. :P

chishankar commented 7 years ago

I believe the user's total label count gets added to the overall total label count, and that is why the numbers are inflated. I am looking into it and will post a more comprehensive explanation once I know for sure what it is.

misaugstad commented 7 years ago

After he talked to me in person, I think what he is saying is that... If you take the bottom user in that table, it says they have 1796 labels. But if you go up to the user that is 2nd from the bottom, they have 1886 labels with 2 audits. He is saying that this user actually has 1886-1796=90 labels in the 2 audits. So the number of labels is cumulative, going up the table.

I think that he will soon discover that this is certainly not how this is calculated :)

misaugstad commented 7 years ago

Oh, and the above table is in the admin dashboard, in the "Users" tab, the 2nd table (the one of anonymous users). I then sorted by label count.

sbower213 commented 7 years ago

I think I figured it out. One of the SQL subqueries was pulling a lot of duplicates, and when the number of labels from each audit was added up, it would keep adding the duplicates. I added a "select distinct" into the query and think I've fixed it. I'll do some more cross-referencing of the table to make sure I didn't omit anything, but this is what the new table looks like:

image
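For readers following along, here is a minimal sketch of how a duplicate-producing subquery can inflate a per-IP label count, and how `SELECT DISTINCT` removes the inflation. This is not the actual Project Sidewalk query or schema: the table names (`task_env`, `label`), their columns, the sample IP address, and the helper `label_counts` are all hypothetical and chosen only for illustration.

```python
import sqlite3

# Hypothetical schema, NOT the real Project Sidewalk tables.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE task_env (ip_address TEXT, audit_task_id INTEGER); -- one row per logged event
CREATE TABLE label (label_id INTEGER PRIMARY KEY, audit_task_id INTEGER);
""")

# Audit task 1 was logged three times for the same IP; task 2 was logged once.
conn.executemany(
    "INSERT INTO task_env VALUES (?, ?)",
    [("192.0.2.1", 1), ("192.0.2.1", 1), ("192.0.2.1", 1), ("192.0.2.1", 2)],
)
# Task 1 has two labels and task 2 has one, so the true total is 3.
conn.executemany("INSERT INTO label (audit_task_id) VALUES (?)", [(1,), (1,), (2,)])


def label_counts(distinct):
    """Count labels per IP, optionally de-duplicating the (ip, task) subquery."""
    keyword = "DISTINCT" if distinct else ""
    sql = f"""
        SELECT t.ip_address, COUNT(l.label_id)
        FROM (SELECT {keyword} ip_address, audit_task_id FROM task_env) AS t
        JOIN label AS l ON l.audit_task_id = t.audit_task_id
        GROUP BY t.ip_address
    """
    return conn.execute(sql).fetchall()


print(label_counts(distinct=False))  # every duplicate (ip, task) row re-counts that task's labels
print(label_counts(distinct=True))   # each task's labels are counted once
```

In this toy data, the three duplicate rows for task 1 each join against its two labels, so the count without `DISTINCT` is 3 × 2 + 1 = 7 instead of 3. That is the same multiplicative kind of inflation that would let an IP with few audits show thousands of labels.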

jonfroehlich commented 7 years ago

sweet. thanks. what's that weird 0:0:0:...1 ip address?

sbower213 commented 7 years ago

anything from localhost gets logged like that, according to stackoverflow: https://stackoverflow.com/questions/10386875/host-ip-address-00000001-on-servlet
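As a quick sanity check, `0:0:0:0:0:0:0:1` is the uncompressed form of the IPv6 loopback address `::1`. A small sketch using Python's standard `ipaddress` module shows this; the helper name `is_local_request` is made up for illustration and is not part of the Sidewalk codebase.

```python
import ipaddress

# The fully written-out IPv6 loopback address.
logged = ipaddress.ip_address("0:0:0:0:0:0:0:1")

print(logged.compressed)                      # short form of the same address
print(logged == ipaddress.ip_address("::1"))  # same address as ::1
print(logged.is_loopback)                     # loopback, i.e. localhost


def is_local_request(ip):
    """Hypothetical helper: True if the logged IP is a loopback address (IPv4 or IPv6)."""
    return ipaddress.ip_address(ip).is_loopback
```

A filter like this could be used to exclude local development traffic from an anonymous-user table, assuming that is the desired behavior.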

jonfroehlich commented 7 years ago

figured as much. thanks for confirming.

misaugstad commented 7 years ago

Resolved via #799