ProjectSidewalk / sidewalk-quality-analysis

An analysis of Project Sidewalk user quality based on interaction logs

User id 'd64b67dd-8116-429f-8592-26cbd35c3f2c' has negative meters_audited #55

Closed jonfroehlich closed 4 years ago

jonfroehlich commented 4 years ago

image

Any idea how this is possible?

misaugstad commented 4 years ago

It's a bug that I've fixed for users; I'd just need to update it on the back end. Hopefully I can do that tomorrow.

jonfroehlich commented 4 years ago

OK! So, we'll be able to recover the actual distance for this user?

On Tue, Jun 9, 2020 at 4:05 PM Mikey Saugstad notifications@github.com wrote:

It's a bug that I've fixed for users, would just need to update it on the back-end. I can do that tomorrow hopefully



misaugstad commented 4 years ago

Sort of maybe. It depends on which measurement of distance we want to use. I don't think we were going to continue using this measurement of distance anyway, so not sure it totally matters. This was just the quickest way for me to get a distance measurement to you, and I'll be using a more accurate way in the future (switching from using the distance_progress column in the mission table to measuring the lengths of streets they have completely audited).

However, I just changed all the negative distances to be 0 in the databases, so you at least should no longer see any negative numbers going forward.
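As a rough illustration of that cleanup, here is a minimal Python sketch of clamping negative distances to 0. The row layout and the `user_id` values are hypothetical; only the `meters_audited` name comes from this issue, and the actual fix was applied directly in the databases, not with this code.

```python
def clamp_negative_distances(rows):
    """Return copies of the rows with negative meters_audited replaced by 0."""
    return [
        {**row, "meters_audited": max(0.0, row["meters_audited"])}
        for row in rows
    ]


# Hypothetical example rows; the real values live in the database.
rows = [
    {"user_id": "user-a", "meters_audited": -152.4},
    {"user_id": "user-b", "meters_audited": 830.0},
]
cleaned = clamp_negative_distances(rows)
```

Note that clamping only hides the negative value; it does not recover the distance the user actually audited.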

misaugstad commented 4 years ago

Okay, we are ultimately going with this method of computing distance (the distance_progress column). It has some small issues (like the one we talked about in this thread), but any method of measuring distance is going to have some problems. And as I said above, the negative numbers should be gone now. Pasting what I said in my email to you earlier today on the subject:

As for the distance measurement, I'm going to use the distance_progress column in the mission table. We did not have this number available to us in the last paper since it was before the mission rewrite that I did. There are two main differences between this and what we had done in the past. First, distances are calculated using the JavaScript Turf library rather than Postgres functions, though they give approximately the same result. The bigger difference is that it takes into account progress on streets that a user did not finish auditing; our method from the last paper only looked at completed streets. I think that if you are looking at user performance, especially for users who have only done a small amount of auditing, this is more accurate.
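To make the difference between the two approaches concrete, here is a small Python sketch. It is not the project's code: the `coords` and `completed` fields are hypothetical, `distance_progress` is assumed to be in meters, and a plain haversine sum stands in for whatever the Turf library or the Postgres functions compute.

```python
import math

EARTH_RADIUS_M = 6371008.8  # mean Earth radius in meters


def haversine_m(a, b):
    """Great-circle distance in meters between two (lng, lat) points."""
    lng1, lat1 = map(math.radians, a)
    lng2, lat2 = map(math.radians, b)
    h = (math.sin((lat2 - lat1) / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin((lng2 - lng1) / 2) ** 2)
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(h))


def line_length_m(coords):
    """Length of a polyline given as a list of (lng, lat) points."""
    return sum(haversine_m(p, q) for p, q in zip(coords, coords[1:]))


def completed_street_distance(tasks):
    """Earlier approach: count only streets that were fully audited."""
    return sum(line_length_m(t["coords"]) for t in tasks if t["completed"])


def mission_progress_distance(missions):
    """Chosen approach: sum distance_progress over the user's missions."""
    return sum(m["distance_progress"] for m in missions)


# Hypothetical user: one finished street, one street abandoned halfway.
street = [(0.0, 0.0), (0.001, 0.0)]
tasks = [
    {"coords": street, "completed": True},
    {"coords": street, "completed": False},
]
missions = [{"distance_progress": 1.5 * line_length_m(street)}]

completed_only = completed_street_distance(tasks)
with_partial = mission_progress_distance(missions)
```

For this made-up user, the completed-streets total ignores the half-audited street entirely, while the mission-based total includes it, which is the effect described above for users who have done little auditing.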

I also looked into keeping the length measurement in Postgres and instead incorporating partially completed streets using the current_lat and current_lng columns that I added to the audit_task table, which we use to let people resume in the middle of a street. The idea was to use other Postgres functions to cut the street at the user's current point on it and only calculate the length for the portion they had done. From my initial tests with this, I didn't see a big benefit to using it over the proposed method. There are some cases where this method is more accurate and some where my proposed method is more accurate. The main issue is that it is a lot more complicated than the other methods, with many more edge cases to deal with, including the fact that this functionality was added later on, so we would have to compute it differently for data from before the update. So yeah: basically as accurate, but a lot more error-prone because of the edge cases.
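For reference, the "cut the street at the current point" idea can be sketched in Python roughly as follows. This is an illustration under stated assumptions, not what was tested: it uses a flat local projection instead of Postgres functions (in PostGIS, functions such as ST_LineLocatePoint and ST_LineSubstring do this kind of cut, though the thread does not say which were tried), and it assumes the street is audited starting from its first coordinate.

```python
import math

EARTH_RADIUS_M = 6371008.8  # mean Earth radius in meters


def to_local_xy(point, origin):
    """Project (lng, lat) to meters on a flat plane around origin."""
    lng, lat = point
    lng0, lat0 = origin
    x = math.radians(lng - lng0) * EARTH_RADIUS_M * math.cos(math.radians(lat0))
    y = math.radians(lat - lat0) * EARTH_RADIUS_M
    return x, y


def audited_length_m(street, current):
    """Length from the start of the street to the point on it nearest current.

    street is a list of (lng, lat) points; current is (current_lng, current_lat).
    """
    origin = street[0]
    pts = [to_local_xy(p, origin) for p in street]
    cx, cy = to_local_xy(current, origin)
    best_dist, best_len, walked = float("inf"), 0.0, 0.0
    for (x1, y1), (x2, y2) in zip(pts, pts[1:]):
        dx, dy = x2 - x1, y2 - y1
        seg_len = math.hypot(dx, dy)
        if seg_len == 0:
            continue  # skip duplicate vertices
        # Fraction along this segment of the closest point, clamped to [0, 1].
        t = ((cx - x1) * dx + (cy - y1) * dy) / seg_len ** 2
        t = max(0.0, min(1.0, t))
        d = math.hypot(cx - (x1 + t * dx), cy - (y1 + t * dy))
        if d < best_dist:
            best_dist, best_len = d, walked + t * seg_len
        walked += seg_len
    return best_len


# Hypothetical L-shaped street and a user who stopped midway along the first leg.
street = [(0.0, 0.0), (0.001, 0.0), (0.001, 0.001)]
partial = audited_length_m(street, (0.0005, 0.00001))
full = audited_length_m(street, street[-1])
```

Even this small version has to make choices the thread flags as edge cases: which end of the street the user started from, what to do when the current point is far from the street, and how to handle tasks recorded before current_lat and current_lng existed.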