streetcomplete / StreetComplete

Easy to use OpenStreetMap editor for Android
https://streetcomplete.app
GNU General Public License v3.0
Resurveying: Add a double check when values change #5883

Open tordans opened 1 week ago

tordans commented 1 week ago

I really like the resurveying feature that SC introduced and spearheads. But I wonder if something can be done to reduce mis-tagging in some use cases.

Use case

I have noticed a few times that the bicycle stand capacity resurvey quest introduces wrong data.

An example is https://osmcha.org/changesets/156286947/?aoi=3ee95214-1a8e-4a7e-8546-5c9c0ac2b006

But I noticed other times before that.

In my experience this quest is more likely to be mis-tagged during resurvey than others, but I cannot pinpoint why, and I am not sure that it is relevant.

Proposed Solution

Last time I checked, SC does not show the current data anywhere during resurveying.

My thinking is that this should change in order to trigger a "who is wrong, me or the previous mapper?" thought in the current user, which will certainly improve the accuracy of the data.

I think this can be added in multiple ways, which all have their advantages and disadvantages. They are all OK IMO.

Personally I like the last case the most, because it does not change the current flow.

RubenKelevra commented 1 week ago

I didn't know SC even had a resurvey quest for the capacity of bicycle stands.

> Last time I checked SC does not show the current data anywhere during the re-surveying.

Usually it does: it shows the currently mapped details and asks whether they are still correct. If you say no, you unlock the option to map something new; otherwise a check_date is added to record that it was reverified.

Here's an example for bicycle lanes:

[Screenshot of the StreetComplete bicycle lane resurvey quest]

I've never seen a quest to resurvey the capacity of bicycle parking, so I can't be sure. But if it doesn't display what's mapped there, this should be changed to work like the example above.
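The confirm-or-change flow described in this comment can be sketched roughly as follows. This is only an illustration, not StreetComplete's actual code: the function `apply_resurvey` and its parameters are hypothetical, and the plain `check_date` key is taken from the description above.

```python
from datetime import date
from typing import Optional


def apply_resurvey(
    tags: dict[str, str],
    key: str,
    still_correct: bool,
    new_value: Optional[str] = None,
    today: Optional[date] = None,
) -> dict[str, str]:
    """Return a copy of `tags` updated by a resurvey answer (hypothetical helper)."""
    today = today or date.today()
    updated = dict(tags)  # never mutate the caller's tags
    if still_correct:
        # The user confirmed the shown value: only record the re-verification.
        updated["check_date"] = today.isoformat()
    else:
        # The user said "no", which unlocks entering a new value.
        if new_value is None:
            raise ValueError("a new value is required when the old one is wrong")
        updated[key] = new_value
        # What happens to an existing check_date here is not covered by the
        # description above, so this sketch leaves it untouched.
    return updated
```

For example, `apply_resurvey({"cycleway": "lane"}, "cycleway", True)` keeps the value and adds today's date under `check_date`.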

matkoniecz commented 1 week ago

It does not display the currently tagged info.

RubenKelevra commented 1 week ago

I agree, it should do this beforehand.

mnalis commented 1 week ago

I can't find the original post now, but AFAIR it was explained that it is intentional behaviour that the bicycle parking count is not shown beforehand.

The reasoning was along the lines that for the resurvey to make any sense, the user must count the parking spots again. If the number was shown beforehand, there is a very high likelihood that many users (out of laziness and trust in a static universe model) would just confirm what was already there instead of counting, thus rendering the resurvey quest worthless at best (or even actively problematic, as it would attach a new check_date claim to an old number).

Thus, the first two suggestions should not be implemented, but something along the lines of a third one might:

show a confirmation after the new answer "this will change the value from A to B, does that look right? USE MINE | KEEP OLD | …(back)"

As suggested, it might still be problematic (users could just enter 0 and then choose KEEP OLD), even if somewhat less so (it would be more a matter of actively trying to game the system than of plain laziness/trustfulness).

A variation might work better: compare the old value with the new one and, if they differ, ask (without mentioning the old number): "It seems that the number of parking spaces here has changed, please double check your count. I DOUBLECHECKED, CONFIRM NEW COUNT | OOPS, I MISCOUNTED, LET ME REVISE THE NUMBER"
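A minimal sketch of this variation, assuming the quest already knows the tagged capacity and receives the user's blind count. The names `Outcome`, `DOUBLE_CHECK_PROMPT` and `evaluate_answer` are hypothetical and not part of StreetComplete.

```python
from enum import Enum, auto


class Outcome(Enum):
    CONFIRMED_UNCHANGED = auto()  # blind count matches the tagged value
    NEEDS_DOUBLE_CHECK = auto()   # counts differ, ask the user to recount
    CHANGED = auto()              # user double-checked and confirmed the new count


# The prompt deliberately does not reveal the previously tagged number.
DOUBLE_CHECK_PROMPT = (
    "It seems that the number of parking spaces here has changed, "
    "please double check your count."
)


def evaluate_answer(old_capacity: int, counted: int,
                    double_checked: bool = False) -> Outcome:
    """Compare a blind count with the tagged value without showing it."""
    if counted == old_capacity:
        return Outcome.CONFIRMED_UNCHANGED
    if not double_checked:
        # First mismatch: show DOUBLE_CHECK_PROMPT and let the user recount.
        return Outcome.NEEDS_DOUBLE_CHECK
    # The user chose "I DOUBLECHECKED, CONFIRM NEW COUNT".
    return Outcome.CHANGED
```

Because the old value is only compared and never displayed, the count itself stays blind; the extra prompt appears only when the two numbers disagree.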

RubenKelevra commented 1 week ago

I think the default assumption that our users are lazy is odd. I mean, how long does it take to count to 4 or to 20? That's not a task where I think someone would skip this step.

This however may very well be a concern for car parking lots, where the number is, say, 254 or something like that.

In this case it's fine IMHO to show the number and ask for confirmation.

matkoniecz commented 1 week ago

From what I remember, it was on the pile of "big implementation effort given the benefits, write a PR if you want".

mnalis commented 1 week ago

TL;DR: see double-blind scientific methodology for the reasoning why showing data first is a bad idea for accuracy, and for a counting task it does not gain us any convenience (as opposed to, say, the opening_hours quest, where it gives us a lot). Asking for confirmation after counting if the results differ is fine, though (and probably a good idea).


> I think the default assumption that our users are lazy is odd

I was trying to be short and simple (given my tendency to get overly long), but the whole known psychological issue is much more complex and (besides "plain laziness", which is in itself probably quite complex) results from observer bias, confirmation bias, and other factors. That is why in science such _blinding_ is considered essential in order not to get biased/skewed/wrong data (recent standards require at least double blinding, and sometimes triple blinding).

Anyway, long story short, it is a known and proven psychological deficiency of the human brain. (It is probably an intentional evolutionary tendency, though: over-optimizing in order to save time and energy.)

Similar issues have been discussed previously. For example, the idea of applying the same "building type" answer to multiple buildings is problematic not only from a UI and implementation standpoint; it would also invite not checking each building type but answering "surely all of them must be detached too" after one sees a dozen of them in a row. Another example is the maxspeed quest being disabled by default, because checking it (especially whole areas for slow zones) is very demanding and thus likely to produce incorrect answers, as most people would use "default" reasoning.


Thus, checking whether the data matches only after it has already been counted (thus, without bias!) and entered is a significantly better idea, which produces higher-quality data for the same amount of work.

e.g. "It seems that the number of parking spaces here has changed, please double check your count. I DOUBLECHECKED, CONFIRM NEW COUNT | OOPS, I MISCOUNTED, LET ME REVISE THE NUMBER"

tordans commented 1 week ago

Thanks for the context.

I was thinking about why some users make this mistake. When we assume the users know what they are doing (AKA how to count capacity etc.) and are careful with their mapping, which I think we can, that leaves mistakes due to not understanding which object they are mapping or how the objects relate.

In Berlin, we have a very high level of detail on the sidewalks and roads, which means I need to orient myself very well to match the digital map with what I see in the real world. The SC map style is not ideal for this kind of situation (for good or at least understandable reasons).

There are a few factors that make the bicycle parking quest different from others…

There are a few open tickets that would help here, like #56 and more details on the map which might be possible once the map design becomes easier to modify with maplibre.

For now, the easiest way to help users understand which objects they are changing is to let them see the data that is already there. Be it before or after they did their resurvey…

matkoniecz commented 6 days ago

> see double-blind scientific methodology for reasoning why showing data first is a bad idea for accuracy,

I expect it depends on the ratio of wrong answers when the capacity is not shown, and on how often people are incorrectly swayed by the shown value.

I expect that it differs in various cases.
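One way to make this trade-off concrete is a deliberately simple toy model. Everything in it is an assumption for illustration, not measured data: a user miscounts with probability `p_miscount`, the tagged value is outdated with probability `p_stale`, and a user who sees the tagged value simply confirms it without counting with probability `p_anchor`.

```python
def error_rate_blind(p_miscount: float) -> float:
    """Expected share of wrong answers when the old value is hidden."""
    return p_miscount


def error_rate_shown(p_miscount: float, p_stale: float, p_anchor: float) -> float:
    """Expected share of wrong answers when the old value is shown.

    Assumed model: anchored users copy the shown value, which is wrong
    whenever it is stale; everyone else counts and errs as in the blind case.
    """
    return p_anchor * p_stale + (1.0 - p_anchor) * p_miscount


# Made-up numbers: 5 % miscounts, 10 % stale tags, half the users anchor.
shown = error_rate_shown(p_miscount=0.05, p_stale=0.10, p_anchor=0.5)
blind = error_rate_blind(p_miscount=0.05)
print(f"shown: {shown:.3f}, blind: {blind:.3f}")
```

Under these assumptions, showing the value only helps when tags go stale less often than users miscount, which is one way the answer could differ between cases. The model ignores effects such as a shown value helping users notice their own miscount.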

Do you have links to this research? Has it been replicated? (Any unreplicated psychology research should be considered not worth much, if anything; see the replication crisis.)

> and for counting task it does not gain us any convenience

Maybe a tiny bit, as often you will not need the keyboard.

matkoniecz commented 6 days ago

> I was thinking about why some users make this mistake

Do you have example mistakes? I have seen one user who counted the U-shaped stands themselves instead of counting a capacity of 2 for each (and there was space on both sides), despite the hint.

tordans commented 6 days ago

> > I was thinking about why some users make this mistake
>
> Do you have example mistakes? I have seen one user who counted the U-shaped stands themselves instead of counting a capacity of 2 for each (and there was space on both sides), despite the hint

I linked the most recent example in my initial post which is the area I linked in the comment you quoted. Other similar cases are already solved and I will not have the time to dig them up again.