Add a leaderboard (gamification)

misaugstad commented 4 years ago

This concept originally came up in #1315 and #73, and I'll try to move the relevant content over here so it's all in one place.

We've talked about adding a leaderboard (or multiple leaderboards for different stats) to the landing page, the user dashboard, the results page, or possibly its own page. We've talked both about individual leaderboards and about team leaderboards (particularly useful for mapathons).

We've also talked a bit about how it is very motivating for some users, but can be demotivating for other users (when you see how long it would take you to make it onto the leaderboard, for example).

Here are some leaderboard examples:

Here are some ideas suggested by @jonfroehlich for what the leaderboard could track:

Most walked overall and per neighborhood
Most labels overall, per type, and per neighborhood
Most accurate overall and possibly per label type

jonfroehlich commented 4 years ago

In email, @gari01234 said he'd be happy to sketch out a few ideas for designs.

My quick mockup

I just mocked up something quickly in PowerPoint. LeaderboardMockup.pptx.

Lots to think about regarding what metrics to include in the leaderboard and what metrics to emphasize to establish leadership (is it total distance audited? is it number of missions completed? is it accuracy?)

The avatars could be auto-generated from a curated list from Noun Project (or something) related to disability and pedestrian activities (bicycling, skateboarding, walking) or just not included at all. (If we include avatars, would be nice to have users be able to customize them).

The usernames could be hyperlinked so that when you click on them, you go to their 'contributions' page?

Placement

As noted in email, I think we should have (some) leaderboards on the landing page (main page) as well as a separate page (linked at top, maybe called "Leaderboards").

The landing page leaderboard should capture overall leadership. The leaderboard page could have additional leaderboards emphasizing, for example, top leaders in finding obstacles or curb ramps, top validators, most accurate auditors, top leaders per neighborhood, etc.

Other examples

Here's University bioQuest that @misaugstad mentioned above:

AXS Map

There are lots more examples (both good and bad) by searching for "leaderboards" in Google Images

Kaggle.com

jonfroehlich commented 4 years ago

In follow-up conversation with @gari01234, he mentioned that this would also help with mapathons and supporting organizations and school programs in hosting competitions, etc. I'm excited about this!

rpechuk commented 4 years ago

Made a mock up: Any feedback? @misaugstad @jonfroehlich

jonfroehlich commented 4 years ago

I like it! Great as an MVP for sure. I think Gari was planning on doing some mocks too

rpechuk commented 4 years ago

Question for everyone involved: how are we going to be counting "score"? Do we plan on having multiple different leaderboards, one for each statistic, or are we going to do one leaderboard? If so, how are we going to weigh each stat and combine them into one overall score?

misaugstad commented 4 years ago

I think for the MVP what we are going to start with is just using total distance audited. I don't think we have any specific plans for a "score" system. It might be multiple leaderboards for different stats. My only other comment on the mock is that the Leaderboard title looks a bit odd, not totally sure why. Maybe it's a little too close to the chart or a little too close to the navbar? idk. And we should add a space between the number and the distance unit ("3.2km" -> "3.2 km").

One other thing to think about: how many labels should the user have validated before we display the accuracy score? And what should we display when there is not enough data?

jonfroehlich commented 4 years ago

The score is accuracy, which I think should be:

Total labels validated correct / (total labels validated - total labels validated unsure)

So, this accuracy estimate only penalizes you for labels marked as “disagree”

Later, we’ll want to do more sophisticated scoring (for example, for those labels w multiple validations, use majority vote or some kind of weight)
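To make the formula above concrete, here is a minimal Python sketch. The function name and its arguments are hypothetical, not taken from the SidewalkWebpage code; it only assumes that each validation is counted as agree, disagree, or unsure.

```python
from typing import Optional


def accuracy(n_agree: int, n_disagree: int, n_unsure: int) -> Optional[float]:
    """Validated-correct / (validated - unsure), or None when undefined."""
    total_validated = n_agree + n_disagree + n_unsure
    # "Unsure" validations are dropped from the denominator, so only
    # "disagree" validations lower the score.
    denominator = total_validated - n_unsure
    if denominator == 0:
        return None  # nothing but "unsure" (or nothing at all) to judge by
    return n_agree / denominator


# 8 agree, 2 disagree, 5 unsure: 8 / (15 - 5)
print(accuracy(n_agree=8, n_disagree=2, n_unsure=5))  # 0.8
```

Returning None rather than 0 keeps "no usable validations" distinguishable from "every label was marked incorrect".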

misaugstad commented 4 years ago

@jonfroehlich but do you want the MVP leaderboard to take the top 10 users with the best accuracy or most audited distance?

jonfroehlich commented 4 years ago

Ah, right! We need to be careful here. Distance alone does not matter. So I think we do need to come up with some “score” or we could incentivize the wrong behavior.

misaugstad commented 4 years ago

sure... but I feel like coming up with a score is something that we probably would want to think kind of hard about. what if we started with something that is less likely to incentivize poor auditing, like label count?

jonfroehlich commented 4 years ago

Sure, let's do 'label count' to start but also, obviously, surface accuracy where accuracy is defined as:

Total labels validated correct / (total overall labels validated - total labels validated unsure)

@rpechuk, I think we'll likely want to have two leaderboards: one for overall and one for some shorter time period (like a week, though I'm not sure "a week" is the right period, and it might get awkward when we have very few auditors for that week...)

Side-by-side layout:

Vertical layout:

And we'll want to filter out 'admins' and 'researchers' from the leaderboard, obviously. :)

gari01234 commented 4 years ago

I hope it's not too late to share a couple of mocks.

[Images: four mockups]

misaugstad commented 4 years ago

All these mocks are great! I think for the MVP we should probably stick with something close to the mock that @rpechuk added. Though I really like the tabbed design for the week vs lifetime boards in @gari01234's mocks! And that should be quick to include.

jonfroehlich commented 4 years ago

While I like the tabbed design too, it requires clicks to use (and also diligence to observe that the 'tab' functionality even exists). For the landing page, I think I still prefer a side-by-side design where you can see overall and weekly. Of course, the downside there, again, is that sometimes the weekly view will look somewhat empty given low user counts for the week (so hiding it behind a tabbed interaction may be preferred).

Also, can we get an update on the ETA for this from @rpechuk? It's a high priority Issue.

Remember, we want two inclusions of the leaderboard:

One on the landing page as a 'module' (similar to the other parts of the page)
One as its own page, which should reuse much of the same code from above, though the leaderboard page itself may have more functionality. For this, do we put Leaderboard in the Data menu?

jonfroehlich commented 4 years ago

One thing missing from @gari01234's mocks, which I think is crucial, is accuracy. We have a problem with our users not knowing how well they're doing. This info needs to get added to their dashboard too.

misaugstad commented 4 years ago

Oh if a user is not on the leaderboard, do we need to show where they are at in this MVP? Maybe we don't need to show them their rank, but maybe the current user's stats should be shown at the bottom if they aren't already on the board? @jonfroehlich do you think that's necessary for an MVP?

We should also make it clear somehow that we are using labels (for now) as the ranking metric. Maybe we only need to do that by having it as the first column after the username?

jonfroehlich commented 4 years ago

Oh if a user is not on the leaderboard, do we need to show where they are at in this MVP? Maybe we don't need to show them their rank, but maybe the current user's stats should be shown at the bottom if they aren't already on the board? @jonfroehlich do you think that's necessary for an MVP?

Not necessary imo for the MVP but a good idea (and something to also add to user's own dashboard).

We should also make it clear somehow that we are using labels (for now) as the ranking metric. Maybe we only need to do that by having it as the first column after the username?

Agree. Let's just have labels as first column after username. So, let's have labels, missions, distance, and then accuracy.

jonfroehlich commented 4 years ago

We've previously defined accuracy as:

number of labels validated correct / (number of labels validated - number of labels marked as unsure)

But there is an edge case for labels that have been validated by multiple users. On IM with @misaugstad, we discussed this. I originally proposed:

For cases where multiple people have voted, it's something like a weighted calculation. Here's the calc per label: (number of validators who said it was correct - number of validators who said it was incorrect) / (number of times label was validated - number of times label was rated unsure)

But then Mikey pointed out that this was needlessly complex, so we settled on a simple majority vote scheme like this:

accuracy = (number of labels rated correct) / (number of total labels validated - number of labels marked unsure - number of labels tied in majority vote)

So, we treat agree and disagree ties as "unsure". And in those cases where "unsure" has the majority, then we treat the label as unsure.
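A rough Python sketch of the majority-vote scheme described above. The names are hypothetical, and it makes one assumption the thread does not spell out: any tie for the top vote count (including a tie that involves "unsure") is treated as unsure.

```python
from typing import Iterable, Optional, Tuple

# One label's validations: (agree votes, disagree votes, unsure votes)
Votes = Tuple[int, int, int]


def label_outcome(agree: int, disagree: int, unsure: int) -> str:
    """Collapse one label's validations into a single outcome."""
    counts = (("correct", agree), ("incorrect", disagree), ("unsure", unsure))
    top = max(agree, disagree, unsure)
    winners = [name for name, count in counts if count == top]
    # Assumption: a tie for first place counts as "unsure".
    return winners[0] if len(winners) == 1 else "unsure"


def majority_vote_accuracy(labels: Iterable[Votes]) -> Optional[float]:
    """correct / (validated - unsure - tied), using per-label majority votes."""
    outcomes = [label_outcome(*votes) for votes in labels]
    correct = outcomes.count("correct")
    incorrect = outcomes.count("incorrect")
    # Removing unsure and tied labels leaves only correct + incorrect.
    if correct + incorrect == 0:
        return None
    return correct / (correct + incorrect)


votes = [(3, 0, 0), (2, 1, 0), (1, 1, 0), (0, 2, 1), (0, 0, 2)]
print(majority_vote_accuracy(votes))
```

In this example two labels come out correct, one incorrect, one tied, and one unsure, so the accuracy is 2/3. The two UPDATE notes that follow (dropping self-validations and low-quality users) would be filters applied before this calculation.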

UPDATE: we also need to filter out the user's own validations for their own labels (which they can currently do in /labelmap) or we simply prevent users from validating their own labels (e.g., when they attempt to validate their own label, we pop up a message and say this is not possible or we don't even give them the option (hide the buttons in those cases))

UPDATE 2: we also need to filter out users marked as "low quality", right?

misaugstad commented 4 years ago

Do we want "this week" to be the week starting from exactly one week before the user loaded the page? Or do we want to pick a day that the stats reset?

jonfroehlich commented 4 years ago

This week should be a fixed day—like Sunday at 12AM or Monday at 12AM.
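As a sketch of the fixed reset day, the start of the current leaderboard week can be computed like this in Python. It assumes Sunday at 12 AM is the chosen reset and that the caller passes the current time already expressed in the intended time zone; the function name is hypothetical.

```python
from datetime import datetime, timedelta


def start_of_week(now: datetime) -> datetime:
    """Most recent Sunday at 12 AM, at or before `now`."""
    # datetime.weekday() numbers Monday as 0 and Sunday as 6.
    days_since_sunday = (now.weekday() + 1) % 7
    midnight = now.replace(hour=0, minute=0, second=0, microsecond=0)
    return midnight - timedelta(days=days_since_sunday)


# Friday 2020-09-25 maps back to Sunday 2020-09-20.
print(start_of_week(datetime(2020, 9, 25, 15, 38)))  # 2020-09-20 00:00:00
```

Only labels added at or after this timestamp would count toward the weekly board. Switching to a Monday reset means using `now.weekday()` directly as the day offset.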

misaugstad commented 4 years ago

Here's the first pass look at the leaderboard that Ron and I made. Still a WIP, but I wanted to enumerate the changes I see us needing soon on Github.

Screenshot from 2020-09-25 15-38-58

[x] If the accuracy is None we should show "N/A" or something instead of "0%"
[x] The distance units are listed as "km" right now, but those numbers are in meters. Let's start by just changing the unit to "m" or "meters". We will probably want to dynamically change from meters to km at some threshold in the future, but not necessary for MVP
[x] On that note, we need to convert to feet if the language is set to US English
[x] Anywhere with an accuracy of "N/A" we should have a tooltip saying that we only show accuracy if at least 10 of your labels have been validated.
[x] We probably want a tooltip over the word "Weekly" or something that says that the stats reset every Sunday morning at 12 AM.
[x] We need to check for usernames that are actually valid email addresses and strip the @ and everything after
[x] Need to add to the landing page as well
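Several of the checklist items above are display rules, which can be sketched in Python as follows. All function names are hypothetical, the "en-US" language code is an assumption about how US English is identified, and the username check is a simple "@" test rather than full email validation.

```python
from typing import Optional

MIN_VALIDATED_LABELS = 10  # threshold from the checklist above
FEET_PER_METER = 3.28084


def format_accuracy(accuracy: Optional[float], n_validated: int) -> str:
    """Show "N/A" instead of "0%" when there is not enough data."""
    if accuracy is None or n_validated < MIN_VALIDATED_LABELS:
        return "N/A"
    return f"{accuracy * 100:.0f}%"


def format_distance(meters: float, language: str) -> str:
    """Distances are stored in meters; show feet for US English."""
    if language == "en-US":  # assumed language code
        return f"{meters * FEET_PER_METER:.0f} ft"
    return f"{meters:.0f} m"


def display_username(username: str) -> str:
    """Strip the @ and everything after it from email-like usernames."""
    local_part, separator, _ = username.partition("@")
    # Keep the username unchanged if it has no "@" or starts with one.
    return local_part if separator and local_part else username


print(format_accuracy(0.9, 25), format_accuracy(0.9, 5))
print(format_distance(1000, "en-US"), format_distance(1000, "es"))
print(display_username("jane@example.com"), display_username("mapper42"))
```

Note the space between the number and the unit, as requested earlier in the thread.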

rpechuk commented 4 years ago

@jonfroehlich @misaugstad @gari01234 Thoughts?

jonfroehlich commented 4 years ago

Woohoo. Looks great! Great work @rpechuk. A few things.

I think the leaderboard should be located under "Your work is making a difference" on the landing page rather than under "Let's create a path for everyone."
I think we should exclude researchers, admins, and owners
When people use their full names rather than a username, I feel like we should only use the first or last name (feels like a privacy violation otherwise), especially since we're shipping this feature after people registered their usernames
It looks like we have at least one turker (possibly) in your list? The "iPx9zmgaqh..." user. Are we going to include turkers? It makes this list difficult to read.
I don't like "lifetime" in the phrase "Lifetime Leaderboard". We can do "Overall Leaderboard" or simply "Leaderboard" for this
You need to use the right font for this "Leaderboard" header to be consistent with other parts of the page
Under the header, write something like "Leaders are calculated based on their labels, distance, and accuracy." (yes, I know it's just labels right now).

rpechuk commented 4 years ago

@misaugstad knows more about the second, third, and fourth points and for the rest I will make it happen

misaugstad commented 4 years ago

The only other thing besides what Jon mentioned is that I would remove the colon from the titles.

I think we should exclude researchers, admins, and owners

@jonfroehlich we had talked about this last week and decided to keep all these user groups. We are at the top of the leaderboard in Newberg, but I imagine we won't be at the top a lot of the time. The other issue is that someone like @gari01234 is an admin now as well. Oh and the people labeled "researcher" are often interns, who I think would enjoy seeing themselves on the leaderboard and wouldn't be a huge issue.

Two potential solutions: exclude only "Owners", which would be @jonfroehlich and me. Another option (which we may use for other reasons in the future) is to add a new "Collaborator" role or something. We would assign this to someone like @gari01234 who we want to give the ability to see the admin dashboard, but maybe that role does not get excluded from leaderboards, and we could also limit their ability to make edits on the admin dashboard (for example, we could prevent them from modifying other users' roles).

When people use their full names rather than a username, I feel like we should only use the first or last name (feels like a privacy violation otherwise), especially since we're shipping this feature after people registered their usernames

I do still kind of think that when you fill out a username, you assume that it will be public. At least with email addresses, that was easy enough to detect programmatically, but first/last name is completely manual. However, I do get that we are adding this feature after people already signed up...

We could potentially email everyone to let them know that we are adding features like a leaderboard that could make their username publicly visible, and that they should respond if they would like us to modify their username given that. We would also ideally send their current username in the email so they know whether they want something changed. We could also manually look through the list of usernames and only send that email to users who look like they have a first + last name in their username? We could be even more restrictive, too, where we only email people who have met those criteria and who have contributed data.

It looks like we have at least one turker (possibly) in your list? The "iPx9zmgaqh..." user. Are we going to include turkers? It makes this list difficult to read.

This is actually an anomaly that we don't have to worry about. This was an intern who accidentally tested on a prod server as an anonymous user, so we changed that account's role to be "Researcher". But the username still has the randomly generated string of characters we give to anon users. I think that @rpechuk has an older database dump from Newberg, so this user would not be on the leaderboard on the production site.

This also makes me think: should we exclude users from the leaderboard that I have manually marked as "low quality"?

I don't like "lifetime" in the phrase "Lifetime Leaderboard". We can do "Overall Leaderboard" or simply "Leaderboard" for this

I agree if it's just that leaderboard by itself, but if the weekly leaderboard is right next to it, I feel like we should differentiate. And I do think "Lifetime" is relatively standard, but so is "Overall".

Under the header, write something like "Leaders are calculated based on their labels, distance, and accuracy." (yes, I know it's just labels right now).

Does this need to be written just below the header? Or can it be like an asterisk below the leaderboard itself? Or as a tooltip? And we could always say that it's based on labels for now and then update the text as we update the code..?

jonfroehlich commented 4 years ago

Thanks Mikey.

Let's just exclude owners for now.

Re: full names. Not sure what to say about this. We should discuss further. I don't think emailing is necessarily the answer.

Re: lifetime. I just don't like that word. Something about how it's personifying an abstract thing. Let's use "Overall Leaders" or "Overall Leaderboard"

Re: writing under the headers. Having some writing under the headers is consistent with all other parts of the landing page, which is why I suggested it (and people will want to know how it's calculated). Once Ron switches his fonts over to the correct ones, we can take a look to see how it is. I'd still like to say this: "Leaders are calculated based on their labels, distance, and accuracy". I think we should exclude users marked with "Low Quality."
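Putting the decisions so far together (rank by label count, show the top 10, exclude owners and users marked low quality), the selection step might look like the Python sketch below. The `UserStats` fields and role strings are hypothetical stand-ins, not the actual SidewalkWebpage schema.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class UserStats:
    username: str
    role: str          # e.g. "Registered", "Researcher", "Owner" (assumed values)
    low_quality: bool  # manually flagged as low quality
    labels: int        # ranking metric for the MVP


EXCLUDED_ROLES = {"Owner"}


def top_users(users: List[UserStats], n: int = 10) -> List[UserStats]:
    """Eligible users, ranked by label count, highest first."""
    eligible = [
        u for u in users
        if u.role not in EXCLUDED_ROLES and not u.low_quality
    ]
    return sorted(eligible, key=lambda u: u.labels, reverse=True)[:n]


users = [
    UserStats("owner1", "Owner", False, 900),
    UserStats("intern1", "Researcher", False, 300),
    UserStats("spammer", "Registered", True, 800),
    UserStats("mapper42", "Registered", False, 450),
]
print([u.username for u in top_users(users)])  # ['mapper42', 'intern1']
```

Keeping the excluded roles in one set makes it easy to revisit the decision later, for example if a "Collaborator" role is added.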

gari01234 commented 4 years ago

We could potentially email everyone to let them know that we are adding features like a leaderboard that could make their username publicly visible, and that they should respond if they would like us to modify their username given that. We would also ideally send their current username in the email so they know whether they want something changed. We could also manually look through the list of usernames and only send that email to users who look like they have a first + last name in their username? We could be even more restrictive, too, where we only email people who have met those criteria and who have contributed data.

What if we let people change their username, as can be done from other websites, for example: https://github.com/settings/profile

This way, if users appear on the leaderboard and are not satisfied with the name that appears publicly, they can change it.

rpechuk commented 4 years ago

How does this look: @jonfroehlich @misaugstad

jonfroehlich commented 4 years ago

Font for header and text beneath header needs to be Raleway and not Adelle. Just be consistent with other parts of page.

Update: looks like header text on landing page is Raleway and content text is Adelle. Again, just be consistent.

rpechuk commented 4 years ago

I thought it was but I was testing it with other fonts and forgot to change it back lol.

misaugstad commented 4 years ago

I just pushed a commit that removes Owners (Jon and I) from the leaderboard. what else is left to do here?

jonfroehlich commented 4 years ago

I'm also happy to help with this Issue (and PR) so that we can meet our commitment to SPGG in a timely fashion. Please let me know what we can do @rpechuk! Great work so far.

rpechuk commented 4 years ago

@jonfroehlich nothing right now. I'm pretty sure we are done; Mikey is doing some final testing rn.

jonfroehlich commented 4 years ago

Great, thanks Ron!

misaugstad commented 4 years ago

@gari01234 would you be able to provide translations for us for the following phrases we are using for the leaderboard?

Leaderboard
Overall Leaderboard
Leaders are calculated based on their labels, distance, and accuracy.
Accuracy is only shown if at least 10 of your labels have been validated
Weekly Leaderboard
Stats reset every Sunday morning at 12 AM Pacific

misaugstad commented 4 years ago

I am putting the text into conf/messages.en to prepare for internationalization, and I am moving the navbar link from the username drop down to the data drop down. Here's the look in the navbar, lmk @jonfroehlich if you want me to reorder it. Screenshot from 2020-10-07 12-56-41

misaugstad commented 4 years ago

Also changing the CSS class names from camelCase to whatever-this-is-called

jonfroehlich commented 4 years ago

I'd move it to just before Sidewalk API (so after Results Map and Label Map).

Jon

gari01234 commented 4 years ago

@gari01234 would you be able to provide translations for us for the following phrases we are using for the leaderboard?

*Leaderboard = Tabla de posiciones

Other options:
Tabla de líderes
Tabla de clasificación
Tabla de puntuaciones
Tabla de posiciones
Top 10 de usuarios

*(This is not an easy one)

Overall Leaderboard = Tabla de posiciones general
Weekly Leaderboard = Tabla de posiciones semanal
Leaders are calculated based on their labels, distance, and accuracy. = Las posiciones se calculan en base a las etiquetas, distancia y precisión
Accuracy is only shown if at least 10 of your labels have been validated = La precisión sólo se muestra si al menos 10 de tus etiquetas han sido validadas
Stats reset every Sunday morning at 12 AM Pacific = Las estadísticas se restablecen todos los domingos por la mañana a las 12:00 a.m. (PST)

misaugstad commented 4 years ago

Las estadísticas se restablecen todos los domingos por la mañana a las 12:00 a.m. (PST)

I'm assuming PST here means "Pacific Standard Time"? If so, is "PT" for "Pacific Time" also a standard acronym that ppl will understand? To avoid confusion b/w PST and PDT

jonfroehlich commented 4 years ago

Right, we can't say PST because it will shift between PDT and PST.

Jon

ProjectSidewalk / SidewalkWebpage