[Paper] Harnessing wisdom of the crowds dynamics for time-dependent reputation and ranking

https://www.scopus.com/record/display.uri?eid=2-s2.0-70349843492&origin=resultslist&sort=plf-f&src=s&nlo=&nlr=&nls=&sid=e475b6f38553cce759403177cd1e0967&sot=a&sdt=a&cluster=scosubjabbr%2c%22COMP%22%2ct&sl=53&s=TITLE-ABS-KEY%28user+reputation+system+search+ranking+%29&relpos=22&citeCnt=3&searchTerm=

The "wisdom of the crowds" is a concept used to describe the utility of harnessing group behaviour, where user opinion evolves over time and the opinion of the masses collectively demonstrates wisdom. Web 2.0 is a new medium where users are not just consumers, but are also contributors. By contributing content to the system, users become part of the network and relationships between users and content can be derived. Example applications are collaborative bookmarking networks such as del.icio.us and file sharing applications such as YouTube and Flickr. These networks rely on user contributed content, described and classified using tags. The wealth of user generated content can be hard to navigate and search due to difficulties in comparing documents with similar tags and the application of traditional information retrieval scoring techniques are limited. Evaluating the time evolving interests of users may be used to derive quality of content. In this paper, we propose a technique to rank documents based on reputation. The reputation is a combination of the number of bookmarkers, the reputation of the bookmarking user and the time dynamics of the document. Experimental results and analysis are presented on a large collaborative IBM bookmarking network called Dogear. © 2009 IEEE.

This was a good one! Some notes below:

Presents an approach to rank documents based on:

The number of readers
The reputation of the author
The time dynamics of reader consumption
The time dynamics of consumption of documents contributed by the consuming user (kind of indirect: if this user's own docs are read less often, their rep will decrease, so their contribution to the main doc, the one they are consuming, will be smaller)

Reputation values scale between 0 and 1

(1) Every time a user consumes a doc from an author, the author gains reputation according to:

    newRep = oldRep + (1 - oldRep) * repReward

repReward is a constant between 0 and 1 that should be chosen with the number of entities in the system in mind: "If the number of expected consumers is in the order of hundreds or thousands, then an overly high value of repReward will potentially cause popular content to quickly converge towards 1 making it difficult to differentiate between similarly popular content."
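To get a feel for the saturation effect described in that quote, here is a minimal sketch of the update in (1). The function names and the two repReward values are my own picks for illustration, not values from the paper.

```python
def apply_reward(old_rep: float, rep_reward: float) -> float:
    """One consumption event: move rep a fraction rep_reward of the way to 1."""
    if not (0.0 <= old_rep <= 1.0 and 0.0 <= rep_reward <= 1.0):
        raise ValueError("old_rep and rep_reward must both lie in [0, 1]")
    return old_rep + (1.0 - old_rep) * rep_reward


def rep_after(consumptions: int, rep_reward: float, start: float = 0.0) -> float:
    """Apply the reward update `consumptions` times in a row."""
    rep = start
    for _ in range(consumptions):
        rep = apply_reward(rep, rep_reward)
    return rep


if __name__ == "__main__":
    # Illustrative values only: compare a doc with 100 reads to one with 200.
    for rep_reward in (0.1, 0.001):
        a = rep_after(100, rep_reward)
        b = rep_after(200, rep_reward)
        print(f"repReward={rep_reward}: 100 reads -> {a:.4f}, 200 reads -> {b:.4f}")
```

Starting from 0, n consecutive updates give 1 - (1 - repReward)^n. With repReward = 0.1 both docs end up above 0.9999 and are practically indistinguishable, while with repReward = 0.001 they stay clearly apart (roughly 0.10 versus 0.18).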

(2) Every time a user consumes a doc, the doc gains "reputation" - meaning popularity in this case - according to the same formula as (1):

    newRep = oldRep + (1 - oldRep) * repReward

(3) In order to take time dynamics into account, rep should decrease over time, so that a "rich-get-richer" paradigm can be avoided. This is achieved by the following equation (both for users and for docs):

    newRep = oldRep * decayCoeff^k

decayCoeff represents how much the rep will change, and k is the number of whole time units that have passed since the last rep update, i.e. for a time unit of "days", k will be 0 in the first 24h, 1 in the next day, 7 after a week, and so on. This decouples the algorithm from the logistics: the algorithm can run at any fixed frequency, independently of the time unit, and every time it recalculates it gives an accurate value. However, if for example the time unit is a day and the algorithm only updates once a week, the stored value can be up to 6 days out of date between runs.
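A small sketch of how k and the decay in (3) could be computed from timestamps. The helper names and the decayCoeff value of 0.9 in the usage note are assumptions for illustration, not taken from the paper.

```python
from datetime import datetime, timedelta


def elapsed_units(last_update: datetime, now: datetime, unit: timedelta) -> int:
    """Whole time units elapsed since the last update (k in the notes)."""
    if now < last_update:
        raise ValueError("now must not be earlier than last_update")
    # timedelta // timedelta is integer floor division in Python 3.
    return (now - last_update) // unit


def apply_decay(old_rep: float, decay_coeff: float, k: int) -> float:
    """newRep = oldRep * decayCoeff^k; k = 0 leaves the value unchanged."""
    return old_rep * decay_coeff ** k


if __name__ == "__main__":
    day = timedelta(days=1)
    last = datetime(2024, 1, 1)
    for hours in (23, 36, 7 * 24):
        k = elapsed_units(last, last + timedelta(hours=hours), day)
        print(f"{hours}h later: k={k}, rep={apply_decay(0.8, 0.9, k):.4f}")
```

Decaying once with k = 7 gives the same result as decaying seven times with k = 1, which is why the run frequency does not matter. One detail the notes leave open is what to store as the new last-update time: advancing it by exactly k units, rather than setting it to the current time, would avoid discarding the fractional remainder.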

(4) Users with higher reputation matter more when calculating the doc rep changes:

    newRep = oldRep * repConsumer * B

B is a constant in [0, 1] representing the extent to which the consumer's rep influences the doc's rep

This system can be adapted and applied in zerozero.live if we map user inputs in an event to documents and users to users (well, ofc). However, we will be ranking users rather than "documents" (inputs), even though inputs will also have reputation values. See below:

Every time a user agrees with an input, they improve the input's rep according to (2) and (4).
Every time a user disagrees with an input (either by submitting a genuinely conflicting input or by reporting it as false/inaccurate), they worsen the input's rep according to (4) and the inverse of (2).
Every time a user submits a falsely conflicting input, it should act as an explicit agreement with the other user's input, so it should count more, according to an explicitAgreementBonus constant.
The user gains reputation according to the average of their inputs' reputations.
Each user has a reputation decay according to (3). The time unit should be 1 week, since there's at least one relevant game per week. This prevents users who generate a lot of inputs in a single game from enjoying their reputation boost for many more games: they need to be consistent every week, so one input every week matters more than 20 inputs once every 2 or more weeks. This decay works at a higher level than the events, so creating 20 inputs in an event is roughly the same as creating 1 input in an event (since the events last around 90min).
The reputation values are updated at the end of each event, according to the event's history.
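A rough sketch of how these rules could fit together. The list above leaves several things open, so everything here is one possible reading: combining (2) and (4) by scaling the step with the voter's rep and B, taking "inverse of (2)" to mean the mirrored move towards 0, treating explicitAgreementBonus as a multiplier, and all constant values are assumptions of mine.

```python
from statistics import mean

# All constants are assumed placeholder values, not tuned or from the paper.
REP_REWARD = 0.05
B = 0.5
EXPLICIT_AGREEMENT_BONUS = 2.0


def step_size(voter_rep: float, explicit: bool = False) -> float:
    """Assumed combination of (2) and (4): reward scaled by voter rep and B."""
    step = REP_REWARD * voter_rep * B
    if explicit:
        # Assumed: an explicit agreement multiplies the step by the bonus.
        step *= EXPLICIT_AGREEMENT_BONUS
    return min(step, 1.0)


def agree(input_rep: float, voter_rep: float, explicit: bool = False) -> float:
    """Move the input's rep towards 1, same shape as formula (2)."""
    return input_rep + (1.0 - input_rep) * step_size(voter_rep, explicit)


def disagree(input_rep: float, voter_rep: float) -> float:
    """Assumed 'inverse of (2)': move the input's rep towards 0 instead."""
    return input_rep - input_rep * step_size(voter_rep)


def user_reputation(input_reps: list[float]) -> float:
    """Average of the user's input reputations; 0.0 if they have no inputs."""
    return mean(input_reps) if input_reps else 0.0


if __name__ == "__main__":
    rep = 0.5
    rep = agree(rep, voter_rep=0.9)
    rep = disagree(rep, voter_rep=0.2)
    rep = agree(rep, voter_rep=0.6, explicit=True)
    print(f"input rep after three votes: {rep:.4f}")
    print(f"user rep from inputs: {user_reputation([rep, 0.4]):.4f}")
```

Under this reading both updates keep the input's rep inside [0, 1], and a vote from a high-rep user moves it further than a vote from a low-rep user. How the weekly decay from (3) interacts with the averaged user rep is not covered here.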
