Wiki Suggestions Enhancement

Adding a new feature! To create better suggestions, we need a probabilistic estimate of how much a user may like a suggested Wikipedia page. This requires a cost function that takes a given document and returns how much the user may like it on a scale of -1 to 1. Before this cost function can exist, we need a training set (the documents the user has already seen). Currently, the scales we can use are the following:

Vote: On a scale of 1-5, how much someone enjoyed the page they visited.
Focus time: How long they remained on that page.
Vote time: How far into their time on the page they voted on it.

Using these scales, we can calculate the real value of each training set document. We could also rank on the content of the pages, but that is not part of these initial feature specifications.
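How the three scales combine into a real value is not decided yet. As a minimal sketch of one piece of it, a 1-5 vote can be rescaled linearly onto the -1 to 1 range. The function name `vote_to_unit_scale` is hypothetical, and leaving focus time and vote time out is a simplification, not part of the spec.

```python
def vote_to_unit_scale(vote):
    """Linearly map a 1-5 vote onto [-1, 1]: 1 -> -1, 3 -> 0, 5 -> 1.

    Hypothetical helper: the issue does not say how votes are rescaled.
    """
    if not 1 <= vote <= 5:
        raise ValueError("vote must be between 1 and 5")
    return (vote - 3) / 2
```

A vote of 4 would come out as 0.5 under this mapping.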

Once we have the real values on a scale of -1 (the user really disliked it) to 1 (the user loved it), we can compute a weight vector. With three documents, the setup could look like this:

Document Vectors	Weight Vector	Estimated Value
vote₁ focus-time₁ vote-time₁	w₁	-1
vote₂ focus-time₂ vote-time₂	w₂	0.1
vote₃ focus-time₃ vote-time₃	w₃	1

After we compute the weights w₁, w₂, w₃ (one for each of vote, focus time, and vote time), we have a cost function we can use to estimate the user's preference for a new page.
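A rough sketch of that step, assuming the cost function is a plain weighted sum of the three scales. With three documents and three weights, the weights can be found by solving the linear system directly. The function names, the sample numbers, and the clamping to [-1, 1] are all assumptions made for illustration, not something the service already does.

```python
def solve_linear_system(matrix, targets):
    """Solve matrix * w = targets by Gauss-Jordan elimination with partial pivoting."""
    n = len(matrix)
    # Work on an augmented copy so the inputs are left untouched.
    aug = [[float(x) for x in row] + [float(t)] for row, t in zip(matrix, targets)]
    for col in range(n):
        # Pick the row with the largest entry in this column as the pivot.
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-12:
            raise ValueError("document vectors are linearly dependent")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        # Eliminate this column from every other row.
        for row in range(n):
            if row != col:
                factor = aug[row][col] / aug[col][col]
                aug[row] = [a - factor * b for a, b in zip(aug[row], aug[col])]
    return [aug[i][n] / aug[i][i] for i in range(n)]


def estimate_preference(weights, document):
    """Weighted sum of a document vector, clamped to [-1, 1] (clamping is an assumption)."""
    raw = sum(w * x for w, x in zip(weights, document))
    return max(-1.0, min(1.0, raw))


# Made-up training data: [vote, focus time in minutes, vote time in minutes]
documents = [
    [1, 0.5, 0.2],
    [3, 2.0, 1.5],
    [5, 6.0, 1.0],
]
real_values = [-1.0, 0.1, 1.0]

weights = solve_linear_system(documents, real_values)
new_page_estimate = estimate_preference(weights, [4, 3.0, 0.5])
```

With more than three training documents the system is overdetermined, so an exact solve would no longer apply and a least-squares fit would be the natural replacement.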

charlie-map / wiki-suggestor-service

Wiki Suggestions Enhancement #11