practical-recommender-systems / moviegeek

A Django website used in the book Practical Recommender Systems to illustrate how recommender algorithms can be implemented.
MIT License
901 stars 360 forks

User mean is cancelled out in recommenders #73

Open ksiar137 opened 6 months ago

ksiar137 commented 6 months ago

When the recommenders calculate predictions, the user mean is computed and subtracted from each rating, but it is then added back to the weighted average, so it cancels out. Example:

    user_mean = sum(movie_ids.values()) / len(movie_ids)  # user mean is computed here

    candidate_items = Similarity.objects.filter(Q(source__in=movie_ids.keys())
                                                & ~Q(target__in=movie_ids.keys())
                                                & Q(similarity__gt=self.min_sim)
                                                )
    candidate_items = candidate_items.order_by('-similarity')[:self.max_candidates]

    recs = dict()
    for candidate in candidate_items:
        target = candidate.target

        pre = 0
        sim_sum = 0

        rated_items = [i for i in candidate_items if i.target == target][:self.neighborhood_size]

        if len(rated_items) > 1:
            for sim_item in rated_items:
                r = Decimal(movie_ids[sim_item.source] - user_mean)  # user mean is subtracted here
                pre += sim_item.similarity * r
                sim_sum += sim_item.similarity
            if sim_sum > 0:
                recs[target] = {'prediction': Decimal(user_mean) + pre / sim_sum,  # user mean is added back here
                                'sim_items': [r.source for r in rated_items]}
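
The cancellation can be checked numerically with a minimal standalone sketch. It does not use the Django models; the function names, item keys, ratings and similarities below are hypothetical values chosen only for illustration. `predict_centered` mirrors the arithmetic of the snippet above, while `predict_plain` is the similarity-weighted average with no mean at all.

```python
from decimal import Decimal


def predict_centered(ratings, sims):
    """Mirror the snippet: subtract the user mean, weight, then add it back."""
    user_mean = sum(ratings.values()) / len(ratings)
    pre = Decimal(0)
    sim_sum = Decimal(0)
    for item, sim in sims.items():
        r = Decimal(ratings[item] - user_mean)
        pre += sim * r
        sim_sum += sim
    return Decimal(user_mean) + pre / sim_sum


def predict_plain(ratings, sims):
    """Similarity-weighted average of the raw ratings, no user mean involved."""
    weighted = sum(sim * ratings[item] for item, sim in sims.items())
    return weighted / sum(sims.values())


# Hypothetical data: two neighbours (m1, m2) of the target item, plus one
# other rated item (m3) that only influences the user mean.
sims = {"m1": Decimal("0.8"), "m2": Decimal("0.4")}
ratings_a = {"m1": Decimal(4), "m2": Decimal(2), "m3": Decimal(5)}
ratings_b = {"m1": Decimal(4), "m2": Decimal(2), "m3": Decimal(1)}

print(predict_centered(ratings_a, sims))  # user mean is 11/3
print(predict_centered(ratings_b, sims))  # user mean is 7/3
print(predict_plain(ratings_a, sims))
```

Up to `Decimal` rounding, all three printed values agree: changing the rating of `m3` changes the user mean but not the prediction.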

I illustrated what is happening with equations. You can see that the user mean is cancelled out and the result we get is not the same formula as the one in the book: image
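
Written out, the algebra implied by the snippet above can be sketched as follows. The notation is an assumption for illustration and is not taken from the book: $\bar{r}_u$ is the user mean, $r_{u,j}$ is the user's rating of item $j$, $s_{i,j}$ is the similarity between the target item $i$ and a rated item $j$, and $N$ is the neighborhood (`rated_items`), with $\sum_{j \in N} s_{i,j} > 0$ as the code requires.

$$
\begin{aligned}
\hat{r}_{u,i}
&= \bar{r}_u + \frac{\sum_{j \in N} s_{i,j}\,(r_{u,j} - \bar{r}_u)}{\sum_{j \in N} s_{i,j}} \\
&= \bar{r}_u + \frac{\sum_{j \in N} s_{i,j}\, r_{u,j}}{\sum_{j \in N} s_{i,j}} - \bar{r}_u\,\frac{\sum_{j \in N} s_{i,j}}{\sum_{j \in N} s_{i,j}} \\
&= \frac{\sum_{j \in N} s_{i,j}\, r_{u,j}}{\sum_{j \in N} s_{i,j}}
\end{aligned}
$$

Because the same constant $\bar{r}_u$ is subtracted from every rating and then added back, the prediction reduces to a plain similarity-weighted average of the user's ratings.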

Formula in the book:

image