NicolasHug / Surprise

A Python scikit for building and analyzing recommender systems
http://surpriselib.com
BSD 3-Clause "New" or "Revised" License

Add shrinkage option to pearson similarity #137

Open ODemidenko opened 6 years ago

ODemidenko commented 6 years ago

The shrunk similarity used in the pearson_baseline similarity may be relevant beyond that one measure: shrinkage is orthogonal both to the baseline concept and to the choice of similarity measure. For example, it might be perfectly suitable for KNNWithMeans with cosine similarity.

I propose decoupling it from the pearson_baseline similarity and making it available for any kind of item-item algorithm.
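To make the proposal concrete, here is a minimal sketch of what a decoupled shrinkage step could look like. It assumes the same factor that the pearson_baseline documentation describes, (n - 1) / (n - 1 + shrinkage), where n is the number of common ratings behind a pair. The helper name `shrink_similarity` and its arguments are hypothetical and not part of Surprise; the default of 100 mirrors the pearson_baseline default.

```python
import numpy as np


def shrink_similarity(sim, n_common, shrinkage=100.0):
    """Hypothetical helper (not a Surprise API): damp similarities that
    are supported by few common ratings.

    sim       -- similarity value(s), scalar or array
    n_common  -- number of common ratings behind each similarity
    shrinkage -- positive shrinkage constant
    """
    if shrinkage <= 0:
        raise ValueError("shrinkage must be positive")
    sim = np.asarray(sim, dtype=float)
    n = np.asarray(n_common, dtype=float)
    # Support term n - 1, clipped at zero for pairs with no common ratings.
    support = np.maximum(n - 1.0, 0.0)
    # Factor tends to 1 as support grows and to 0 as it shrinks.
    factor = support / (support + shrinkage)
    return factor * sim


# The same raw similarity is pulled toward zero much harder
# when it rests on only two common ratings.
weak = shrink_similarity(0.9, n_common=2)
strong = shrink_similarity(0.9, n_common=101)
```

Note that this sketch scales every entry it is given, including a diagonal if a full matrix is passed in, so a real implementation would probably want to leave self-similarities untouched.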

NicolasHug commented 6 years ago

I have no problem with this but please pay attention to the following points:

ODemidenko commented 6 years ago

Shrinkage is only relevant when the original similarity can be considered to be drawn from a normal distribution with zero mean.

I agree. We shouldn't introduce shrinkage for every similarity measure, since it only makes sense for zero-centered ones (and I believe we should only offer options that make sense). That means we need another similarity measure that is centered around zero: "adjusted cosine", as proposed in #135.
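For reference, a rough sketch of an item-item adjusted cosine, assuming the common definition in which each rating is centered on its user's mean before the cosine is taken over users who rated both items. The exact definition proposed in #135 may differ; the function name and the `{user: {item: rating}}` input layout are assumptions made for illustration only.

```python
from math import sqrt


def adjusted_cosine(ratings, i, j):
    """Hypothetical sketch (not a Surprise API).

    ratings -- dict mapping user -> {item: rating}
    Returns (similarity, n_common), where n_common is the number of
    users who rated both items i and j.
    """
    num = den_i = den_j = 0.0
    n_common = 0
    for user_ratings in ratings.values():
        if i in user_ratings and j in user_ratings:
            # Center on the user's mean over all of their ratings.
            mean = sum(user_ratings.values()) / len(user_ratings)
            di = user_ratings[i] - mean
            dj = user_ratings[j] - mean
            num += di * dj
            den_i += di * di
            den_j += dj * dj
            n_common += 1
    # No overlap or no variance: fall back to a neutral similarity of 0.
    if den_i == 0.0 or den_j == 0.0:
        return 0.0, n_common
    return num / sqrt(den_i * den_j), n_common


ratings = {
    "u1": {"a": 5, "b": 1, "c": 3},
    "u2": {"a": 4, "b": 2, "c": 3},
}
sim, n_common = adjusted_cosine(ratings, "a", "b")
```

Because the centered ratings can be positive or negative, the result ranges over [-1, 1] around zero, which is the property the shrinkage argument relies on. Returning `n_common` alongside the similarity keeps the support count available for any later weighting.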