Closed (fohrloop closed 2 days ago)
Hi @fohrloop , thanks for reporting this, and apologies for the delay. I will look into this as soon as I get a chance & get back to you.
I've investigated this issue.
I agree with you that the result is not very intuitive, but upon closer look this is actually expected behavior for `lsr_pairwise` (and likely for `lsr_*` algorithms in general). I've verified that the solution you get is the correct one, i.e., the solution corresponds to the stationary distribution of the Markov chain implied by the data and `alpha`.
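To illustrate what "the stationary distribution of the Markov chain implied by the data and alpha" can look like, here is a minimal pure-Python sketch. It is not choix's code. It assumes that each comparison adds a loser-to-winner transition rate of 1 / (w_winner + w_loser) with all weights equal to 1, that alpha is added as a small rate between every pair of distinct items, and that the dataset is a line where item i + 1 always beats item i. The helper names (`lsr_rates`, `stationary_distribution`, `log_scores`) are hypothetical.

```python
import math


def lsr_rates(n_items, data, alpha):
    """Transition rates of the chain, assuming all item weights start at 1."""
    # Assumption: alpha adds a small rate between every pair of distinct items.
    rates = [[0.0 if i == j else alpha for j in range(n_items)]
             for i in range(n_items)]
    for winner, loser in data:
        # Assumption: each comparison adds a loser -> winner rate
        # of 1 / (w_winner + w_loser), with both weights equal to 1.
        rates[loser][winner] += 1.0 / (1.0 + 1.0)
    return rates


def stationary_distribution(rates, n_steps=500):
    """Approximate the stationary distribution by repeated transition steps."""
    n = len(rates)
    exit_rate = [sum(row) for row in rates]
    scale = 1.1 * max(exit_rate)  # turns rates into valid step probabilities
    pi = [1.0 / n] * n
    for _ in range(n_steps):
        nxt = [
            pi[j] * (1.0 - exit_rate[j] / scale)
            + sum(pi[i] * rates[i][j] for i in range(n)) / scale
            for j in range(n)
        ]
        total = sum(nxt)
        pi = [p / total for p in nxt]  # renormalize against rounding drift
    return pi


def log_scores(pi):
    """Centered log-probabilities, one score per item."""
    logs = [math.log(p) for p in pi]
    mean = sum(logs) / len(logs)
    return [x - mean for x in logs]


# Hypothetical dataset: 15 items in a line, item i + 1 always beats item i.
n_items = 15
data = [(i + 1, i) for i in range(n_items - 1)]
pi = stationary_distribution(lsr_rates(n_items, data, alpha=1e-4))
scores = log_scores(pi)
gaps = [b - a for a, b in zip(scores, scores[1:])]
print([round(g, 3) for g in gaps])
```

Under these assumptions the gaps between consecutive scores are not equal: the first gap comes out close to log 2 and the next ones shrink, so the scores do not fall on a straight line even though the stationary distribution is computed correctly.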
For `ilsr_*`, in the specific output you show (for `alpha=1e-4`) the fact that the curve is seemingly concave is likely due to the fact that we stop the iterative process before it has fully converged. In general you might also get somewhat unintuitive results for `ilsr_*` algorithms, e.g., in the example above when setting alpha to something larger, like 0.1.
For small datasets where some items never "win" or never "lose" (as is the case for items `0` and `14` in your example) I recommend using the `opt_` functions instead; I think the output is more intuitive.
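A rough sketch of how the two families could be compared side by side on a dataset of this kind. The original reproduction code is not shown here, so the dataset builder below is an assumption (item i + 1 always beats item i), and `chain_pairs` and `compare_estimators` are hypothetical helper names. The calls to `choix.lsr_pairwise` and `choix.opt_pairwise` use the `n_items`, `data`, `alpha` arguments as I understand them from the choix documentation.

```python
def chain_pairs(n_items):
    """(winner, loser) pairs along a line; item i + 1 beats item i (assumed)."""
    return [(i + 1, i) for i in range(n_items - 1)]


def compare_estimators(n_items, data, alpha=1e-4):
    """Return parameter estimates from lsr_pairwise and opt_pairwise."""
    # Third-party package; imported here so chain_pairs works without it.
    import choix

    return {
        "lsr_pairwise": choix.lsr_pairwise(n_items, data, alpha=alpha),
        "opt_pairwise": choix.opt_pairwise(n_items, data, alpha=alpha),
    }
```

Calling `compare_estimators(15, chain_pairs(15))` should give one array of 15 scores per function, which can then be plotted against the item index to see which curve is closer to a line.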
Thanks a lot @lucasmaystre for taking a look and providing a thorough explanation! I've been using `opt_pairwise` successfully in my use case, and it's good to know that it would be more suitable for smaller (or "sparser") datasets!
Reproducible example
Using 15 items (14 pairs), ordered in a line. The expectation is that the scores would form a straight line.
This shows: [plot of the estimated scores]
I would expect the scores to be a continuous straight line, so the first score seems to be off. For example, with `ilsr_pairwise` you get a curve with some second derivative, but it turns into a ~straight line when a very small alpha (1e-23) is used.
Using choix 0.3.5, scipy 1.14.1 on CPython 3.12.6.