leave_k_out_split produces incorrect values in test set

I noticed odd behavior in leave_k_out_split - specifically, under certain circumstances (many rows with one value?), the number returned for the withheld test set value is different from the actual value. This causes train + test to fail to reconstruct the original matrix.

Script to reproduce:

import implicit

implicit.__version__
#> '0.6.2'

from implicit.evaluation import leave_k_out_split
from scipy import sparse

ratings = sparse.csr_matrix(
    [[3, 2, 1, 0], [1, 0, 0, 0], [0, 1, 0, 0], [0, 0, 1, 0], [0, 1, 1, 1]]
)

train, test = leave_k_out_split(ratings, K=1, random_state=42)

diff = (train + test) - ratings

diff.nnz
#> 1

train.todense()
#> matrix([[3, 2, 0, 0],
#>         [1, 0, 0, 0],
#>         [0, 1, 0, 0],
#>         [0, 0, 1, 0],
#>         [0, 1, 0, 1]])

test.todense()
#> matrix([[0, 0, 4, 0],
#>         [0, 0, 0, 0],
#>         [0, 0, 0, 0],
#>         [0, 0, 0, 0],
#>         [0, 0, 1, 0]])

diff.todense()
#> matrix([[0, 0, 3, 0],
#>         [0, 0, 0, 0],
#>         [0, 0, 0, 0],
#>         [0, 0, 0, 0],
#>         [0, 0, 0, 0]])

^{Created at 2022-12-23 13:59:28 CST by reprexlite v0.5.0}

I believe the issue is in _take_tails - the returned test_idx array has multiple copies of the first user index returned, so we end up with a test set value that is copied multiple times.

(As an aside, the call to _take_tails when shuffled=True does not pass on the rng, so the random state cannot be maintained.)

benfred / implicit

leave_k_out_split produces incorrect values in test set #639