I noticed odd behavior in `leave_k_out_split`: under certain circumstances (many rows with one value?), the value returned for the withheld test set entry differs from the actual value. As a result, `train + test` fails to reconstruct the original matrix.
Script to reproduce:
Created at 2022-12-23 13:59:28 CST by reprexlite v0.5.0
I believe the issue is in `_take_tails`: the returned `test_idx` array contains multiple copies of the first user index, so the test set ends up with a value that is copied multiple times.
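To illustrate the suspected mechanism, here is a minimal library-independent sketch. It assumes the test matrix is built from (row, col, value) triplets selected by `test_idx`, and that duplicate coordinates are summed, which is what `scipy.sparse` does when a COO-style input is converted to CSR. The data and the helpers `build_sparse` and `duplicated` are hypothetical, not taken from the library.

```python
from collections import Counter, defaultdict


def build_sparse(rows, cols, vals):
    """Accumulate COO-style triplets into a dict, summing duplicate coordinates."""
    out = defaultdict(float)
    for r, c, v in zip(rows, cols, vals):
        out[(r, c)] += v
    return dict(out)


def duplicated(indices):
    """Return the sorted list of indices that occur more than once."""
    return sorted(i for i, n in Counter(indices).items() if n > 1)


def select(idx, users, items, data):
    """Build a sparse dict from the entries picked out by idx."""
    return build_sparse(
        [users[i] for i in idx],
        [items[i] for i in idx],
        [data[i] for i in idx],
    )


# Hypothetical interactions: position i holds (users[i], items[i], data[i]).
users = [0, 1, 2, 2]
items = [5, 7, 1, 3]
data = [4.0, 2.0, 1.0, 3.0]

good_test_idx = [0, 1, 3]  # one distinct entry per user
bad_test_idx = [0, 0, 0]   # the first index repeated, as described above

good_test = select(good_test_idx, users, items, data)
bad_test = select(bad_test_idx, users, items, data)

print(duplicated(bad_test_idx))  # indices that appear more than once
print(bad_test)                  # the single entry carries an inflated value
```

A check like `duplicated(test_idx)` on the array returned by `_take_tails` would be a quick way to confirm or rule out this explanation.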
(As an aside, the call to `_take_tails` when `shuffled=True` does not pass on the `rng`, so the random state cannot be maintained.)
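For the aside about the random state, this is roughly the pattern I would expect: the caller creates one generator from the seed and hands it to the helper, rather than the helper drawing from global state. This is only a sketch using the standard library's `random.Random`; `take_tail_shuffled` and `split` are hypothetical stand-ins, not the library's functions.

```python
import random


def take_tail_shuffled(group, n, rng):
    """Hypothetical helper: pick n entries from group using the caller's rng."""
    pool = list(group)
    rng.shuffle(pool)  # all randomness comes from the generator passed in
    return pool[-n:]


def split(groups, n, seed):
    """Create one generator from the seed and pass it down to every helper call."""
    rng = random.Random(seed)
    return [take_tail_shuffled(g, n, rng) for g in groups]


groups = [[0, 1, 2], [3, 4], [5, 6, 7, 8]]
first = split(groups, 1, seed=42)
second = split(groups, 1, seed=42)
print(first == second)  # the same seed gives the same held-out entries
```

With the generator threaded through, two runs with the same seed hold out the same entries, which is what I would expect from passing a random state to the split.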