Wrong results with uneven rankings

Hi!

We recently published a paper about handling ties in RBO, partly because, as you well know, the original paper does not make clear how to do this. We took a close look at the problem and proposed a solution that fills this gap (which we call $RBO^w$), plus two other variants that handle ties more in line with the Statistics literature ($RBO^a$ and $RBO^b$).

We of course took a look at the approach you took in your code, but we're afraid it has unintended consequences that make the whole RBO computation wrong whenever the rankings are uneven (of different lengths), whether or not there are ties.

Take for example these rankings:

Even without ties: ['N', 'H', 'M', 'A'] and ['C', 'G', 'N', 'A']
Even with ties: [{'N', 'H'}, 'M', 'A'] and ['C', 'G', {'N', 'A'}]
Uneven without ties: ['N', 'H', 'M', 'A', 'C', 'F', 'L'] and ['C', 'G', 'N', 'A']
Uneven with ties: [{'N', 'H'}, 'M', {'A', 'C', 'F'}, 'L'] and ['C', 'G', {'N', 'A'}]

The table below compares the results of the original implementation by Webber, our own implementation, and yours. In your code we found two places where the comments say which lines to comment or uncomment to get one result or another: lines 96 vs 99, and lines 227 vs 228:

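| Length | Ties | Webber | Ours (w) | 96 & 228 (base) | 99 & 228 | 96 & 227 | 99 & 227 |
|--------|------|--------|----------|-----------------|----------|----------|----------|
| even   | no   | 0.3915 | 0.3915   | 0.3915          | 0.3915   | 0.3915   | 0.3915   |
| even   | yes  | -      | 0.3876   | 0.405           | 0.5265   | 0.405    | 0.54     |
| uneven | no   | 0.4904 | 0.4904   | 0.451           | 0.5069   | 0.418    | 0.4904   |
| uneven | yes  | -      | 0.5156   | 0.4661          | 0.7627   | 0.4562   | 0.81     |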

As you can see, without ties the results are always correct with even rankings, but the current code base (lines 96 and 228) gives incorrect results with uneven rankings; you would need to bring back lines 99 and 227 to match Webber's implementation. The results with ties are all different, but we intended to calculate different things anyway (I think).
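For anyone who wants to check the tie-free reference values independently, here is a minimal sketch of the extrapolated RBO from the original paper. It assumes p = 0.9 (the value is not stated above, so this is an assumption), and the function name `rbo_ext` is hypothetical: it is not taken from any of the implementations compared in the table. It does not handle ties at all.

```python
def rbo_ext(ranking_a, ranking_b, p=0.9):
    """Extrapolated RBO for two tie-free rankings, possibly of different lengths.

    Sketch only: p=0.9 is an assumed default, and ties are not supported.
    """
    # Call the longer ranking L (length l) and the shorter one S (length s).
    longer, shorter = sorted([list(ranking_a), list(ranking_b)], key=len, reverse=True)
    l, s = len(longer), len(shorter)
    if s == 0:
        raise ValueError("both rankings must be non-empty")

    # x[d] = number of items shared by the two prefixes at depth d.
    # Beyond depth s, the prefix of S is simply all of S.
    x = [0] + [len(set(longer[:d]) & set(shorter[:d])) for d in range(1, l + 1)]

    # Agreement actually observed at depths 1..l.
    observed = sum(x[d] / d * p**d for d in range(1, l + 1))
    # Depths s+1..l: assume S would keep agreeing at the rate seen at depth s.
    padding = sum(x[s] * (d - s) / (s * d) * p**d for d in range(s + 1, l + 1))
    # Extrapolation of the agreement beyond depth l.
    tail = ((x[l] - x[s]) / l + x[s] / s) * p**l

    return (1 - p) / p * (observed + padding) + tail


even = rbo_ext(['N', 'H', 'M', 'A'], ['C', 'G', 'N', 'A'])
uneven = rbo_ext(['N', 'H', 'M', 'A', 'C', 'F', 'L'], ['C', 'G', 'N', 'A'])
print(round(even, 4))    # 0.3915
print(round(uneven, 4))  # 0.4904
```

Under these assumptions the sketch reproduces the two tie-free values in the Webber column, which is what the uneven row of the current code base should also return.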

So we wanted to let you know about this, because the modification for ties makes the no-ties results wrong. That said, we strongly suggest taking a look at the paper, because we dig into this problem in detail. Indeed, the formulation with ties that follows the original idea is different from what you implemented, and the paper also presents two other RBO variants. Again, our full implementation is available in Python and in R.

Cheers
