ocbe-uio / BayesMallows

R-package for Bayesian preference learning with the Mallows rank model.
https://ocbe-uio.github.io/BayesMallows/
GNU General Public License v3.0
21 stars 9 forks source link

Ulam distance is wrong? #417

Closed osorensen closed 4 months ago

osorensen commented 4 months ago

Thanks to Xavier Benavides for pointing out. Here is some reasoning related to the correction, which is in #418.

Definition of Ulam distance

For two rankings r1 and r2 with length n, the Ulam distance distance equals n minus the longest common subsequence of the orderings corresponding to the rankings.

Details are in the following two screenshots from Irurozki, Calvo, and Lozano (2016).

First, note that sigma is an ordering.

image

Next follows the definition of Ulam distance in terms of sigma. Note that there is an error in the example, since the longest increasing subsequence in sigma is 13457, of size 5, so the Ulam distance i 7-5=2.

image

Testing the implementation

The following two orderings were used when discovering the bug: sigma = [51432] and pi = [31245]. Their longest common subsequence is 2, which is either 32 or 12 or 14. Hence, the Ulam distance should be 5 - 2 = 3.

In the previous implementation we would get 2, but now we correctly get 3.

library(BayesMallows)
sigma <- c(5, 1, 4, 3, 2)
pi <- c(3, 1, 2, 4, 5)

ranking1 <- create_ranking(sigma)
ranking2 <- create_ranking(pi)

compute_rank_distance(ranking1, ranking2, "ulam")
#> [1] 3

Created on 2024-07-02 with reprex v2.1.0

Corrected version of the example from Irurozki et al. (2016):

library(BayesMallows)

sigma <- c(2, 1, 3, 6, 4, 5, 7)
pi <- c(1, 2, 3, 4, 5, 6, 7)
ranking1 <- create_ranking(sigma)
ranking2 <- create_ranking(pi)

compute_rank_distance(ranking1, ranking2, "ulam")
#> [1] 2

Created on 2024-07-02 with reprex v2.1.0

osorensen commented 4 months ago

Update, the original implementation was indeed correct. We must just be careful to keep in mind that the distance is defined for orderings, not rankings.

Here for the current CRAN version:

library(BayesMallows)
packageVersion("BayesMallows")
#> [1] '2.2.1'
sigma <- c(5, 1, 4, 3, 2)
pi <- c(3, 1, 2, 4, 5)

ranking1 <- create_ranking(sigma)
ranking2 <- create_ranking(pi)

compute_rank_distance(ranking1, ranking2, "ulam")
#> [1] 3

Created on 2024-07-02 with reprex v2.1.0

library(BayesMallows)
packageVersion("BayesMallows")
#> [1] '2.2.1'
sigma <- c(2, 1, 3, 6, 4, 5, 7)
pi <- c(1, 2, 3, 4, 5, 6, 7)
ranking1 <- create_ranking(sigma)
ranking2 <- create_ranking(pi)

compute_rank_distance(ranking1, ranking2, "ulam")
#> [1] 2

Created on 2024-07-02 with reprex v2.1.0