Closed ttrodrigz closed 3 years ago
The problem with converting rankings into paired comparisons is that the derived paired comparisons are not independent observations. For example, if a ranking has apple > orange and orange > banana, then the outcome apple > banana is fixed, even though in an independent observation we might observe banana > apple. This violates the assumptions of the Bradley-Terry model, so the parameter estimates you get from a function designed to fit a Bradley-Terry model do not correspond to maximum likelihood estimates (of any model), and the standard errors are not valid.
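To make the dependence concrete, here is a minimal Python sketch (generic illustration, not code from either package) of what rank-breaking does to a single ranking: all three pairs are generated mechanically from one observation, so the third pair carries no new information.

```python
from itertools import combinations

# A single ranking: apple > orange > banana.
ranking = ["apple", "orange", "banana"]

# Rank-breaking: expand the ranking into every implied (winner, loser) pair.
# Because all pairs come from the same ranking, apple > banana is forced by
# apple > orange and orange > banana -- the pairs are not independent draws.
pairs = [(winner, loser) for winner, loser in combinations(ranking, 2)]
print(pairs)
# [('apple', 'orange'), ('apple', 'banana'), ('orange', 'banana')]
```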
In addition, converting to paired comparisons results in a huge sparse matrix of pairs, so the method can quickly become infeasible with a large number of items, and the expansion can also cause numerical issues.
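For a sense of scale, the number of implied pairs per complete ranking grows quadratically in the number of items; a quick Python illustration (just the combinatorics, not a benchmark of either package):

```python
from math import comb

# Each complete ranking of n items expands to C(n, 2) = n*(n-1)/2 pairwise
# comparisons, so the expanded data set grows quadratically with n.
for n in (10, 100, 1000):
    print(f"{n} items -> {comb(n, 2)} pairs per ranking")
# 10 items -> 45 pairs per ranking
# 100 items -> 4950 pairs per ranking
# 1000 items -> 499500 pairs per ranking
```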
Hope that helps.
Very helpful, I appreciate the response!
This is not a bug/issue, so I hope you don't mind me posting this here. Feel free to close this thread if this is annoying.
Before I learned about the Plackett-Luce model, my intuition (which I know is not unique) for analyzing ranking data was to arrange the data into pairwise comparisons and use Bradley-Terry. I recently came across a 2017 post on Stack Exchange pointing out that arranging the data via "rank-breaking" and then fitting Bradley-Terry is perhaps not a statistically valid thing to do.
I do not have the mathematical chops to intuit this on my own, and I couldn't find any resources online, so I was wondering whether it could be explained why Bradley-Terry ought to be avoided for ranking data (aside from the fact that Plackett-Luce is designed specifically for ranking data).
Thank you for all the work you've put into the BradleyTerry2 and PlackettLuce packages; they are incredibly useful.