efficiency of offered odds

ioannis12 commented 3 years ago

Hello,

First of all, congratulations on your analysis. I had a look at your data and have some questions regarding the efficiency of the bookmakers' closing odds. I noticed that you used the Brier score and log-likelihood to examine the relationship between the actual probabilities and the implied ones, and you conclude that the implied probabilities are very close to the actual ones. I used your data to check whether the frequencies of the winning odds agree with the implied probabilities (I used histograms to bin the odds and the gaussian_kde() method in scipy.stats for the density). For example, odds of 2.0 appear in the winning column just 37% of the time, whereas odds of 3.0 appear 11% and odds of 4.0 just 3% (I attached the density plot). Isn't this a sign of an inefficient market? Higher odds are underrepresented, i.e. the favorite-longshot bias.

I also noticed there are often big movements in the odds between open and close. I suppose they are driven by the amount wagered on each outcome, as bookmakers try to keep their books balanced. But it seems all bookmakers move in the same direction, without significant differences. Does this mean they all have pretty much the same book/exposure? Or is it that they just want to prevent arbitrage opportunities, so they adjust accordingly? If so, why would they care about arbitrage opportunities, as long as their books are balanced and they earn a certain profit from the margin?

Thanks again for your analysis, and I hope it didn't bother you that I used your dataset.

kind regards

iankotliar commented 3 years ago

I used brier score and log likelihood to compare different techniques for removing the bookmaker's vig, not to assess the efficiency of the odds. Efficiency can be better assessed after removing the bookmaker's vig. Looking at a gaussian plot doesn't really make sense when analyzing efficiency. What you want is a reliability plot.

If you plot just the reciprocal of the 'closing' decimal odds against the % of fights won, you should find that the values are below the 45 degree line because of the dealer's vig. I put closing in quotes because in some cases the final published odds are actually live/in-fight odds, so I took the odds from a few hours before the final published odds instead. Take a look at the reliability plots I made. I adjust for the dealer's vig so that in an efficient market you'd expect the points to lie on the 45 degree line, which they largely do.
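To make the idea of a reliability plot concrete, here is a minimal sketch of the underlying table: bin the implied probabilities and compare each bin's average with the observed win rate. It runs on synthetic data, not the dataset in this repo; the function name `reliability_table`, the 10 bins, and the flat 5% inflation used to mimic a vig are all assumptions made for illustration.

```python
import numpy as np

def reliability_table(implied_prob, won, n_bins=10):
    """Return (mean implied prob, observed win rate, count) per non-empty bin."""
    implied_prob = np.asarray(implied_prob, dtype=float)
    won = np.asarray(won, dtype=float)
    edges = np.linspace(0.0, 1.0, n_bins + 1)
    # digitize gives 1..n_bins for values in [0, 1); clip so 1.0 lands in the last bin
    idx = np.clip(np.digitize(implied_prob, edges) - 1, 0, n_bins - 1)
    rows = []
    for b in range(n_bins):
        mask = idx == b
        if mask.any():
            rows.append((implied_prob[mask].mean(), won[mask].mean(), int(mask.sum())))
    return rows

# Synthetic fights: outcomes are drawn from known "true" probabilities.
rng = np.random.default_rng(0)
true_p = rng.uniform(0.1, 0.9, size=20000)
won = rng.random(20000) < true_p

# Fair probabilities: points should sit close to the 45 degree line.
fair_rows = reliability_table(true_p, won)

# Raw implied probabilities inflated by a made-up 5% margin:
# win rates should fall below the implied values on average.
vig_rows = reliability_table(true_p * 1.05, won)

for p, w, n in vig_rows:
    print(f"implied={p:.3f}  won={w:.3f}  n={n}")
```

Plotting the first column against the second for each table gives the reliability plot. With the margin left in, the points drift below the diagonal; with fair probabilities they should track it, up to sampling noise.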

I did not really spend a lot of time analyzing the movement of odds but generally speaking they all move together in light of new information. I think the less sharp bookies tend to adjust their odds to match the sharper books. An analysis of line movements could be an interesting project.

ioannis12 commented 3 years ago

Thanks for the reply, I see what you mean. I looked at the reliability plots, and the low implied probabilities are below the diagonal line. The bookmaker's vig just makes such a big difference. I used the density plot to see how often each odds value occurs.

I just ran your function bb:

    import numpy as np

    def bb(x):
        inverted_odds = x**-1
        zz = np.sum(inverted_odds) - 1.0
        return (inverted_odds - zz)/(1-zz)

Is this 100% correct? I tested with bb(2) and got 0.66.

If the bookies change their odds in response to new information, then they don't necessarily keep their books balanced, or they have some other way to hedge their exposure.

iankotliar commented 3 years ago

You need to supply a vector of two decimal odds (one for each fighter) to bb. One example might be 1.9 and 1.9. Then each fighter has a breakeven win probability of 1/1.9 = 52.6%. The probabilities sum to about 105.3%, so the total vig is about 5.3%. bb returns estimated fair probabilities of 50% and 50%.
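As a quick check of the arithmetic, here is a small sketch that calls bb on a two-fighter vector and on a single value. The function body repeats the one quoted earlier in this thread; the 1.5 / 2.8 pair is a made-up example, not a fight from the dataset.

```python
import numpy as np

def bb(x):
    # Same arithmetic as the bb function quoted in this thread.
    inverted_odds = x ** -1.0
    zz = np.sum(inverted_odds) - 1.0          # total vig (overround)
    return (inverted_odds - zz) / (1 - zz)

# Two fighters at 1.9 each: raw implied probabilities 0.526 + 0.526.
even = bb(np.array([1.9, 1.9]))
print(even)            # [0.5 0.5]

# Hypothetical lopsided fight; the two outputs still sum to 1.
lopsided = bb(np.array([1.5, 2.8]))
print(lopsided, lopsided.sum())

# A single odd has no opponent, so the "vig" comes out as 0.5 - 1 = -0.5
# and the result (0.5 + 0.5) / 1.5 is not a meaningful probability.
single = bb(np.array([2.0]))
print(single)          # [0.66666667]
```

The last call reproduces the 0.66 from bb(2): with only one price, the sum of inverted odds is 0.5, so the function treats the missing fighter's share as a negative vig. Note that the subtraction spreads the vig equally across both fighters, so the outputs sum to 1 only when exactly two odds are supplied.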

I recommend you read my section in the read-me on how the vig works. I give examples and talk through everything.

iankotliar / UFC_Final
