Weird results when training Dixon model

martineastwood / penaltyblog

Library from http://pena.lt/y/blog for modelling and working with football (soccer) data

MIT License


Train with 100

df = fb.get_fixtures().iloc[:100]
print(df)
weight = pb.models.dixon_coles_weights(df["date"], 0.001)
clf = pb.models.DixonColesGoalModel(
    df["goals_home"], df["goals_away"], df["team_home"], df["team_away"], weight
)
clf.fit()

print(clf)
print(clf.predict("Olympiakos", "Asteras Tripolis"))

I could not work out why this happens. Could you take a look? Thanks, Zsolt

Thanks Zsolt - it looks like the optimiser is coming up with a value for rho that breaks the Dixon and Coles adjustment factor. I suspect this is because you're using quite a small amount of data: the model does not converge well, so the optimiser's output is quite volatile.
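For reference, the Dixon and Coles adjustment factor only rescales the probabilities of the four low-scoring results (0-0, 0-1, 1-0, 1-1), and it can turn negative when rho drifts too far from zero. Below is a minimal sketch of that factor in plain Python. The function name `dc_adjustment` and the example goal rates are illustrative assumptions, not penaltyblog's internals.

```python
def dc_adjustment(home_goals, away_goals, lam, mu, rho):
    """Dixon-Coles low-score adjustment factor (illustrative helper).

    lam and mu are the expected goals for the home and away side,
    rho is the dependence parameter.
    """
    if home_goals == 0 and away_goals == 0:
        return 1 - lam * mu * rho
    if home_goals == 0 and away_goals == 1:
        return 1 + lam * rho
    if home_goals == 1 and away_goals == 0:
        return 1 + mu * rho
    if home_goals == 1 and away_goals == 1:
        return 1 - rho
    return 1.0  # all other scorelines are left untouched


# Made-up rates for illustration only
lam, mu = 1.5, 1.0

# A small rho keeps every factor positive
print([dc_adjustment(h, a, lam, mu, -0.1) for h in (0, 1) for a in (0, 1)])

# A large negative rho makes the 0-1 factor negative (roughly -0.8),
# so the "probability" of that scoreline is no longer valid
print(dc_adjustment(0, 1, lam, mu, -1.2))
```

A negative factor gives a negative scoreline probability, which is one way an unconstrained rho could produce the odd predictions described above.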

Adding in the previous season's data as well helps the model converge better.

df = pd.concat(
    [
        pb.scrapers.FootballData("GRC Super League", "2021-2022").get_fixtures(),
        pb.scrapers.FootballData("GRC Super League", "2022-2023").get_fixtures(),
    ]
)[:-2]

weight = pb.models.dixon_coles_weights(df["date"], 0.001)
clf = pb.models.DixonColesGoalModel(
    df["goals_home"], df["goals_away"], df["team_home"], df["team_away"], weight
)
clf.fit()

print(clf)
print(clf.predict("Olympiakos", "Asteras Tripolis"))

df = pd.concat(
    [
        pb.scrapers.FootballData("GRC Super League", "2021-2022").get_fixtures(),
        pb.scrapers.FootballData("GRC Super League", "2022-2023").get_fixtures(),
    ]
)[:-1]

weight = pb.models.dixon_coles_weights(df["date"], 0.001)
clf = pb.models.DixonColesGoalModel(
    df["goals_home"], df["goals_away"], df["team_home"], df["team_away"], weight
)
clf.fit()

print(clf)
print(clf.predict("Olympiakos", "Asteras Tripolis"))

I'll look into adding constraints on the values rho is allowed to take, to help minimise this in the future.
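As a rough illustration of what such a constraint could look like, the Dixon and Coles model requires max(-1/lambda, -1/mu) <= rho <= min(1/(lambda*mu), 1) for every fixture, where lambda and mu are the expected home and away goals. The sketch below computes that range and clips a candidate rho into it. The helpers `rho_bounds` and `clip_rho` and the example rates are hypothetical, and this is not necessarily how penaltyblog would implement it.

```python
import math


def rho_bounds(rates):
    """Return the (lower, upper) range of rho valid for all fixtures.

    rates is an iterable of (lam, mu) expected-goal pairs, one per fixture.
    Hypothetical helper for illustration only.
    """
    lower, upper = -math.inf, 1.0
    for lam, mu in rates:
        lower = max(lower, -1.0 / lam, -1.0 / mu)
        upper = min(upper, 1.0 / (lam * mu))
    return lower, upper


def clip_rho(rho, rates):
    """Clamp rho into the valid range for the given fixtures."""
    lower, upper = rho_bounds(rates)
    return min(max(rho, lower), upper)


# Made-up expected goals for two fixtures
rates = [(1.5, 1.0), (2.0, 0.5)]
print(rho_bounds(rates))      # valid range for rho
print(clip_rho(-1.2, rates))  # an out-of-range value is pulled back to the lower bound
```

At the exact bounds the adjustment factor is zero for some scoreline, so a small margin inside the range is probably wanted in practice. Since the expected goals change while the model is being fitted, a fixed, conservative range passed to a bounded optimiser may be the simpler option.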


Weird results when training Dixon model #7
