JeffSackmann / tennis_wta

WTA Tennis Rankings, Results, and Stats

check for missing/inconsistent seeds #57

Closed by JeffSackmann 9 months ago

JeffSackmann commented 2 years ago

Seeds are listed on a match-by-match basis, but of course they don't change during an event. If a player is seeded 1st in one match, she will be seeded 1st in all of her matches at that tournament.

There are a lot of gaps in the older data -- e.g. a player's seed is given in only one of her matches at a tournament and left blank in the rest.

It's less common, though probably still present, that a player is listed with multiple seeds in the same event.

Finally, we should check that only one player has a given seed per event. In some older events, there were foreign and domestic seeds (e.g. '1' and '1F'), and sometimes the 'F' gets dropped, leaving us with a confusing situation.

First, I need to check the existing data for these issues. Second, the same check could then serve as a pre-commit check that I run against new data as it comes in.
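A minimal sketch of the three checks described above (gaps, multiple seeds for one player, one seed shared by several players) might look like the following. It assumes the match files have columns named `tourney_id`, `winner_id`, `winner_seed`, `loser_id`, and `loser_seed`; those names, along with `find_seed_issues` and `main`, are illustrative assumptions rather than anything fixed by this issue. Seeds are compared as raw strings, so '1' and '1F' count as different seeds, and a dropped 'F' shows up as a shared seed.

```python
import csv
from collections import defaultdict

# Assumed column names; adjust these if the CSV headers differ.
SIDES = (("winner_id", "winner_seed"), ("loser_id", "loser_seed"))


def find_seed_issues(rows):
    """Return (tourney_id, kind, detail) tuples for seed problems in rows."""
    # seeds[tourney][player] -> set of seed strings seen ("" means blank)
    seeds = defaultdict(lambda: defaultdict(set))
    for row in rows:
        for id_col, seed_col in SIDES:
            seed = (row.get(seed_col) or "").strip()
            seeds[row["tourney_id"]][row[id_col]].add(seed)

    issues = []
    for tourney, players in seeds.items():
        holders = defaultdict(set)  # seed -> players listed with that seed
        for player, seen in players.items():
            known = seen - {""}
            # Seeded in some matches but blank in others.
            if known and "" in seen:
                issues.append((tourney, "gap", f"player {player}: seed blank in some matches"))
            # More than one distinct seed for the same player.
            if len(known) > 1:
                issues.append((tourney, "multiple", f"player {player}: seeds {sorted(known)}"))
            for seed in known:
                holders[seed].add(player)
        # Same seed attached to more than one player.
        for seed, who in holders.items():
            if len(who) > 1:
                issues.append((tourney, "shared", f"seed {seed}: players {sorted(who)}"))
    return issues


def main(paths):
    """Check each CSV file; return 1 if any issue was found, else 0."""
    found = False
    for path in paths:
        with open(path, newline="", encoding="utf-8") as f:
            for issue in find_seed_issues(csv.DictReader(f)):
                print(path, *issue, sep="\t")
                found = True
    return 1 if found else 0
```

To use it as a pre-commit check, one option is to call `sys.exit(main(sys.argv[1:]))` from a small script and pass the changed CSV files as arguments, so that a non-zero exit status blocks the commit.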

amirbachar commented 2 years ago

Do you still need help with this? If so, I can provide lists of the affected matches.