JeffSackmann / tennis_wta

WTA Tennis Rankings, Results, and Stats

WTA match data : commit_8a2d756_thru_01_jan_2019 : 'Makarova bug' still propagating #17

Closed bazzaar closed 4 years ago

bazzaar commented 5 years ago

Hi, building on #14, the 'Makarova bug' is still present and occurs in the latest update (commit_8a2d756_thru_01_jan_2019). Thankfully only 10 match records are in error this time, from the following two tourneys:

| tourney_type | tourney_id | tourney_name | tourney_date | count |
|---|---|---|---|---|
| WTA Qual ITF | 2018-W-WITF-RSA-02A-2018 | Stellenbosch $15K | 2018-11-26 | 5 |
| WTA Qual ITF | 2018-W-WITF-RSA-03A-2018 | Stellenbosch $15K | 2018-12-03 | 5 |

These are all matches where the winner_id is given as #201505, which is the wrong id. The winner of these matches was 'Ekaterina Makarova 1996' (#221247), not Ekaterina Makarova (yob: 1988).
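For anyone wanting to reproduce the count, here is a minimal sketch. It assumes the match rows have already been loaded as dicts (for example with `csv.DictReader`) and that ids are read as strings; the helper name `count_suspect_wins` and the sample rows are made up for illustration and are not part of the repo.

```python
from collections import Counter

# Ids taken from the report above; everything else here is illustrative.
WRONG_ID = "201505"  # Ekaterina Makarova (yob 1988)
AFFECTED_TOURNEYS = {
    "2018-W-WITF-RSA-02A-2018",
    "2018-W-WITF-RSA-03A-2018",
}


def count_suspect_wins(rows):
    """Count, per tourney_id, the rows credited to the wrong winner_id."""
    return Counter(
        row["tourney_id"]
        for row in rows
        if row["winner_id"] == WRONG_ID
        and row["tourney_id"] in AFFECTED_TOURNEYS
    )


# Made-up sample rows (not real match records), just to exercise the helper.
sample = [
    {"tourney_id": "2018-W-WITF-RSA-02A-2018", "winner_id": "201505"},
    {"tourney_id": "2018-W-WITF-RSA-02A-2018", "winner_id": "999999"},
    {"tourney_id": "2018-W-WITF-RSA-03A-2018", "winner_id": "201505"},
]
counts = count_suspect_wins(sample)
print(dict(counts))
```

Run against the real file, each of the two tourneys should come back with a count of 5 if the report above is right.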

No loser_id values are affected in this update, since 'Ekaterina Makarova 1996' won both of these tourneys.

As in #14, the remedy is to change the winner_id, winner_name, etc. on the incorrect match records.
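A rough sketch of that remedy, in case it helps. It assumes rows are dicts with string ids, and that the caller supplies the corrected winner_* values from the player record; the function name `fix_winner` and the sample rows are hypothetical, and only the two ids and the two tourney_ids come from this report.

```python
WRONG_ID = "201505"  # Ekaterina Makarova (yob 1988)
RIGHT_ID = "221247"  # 'Ekaterina Makarova 1996'
AFFECTED_TOURNEYS = {
    "2018-W-WITF-RSA-02A-2018",
    "2018-W-WITF-RSA-03A-2018",
}


def fix_winner(rows, right_fields):
    """Return copies of rows with the winner_* fields corrected.

    right_fields maps column name -> corrected value; it must at least
    contain winner_id, and may carry winner_name etc. from the player record.
    """
    fixed = []
    for row in rows:
        row = dict(row)  # copy, so the input rows are left untouched
        if (
            row.get("winner_id") == WRONG_ID
            and row.get("tourney_id") in AFFECTED_TOURNEYS
        ):
            row.update(right_fields)
        fixed.append(row)
    return fixed


# Made-up rows: one affected record, one unrelated record with the same id.
rows = [
    {"tourney_id": "2018-W-WITF-RSA-02A-2018", "winner_id": "201505"},
    {"tourney_id": "some-other-tourney", "winner_id": "201505"},
]
fixed = fix_winner(rows, {"winner_id": RIGHT_ID})
print(fixed)
```

Restricting the change to the two tourney_ids keeps genuine wins by the 1988 player untouched.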

Hope this helps, bazzaar

JeffSackmann commented 4 years ago

Now fixed. The rankings, surprisingly, were all correct. Note that there's a third(!) Ekaterina Makarova, listed as age 18 (though I don't have the exact date of birth), who played a 2016 Czech ITF. I created a player record for her as well. She's never been ranked.

It is very possible that if Makarova 1996 comes back (she appears to be injured now), the bug will resurface in the 2020 results. I'll figure out a more permanent solution via my parser at that time.