dmaclean / dfs-python

Tools for DFS data collection and projection creation.

Issues with overlapping injuries #30

Closed dmaclean closed 10 years ago

dmaclean commented 10 years ago

It looks like we may not be fully correcting each injury for a player when the corrections overlap. Here's an example from the console output when correcting CJ Miles:

Skipping 2007 - only interested in 2013
Skipping 2008 - only interested in 2013
Skipping 2009 - only interested in 2013
Skipping 2010 - only interested in 2013
Skipping 2011 - only interested in 2013
Skipping 2012 - only interested in 2013
Processing season 2013
200 for /players/m/milescj01/gamelog/2014
INFO:root:Injury (2014-02-10 - 2014-03-02) is a multi-day injury where the player actually played in the middle.Splitting up the injuries to (2014-02-10 - 2014-02-12) and (2014-02-13 - 2014-03-02)
INFO:root:Injury (2014-02-10 - 2014-03-02) is a multi-day injury where the player actually played in the middle.Splitting up the injuries to (2014-02-10 - 2014-02-18) and (2014-02-19 - 2014-03-02)
INFO:root:Injury (2014-02-10 - 2014-03-02) is a multi-day injury where the player actually played in the middle.Splitting up the injuries to (2014-02-10 - 2014-02-19) and (2014-02-20 - 2014-03-02)
200 for /players/m/milescj01/splits/2014
200 for /players/m/millean02.html
Skipping 1999 - only interested in 2013
Skipping 2000 - only interested in 2013
Skipping 2001 - only interested in 2013
Skipping 2002 - only interested in 2013

dmaclean commented 10 years ago

Another example:

Processing season 2013
200 for /players/n/nashst01/gamelog/2014
INFO:root:Injury (2014-02-09 - 2014-03-04) is a multi-day injury where the player actually played on the first day.Changing first day to 2014-02-10
INFO:root:Injury (2014-02-09 - 2014-03-04) is a multi-day injury where the player actually played in the middle.Splitting up the injuries to (2014-02-09 - 2014-02-11) and (2014-02-12 - 2014-03-04)
200 for /players/n/nashst01/splits/2014

dmaclean commented 10 years ago

Fixed.

Added a function in InjuryManager that sweeps through all injuries and determines, based on a player’s game logs, whether that injury is correct. If not, it gets corrected. A change was also made to fix_injury_entry so all conditions are checked instead of just having an else clause. This change has been incorporated into the all_tasks script so it will be run after all new injuries are established.
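For illustration only, a sweep like this could handle every game date inside an injury in a single pass, so that overlapping corrections build on each other instead of each one re-splitting the original range (as in the CJ Miles output above). This is a minimal sketch, not the code in InjuryManager: `split_injury` is a hypothetical helper, and it assumes both endpoints of an injury are inclusive days missed.

```python
from datetime import date, timedelta


def split_injury(start, end, game_dates):
    """Return the sub-ranges of [start, end] that contain no game dates.

    Hypothetical helper: assumes start and end are both days missed
    (inclusive) and that game_dates are days the player actually played.
    """
    one_day = timedelta(days=1)
    ranges = []
    current = start  # first day of the sub-range being built
    for played in sorted(set(game_dates)):
        if played < start or played > end:
            continue  # game is outside the injury, nothing to correct
        if played > current:
            # Close the sub-range the day before the game.
            ranges.append((current, played - one_day))
        # The next sub-range can start no earlier than the day after the game.
        current = played + one_day
    if current <= end:
        ranges.append((current, end))
    return ranges


# Three games inside one recorded injury, handled in one pass.
pieces = split_injury(
    date(2014, 2, 10),
    date(2014, 3, 2),
    [date(2014, 2, 13), date(2014, 2, 19), date(2014, 2, 20)],
)
```

A game on the first day simply moves the start forward, and a game in the middle produces two ranges, so every condition goes through the same loop rather than through separate branches.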

Files modified:

dmaclean commented 10 years ago

Need to add a cleanup mechanism for duplicate injuries. There are lots of those hanging around.

We'll define a duplicate injury as:

injury1.player_id == injury2.player_id and injury1.injury_date == injury2.injury_date and injury1.return_date == injury2.return_date
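A rough sketch of how that definition could drive the cleanup, assuming injuries are loaded into memory first. The `Injury` tuple and its `id` field are hypothetical stand-ins for whatever the injury model actually looks like; only `player_id`, `injury_date` and `return_date` come from the definition above.

```python
from collections import namedtuple
from datetime import date

# Hypothetical record type; "id" is assumed to be the row's primary key.
Injury = namedtuple("Injury", ["id", "player_id", "injury_date", "return_date"])


def find_duplicates(injuries):
    """Return every injury whose (player, start, end) was already seen.

    The first occurrence of each key is kept; later ones are reported
    as duplicates so the caller can delete them.
    """
    seen = set()
    duplicates = []
    for injury in injuries:
        key = (injury.player_id, injury.injury_date, injury.return_date)
        if key in seen:
            duplicates.append(injury)
        else:
            seen.add(key)
    return duplicates


injuries = [
    Injury(1, "milescj01", date(2014, 2, 10), date(2014, 2, 12)),
    Injury(2, "milescj01", date(2014, 2, 10), date(2014, 2, 12)),  # same as id 1
    Injury(3, "milescj01", date(2014, 2, 13), date(2014, 3, 2)),
    Injury(4, "nashst01", date(2014, 2, 10), date(2014, 2, 12)),   # different player
]
to_delete = [injury.id for injury in find_duplicates(injuries)]
```

Keeping the first occurrence is an arbitrary choice here; any one row per key would satisfy the definition.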

Also, running the process is currently yielding this error:

INFO:root:Injury (2014-01-10 - 2014-01-25) is a multi-day injury where the player actually played in the middle.Splitting up the injuries to (2014-01-10 - 2014-01-25) and (2014-01-20 - 2014-01-25)
Traceback (most recent call last):
  File "launcher.py", line 17, in <module>
    injury_manager.fix_injuries(season)
  File "/Users/ap/Desktop/basketballreference/models/injury_manager.py", line 160, in fix_injuries
    cursor.close()
  File "/Library/Python/2.7/site-packages/mysql/connector/cursor.py", line 217, in close
    raise errors.InternalError("Unread result found.")
mysql.connector.errors.InternalError: Unread result found.

dmaclean commented 10 years ago

Fixed the issue with unread results in the cursor. This was happening because we’d try to perform an update on an injury while the cursor still had unread results. The solution is to record all injury/game-date pairs (game dates that fall during the alleged injury) and deal with them once all results have been read.
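The pattern looks roughly like the sketch below: read every conflicting pair first, close the cursor, and only then issue updates. This is not the code in injury_manager.py. It uses the standard-library `sqlite3` module so it can run on its own, the table and column names (`injuries`, `game_logs`, `id`, `game_date`) are assumptions, and only the "played on the first day" correction is shown. With mysql.connector the placeholders would be `%s` rather than `?`, and the date arithmetic would need MySQL's own syntax.

```python
import sqlite3


def find_conflicts(conn):
    """Return (injury_id, game_date) pairs where a game falls inside an injury."""
    cursor = conn.cursor()
    cursor.execute(
        "SELECT i.id, g.game_date "
        "FROM injuries i JOIN game_logs g ON g.player_id = i.player_id "
        "WHERE g.game_date BETWEEN i.injury_date AND i.return_date "
        "ORDER BY i.id, g.game_date"
    )
    pairs = cursor.fetchall()  # read every row before issuing any update
    cursor.close()
    return pairs


def fix_first_day(conn, pairs):
    """Move injury_date forward one day when the player played on that day."""
    cursor = conn.cursor()
    for injury_id, game_date in pairs:
        cursor.execute(
            "UPDATE injuries SET injury_date = date(injury_date, '+1 day') "
            "WHERE id = ? AND injury_date = ?",
            (injury_id, game_date),
        )
    conn.commit()
    cursor.close()


def make_demo_db():
    """Build a tiny in-memory database with one injury and two games."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE injuries (id INTEGER PRIMARY KEY, player_id TEXT, "
        "injury_date TEXT, return_date TEXT)"
    )
    conn.execute("CREATE TABLE game_logs (player_id TEXT, game_date TEXT)")
    conn.execute(
        "INSERT INTO injuries VALUES (1, 'nashst01', '2014-02-09', '2014-03-04')"
    )
    conn.executemany(
        "INSERT INTO game_logs VALUES (?, ?)",
        [("nashst01", "2014-02-09"), ("nashst01", "2014-03-10")],
    )
    return conn


conn = make_demo_db()
pairs = find_conflicts(conn)  # result set fully read, cursor already closed
fix_first_day(conn, pairs)    # safe to update now
```

With MySQL Connector/Python, requesting a buffered cursor (`cursor(buffered=True)`) is another way to avoid the "Unread result found" error, though collecting the pairs first keeps the read and write phases clearly separated.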

Also added duplicate removal.

Files modified: