VincentLa / draftkings

Create a Linear Program that finds the maximum points given salary and position constraints #3

Open VincentLa opened 4 years ago

VincentLa commented 4 years ago

This issue is dependent on two others.

  1. Getting the NBA Data to get raw stats (https://github.com/VincentLa/draftkings/issues/1)
  2. Converting the raw stats to DraftKings points (https://github.com/VincentLa/draftkings/issues/2)

KGe001 commented 4 years ago

First draft of the LP optimizer code committed. Will update again once live estimates are fed in.
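
For reference, the problem in the issue title can be written down as below. This is only a sketch of the general shape, not the committed code: the symbols are assumptions chosen for illustration, where x_i says whether player i is picked, p_i is the player's projected DraftKings points, s_i is the salary, B is the salary cap, N is the roster size, P_k is the set of players eligible for position k, and m_k is how many players that position needs.

```latex
\begin{aligned}
\max_{x} \quad & \sum_{i} p_i x_i \\
\text{s.t.} \quad & \sum_{i} s_i x_i \le B \\
& \sum_{i} x_i = N \\
& \sum_{i \in P_k} x_i \ge m_k \quad \text{for each position } k \\
& x_i \in \{0, 1\} \quad \text{for each player } i
\end{aligned}
```

Because each x_i is binary, this is strictly an integer program rather than a pure LP, which is why a MILP solver is the natural fit.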

VincentLa commented 4 years ago

Nice. Super minor comments: I reorganized the directories a bit to keep things tidier. Hope that doesn't break anything?

Also added a requirements.txt to pin the Python packages the project uses. What are you using to run the code?

VincentLa commented 4 years ago

@KGe001 I got a basic scraper up and running and put the box score stats into here: https://github.com/VincentLa/draftkings/blob/master/data/raw/nba_box_score_stats/nba_box_score_stats_20200805.csv

It includes each player and their pertinent stats; the final column, draftkings_points, is the calculated DraftKings points total aggregated from those stats.

Do you want to see if you can work with this in your optimizer? Tomorrow I can pull down the stats for today's games, and since we have salaries for 2020-08-06, we can run on real data.

KGe001 commented 4 years ago

Cool - I can easily sub in the points column and run the optimizer, but since this is just the box score for 20200805, I don't think it has all the players, since not everyone plays every day. We would need to stitch together several days of data and generate a calculated EV (expected value).
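
One simple way to do that stitching, as a rough sketch: read each day's box score and average draftkings_points per player over the games they actually played. The `player` column name, the `expected_points` helper, and the sample rows are assumptions for illustration; only `draftkings_points` comes from the CSV described above.

```python
import csv
import io
from collections import defaultdict
from statistics import mean


def expected_points(daily_csv_texts):
    """Average draftkings_points per player across several daily box scores."""
    points_by_player = defaultdict(list)
    for text in daily_csv_texts:
        for row in csv.DictReader(io.StringIO(text)):
            # "player" is a hypothetical column name; adjust to the real header.
            points_by_player[row["player"]].append(float(row["draftkings_points"]))
    # A player who sat out a day simply contributes no row for that day.
    return {player: mean(pts) for player, pts in points_by_player.items()}


# Made-up sample data standing in for two daily box score files.
day_1 = "player,draftkings_points\nPlayer A,40.0\nPlayer B,22.5\n"
day_2 = "player,draftkings_points\nPlayer A,50.0\nPlayer C,31.0\n"
ev = expected_points([day_1, day_2])
```

A plain mean is the simplest possible estimate; it ignores minutes, matchups, and recency, so it is a placeholder for whatever projection gets fed in later.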

VincentLa commented 4 years ago

Ok added in several box scores starting from the start of the season July 30th, 2020!

VincentLa commented 4 years ago

@KGe001 this is minor but can we put https://github.com/VincentLa/draftkings/blob/master/mappingtable.csv into /data/processed/mapping_table.csv?

KGe001 commented 4 years ago

Moved. I've also added a folder to save historical problem runs. Not sure how useful having all the problem data will be, but this lets us revisit or recreate a solution in the future.

VincentLa commented 4 years ago

@KGe001 -- I updated a bunch of stuff to point to the new DraftKings Salary data and cleaned up the Player Map table. Just some stylistic things:

Because I changed a few things, I think I may have broken something optimizer.py depends on. I'm getting this error:

(draftkings) vincentla@Vincents-MacBook-Pro draftkings (master) $ python -m src.optimizer
Welcome to the CBC MILP Solver 
Version: 2.9.0 
Build Date: Feb 12 2015 

command line - /Users/vincentla/.pyenv/versions/draftkings/lib/python3.7/site-packages/pulp/apis/../solverdir/cbc/osx/64/cbc /var/folders/58/15bsk5_n0tv03tgn3hbbm9d40000gn/T/e21caaac17794832be60b64eed412a90-pulp.mps max ratio None allow None threads None presolve on strong None gomory on knapsack on probing on branch printingOptions all solution /var/folders/58/15bsk5_n0tv03tgn3hbbm9d40000gn/T/e21caaac17794832be60b64eed412a90-pulp.sol (default strategy 1)
At line 2 NAME          MODEL
At line 3 ROWS
At line 14 COLUMNS
Bad image at line 32 <     X0000002  C0000000   nan >
Bad image at line 118 <     X0000014  C0000000   nan >
Bad image at line 131 <     X0000016  C0000000   nan >
Bad image at line 157 <     X0000020  C0000000   nan >
Bad image at line 214 <     X0000028  C0000000   nan >
Bad image at line 234 <     X0000031  C0000000   nan >
Bad image at line 271 <     X0000036  C0000000   nan >
Bad image at line 300 <     X0000040  C0000000   nan >
Bad image at line 430 <     X0000057  C0000000   nan >
Bad image at line 498 <     X0000067  C0000000   nan >
Bad image at line 503 <     X0000068  C0000000   nan >
Bad image at line 520 <     X0000071  C0000000   nan >
Bad image at line 593 <     X0000081  C0000000   nan >
Bad image at line 661 <     X0000091  C0000000   nan >
Bad image at line 672 <     X0000093  C0000000   nan >
Bad image at line 748 <     X0000103  C0000000   nan >
Bad image at line 753 <     X0000104  C0000000   nan >
Bad image at line 828 <     X0000114  C0000000   nan >
Bad image at line 983 <     X0000135  C0000000   nan >
Bad image at line 988 <     X0000136  C0000000   nan >
Bad image at line 993 <     X0000137  C0000000   nan >
Bad image at line 1083 <     X0000150  C0000000   nan >
Bad image at line 1297 <     X0000179  C0000000   nan >
Bad image at line 1310 <     X0000181  C0000000   nan >
At line 1314 RHS
Bad image at line 1315 <     RHS       C0000000   nan >
At line 1324 BOUNDS
At line 1507 ENDATA
Problem MODEL has 9 rows, 182 columns and 734 elements
Coin0008I MODEL read with 25 errors
There were 25 errors on input
String of None is illegal for double parameter ratioGap value remains 0
String of None is illegal for double parameter allowableGap value remains 0
String of None is illegal for integer parameter threads value remains 0
String of None is illegal for integer parameter strongBranching value remains 5
Option for gomoryCuts changed from ifmove to on
Option for knapsackCuts changed from ifmove to on
** Current model not valid
Option for printingOptions changed from normal to all
** Current model not valid
No match for /var/folders/58/15bsk5_n0tv03tgn3hbbm9d40000gn/T/e21caaac17794832be60b64eed412a90-pulp.sol - ? for list of commands
Total time (CPU seconds):       0.00   (Wallclock seconds):       0.00

Traceback (most recent call last):
  File "/Users/vincentla/.pyenv/versions/3.7.4/lib/python3.7/runpy.py", line 193, in _run_module_as_main
    "__main__", mod_spec)
  File "/Users/vincentla/.pyenv/versions/3.7.4/lib/python3.7/runpy.py", line 85, in _run_code
    exec(code, run_globals)
  File "/Users/vincentla/git/draftkings/src/optimizer.py", line 91, in <module>
    main()
  File "/Users/vincentla/git/draftkings/src/optimizer.py", line 74, in main
    status = model.solve()
  File "/Users/vincentla/.pyenv/versions/draftkings/lib/python3.7/site-packages/pulp/pulp.py", line 1890, in solve
    status = solver.actualSolve(self, **kwargs)
  File "/Users/vincentla/.pyenv/versions/draftkings/lib/python3.7/site-packages/pulp/apis/coin_api.py", line 101, in actualSolve
    return self.solve_CBC(lp, **kwargs)
  File "/Users/vincentla/.pyenv/versions/draftkings/lib/python3.7/site-packages/pulp/apis/coin_api.py", line 159, in solve_CBC
    raise PulpSolverError("Pulp: Error while executing "+self.path)
pulp.apis.core.PulpSolverError: Pulp: Error while executing /Users/vincentla/.pyenv/versions/draftkings/lib/python3.7/site-packages/pulp/apis/../solverdir/cbc/osx/64/cbc

Any chance you can review and see if you can debug? You might be more familiar with the package.

KGe001 commented 4 years ago

Should be fixed. Every field used in the optimization has to be defined; in this case, some players had a NaN salary, so I added a line to drop players with missing salaries.
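
The `nan` entries flagged as "Bad image" in the log above are what a missing value looks like once it reaches the solver's input file. A minimal sketch of the kind of filter described here, using only the standard library; it is not the actual change in optimizer.py, and the `drop_missing_salaries` helper, the dict layout, and the sample players are assumptions for illustration.

```python
import math


def drop_missing_salaries(players):
    """Keep only players whose salary is a real, finite number."""
    kept = []
    for player in players:
        salary = player.get("salary")
        # Rejects None, NaN, and infinities before they reach the model.
        if isinstance(salary, (int, float)) and math.isfinite(salary):
            kept.append(player)
    return kept


# Made-up sample rows: one valid, one NaN salary, one missing salary.
players = [
    {"name": "Player A", "salary": 9800.0, "points": 45.0},
    {"name": "Player B", "salary": float("nan"), "points": 22.5},
    {"name": "Player C", "salary": None, "points": 31.0},
]
clean = drop_missing_salaries(players)
```

If the data lives in a pandas DataFrame, `DataFrame.dropna(subset=[...])` on the salary column does the same job in one call.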