Open VincentLa opened 4 years ago
First draft LP optimizer code committed. Will update again once live estimates are fed in.
Nice. Super minor comments: I did a bit of directory re-org to keep things a bit more organized. Hope that doesn't break anything?
optimizer.py: let's keep all file names lowercase?
df = pd.read_csv(r'data\raw\draftkings_salaries\DKNBASalaries_20200806.csv'): let's use os.path.join. This looks like a Windows path? I'm on mac/linux. Windows uses '\' whereas mac/linux uses '/' to separate directories in a path. However, if we use os.path.join it will automatically use the right separator for the operating system.
Also added a requirements.txt to control which Python packages we will be using. What are you using to compile code?
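As a minimal sketch of that suggestion, the salary path could be built from its parts with os.path.join. The helper name salaries_path and its root argument are hypothetical; the directory and file names are taken from the path quoted above.

```python
import os

def salaries_path(date_str, root="data"):
    """Build the salary CSV path with the separator of the current OS.

    Hypothetical helper: only the directory and file naming come from the thread.
    """
    filename = f"DKNBASalaries_{date_str}.csv"
    return os.path.join(root, "raw", "draftkings_salaries", filename)

path = salaries_path("20200806")
print(path)
```

The resulting string could then be passed to pd.read_csv in place of the hard-coded Windows path.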
@KGe001 I got a basic scraper up and running and put the box score stats into here: https://github.com/VincentLa/draftkings/blob/master/data/raw/nba_box_score_stats/nba_box_score_stats_20200805.csv
It includes each player and their pertinent stats; the final column, draftkings_points, is the final calculated DraftKings points, aggregating all their stats.
Do you want to see if you can work with this in your optimizer? Tomorrow, I can pull down the stats for today's games and since we actually have salaries for 2020-08-06 we can actually run on real data.
Cool - I can easily sub in the points column and run the optimizer, but since this is just the box score for 20200805, I don't think it has all the players, since not everyone plays every day. We would need to stitch together several days of data and generate a calculated EV (expected value) per player.
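One way that stitching could look, as a rough sketch: pool the rows from several daily box score files and use each player's mean draftkings_points as a naive EV. The column name player is an assumption (only draftkings_points is named in this thread), the sample rows are made up for illustration, and a plain average is only a placeholder for a real projection.

```python
from collections import defaultdict
from statistics import mean

def naive_ev(daily_rows):
    """Average draftkings_points per player across several game days.

    daily_rows: iterable of lists of dicts, one list per game day.
    'player' is an assumed column name; 'draftkings_points' is from the CSV.
    """
    points = defaultdict(list)
    for rows in daily_rows:
        for row in rows:
            points[row["player"]].append(float(row["draftkings_points"]))
    # Players who sat out a day simply contribute fewer observations.
    return {player: mean(values) for player, values in points.items()}

# Illustrative data only, not real box scores.
day1 = [
    {"player": "A", "draftkings_points": "40.0"},
    {"player": "B", "draftkings_points": "20.0"},
]
day2 = [{"player": "A", "draftkings_points": "30.0"}]

ev = naive_ev([day1, day2])
print(ev)
```

In practice each day's rows could come from csv.DictReader (or a pandas concat and groupby) over the files in data/raw/nba_box_score_stats.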
Ok added in several box scores starting from the start of the season July 30th, 2020!
@KGe001 this is minor but can we put https://github.com/VincentLa/draftkings/blob/master/mappingtable.csv into /data/processed/mapping_table.csv?
Moved - I've also added a folder to save historical problem runs - Not sure how useful having all the problem data will be but this lets us revisit/recreate the solution in the future
@KGe001 -- I updated a bunch of stuff to point to the new DraftKings Salary data and cleaned up the Player Map table. Just some stylistic things:
Because I changed up some things, I think I may have messed up some of what optimizer.py depends on? Getting an error like:
(draftkings) vincentla@Vincents-MacBook-Pro draftkings (master) $ python -m src.optimizer
Welcome to the CBC MILP Solver
Version: 2.9.0
Build Date: Feb 12 2015
command line - /Users/vincentla/.pyenv/versions/draftkings/lib/python3.7/site-packages/pulp/apis/../solverdir/cbc/osx/64/cbc /var/folders/58/15bsk5_n0tv03tgn3hbbm9d40000gn/T/e21caaac17794832be60b64eed412a90-pulp.mps max ratio None allow None threads None presolve on strong None gomory on knapsack on probing on branch printingOptions all solution /var/folders/58/15bsk5_n0tv03tgn3hbbm9d40000gn/T/e21caaac17794832be60b64eed412a90-pulp.sol (default strategy 1)
At line 2 NAME MODEL
At line 3 ROWS
At line 14 COLUMNS
Bad image at line 32 < X0000002 C0000000 nan >
Bad image at line 118 < X0000014 C0000000 nan >
Bad image at line 131 < X0000016 C0000000 nan >
Bad image at line 157 < X0000020 C0000000 nan >
Bad image at line 214 < X0000028 C0000000 nan >
Bad image at line 234 < X0000031 C0000000 nan >
Bad image at line 271 < X0000036 C0000000 nan >
Bad image at line 300 < X0000040 C0000000 nan >
Bad image at line 430 < X0000057 C0000000 nan >
Bad image at line 498 < X0000067 C0000000 nan >
Bad image at line 503 < X0000068 C0000000 nan >
Bad image at line 520 < X0000071 C0000000 nan >
Bad image at line 593 < X0000081 C0000000 nan >
Bad image at line 661 < X0000091 C0000000 nan >
Bad image at line 672 < X0000093 C0000000 nan >
Bad image at line 748 < X0000103 C0000000 nan >
Bad image at line 753 < X0000104 C0000000 nan >
Bad image at line 828 < X0000114 C0000000 nan >
Bad image at line 983 < X0000135 C0000000 nan >
Bad image at line 988 < X0000136 C0000000 nan >
Bad image at line 993 < X0000137 C0000000 nan >
Bad image at line 1083 < X0000150 C0000000 nan >
Bad image at line 1297 < X0000179 C0000000 nan >
Bad image at line 1310 < X0000181 C0000000 nan >
At line 1314 RHS
Bad image at line 1315 < RHS C0000000 nan >
At line 1324 BOUNDS
At line 1507 ENDATA
Problem MODEL has 9 rows, 182 columns and 734 elements
Coin0008I MODEL read with 25 errors
There were 25 errors on input
String of None is illegal for double parameter ratioGap value remains 0
String of None is illegal for double parameter allowableGap value remains 0
String of None is illegal for integer parameter threads value remains 0
String of None is illegal for integer parameter strongBranching value remains 5
Option for gomoryCuts changed from ifmove to on
Option for knapsackCuts changed from ifmove to on
** Current model not valid
Option for printingOptions changed from normal to all
** Current model not valid
No match for /var/folders/58/15bsk5_n0tv03tgn3hbbm9d40000gn/T/e21caaac17794832be60b64eed412a90-pulp.sol - ? for list of commands
Total time (CPU seconds): 0.00 (Wallclock seconds): 0.00
Traceback (most recent call last):
  File "/Users/vincentla/.pyenv/versions/3.7.4/lib/python3.7/runpy.py", line 193, in _run_module_as_main
    "__main__", mod_spec)
  File "/Users/vincentla/.pyenv/versions/3.7.4/lib/python3.7/runpy.py", line 85, in _run_code
    exec(code, run_globals)
  File "/Users/vincentla/git/draftkings/src/optimizer.py", line 91, in <module>
    main()
  File "/Users/vincentla/git/draftkings/src/optimizer.py", line 74, in main
    status = model.solve()
  File "/Users/vincentla/.pyenv/versions/draftkings/lib/python3.7/site-packages/pulp/pulp.py", line 1890, in solve
    status = solver.actualSolve(self, **kwargs)
  File "/Users/vincentla/.pyenv/versions/draftkings/lib/python3.7/site-packages/pulp/apis/coin_api.py", line 101, in actualSolve
    return self.solve_CBC(lp, **kwargs)
  File "/Users/vincentla/.pyenv/versions/draftkings/lib/python3.7/site-packages/pulp/apis/coin_api.py", line 159, in solve_CBC
    raise PulpSolverError("Pulp: Error while executing "+self.path)
pulp.apis.core.PulpSolverError: Pulp: Error while executing /Users/vincentla/.pyenv/versions/draftkings/lib/python3.7/site-packages/pulp/apis/../solverdir/cbc/osx/64/cbc
Any chance you can review and see if you can debug? You might be more familiar with the package.
Should be fixed - all fields used in the optimization have to be defined. In this case, there were players with a NaN salary. Added a line to drop players with missing salaries.
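A sketch of the kind of guard that could run before model.solve(), not necessarily the line that was actually added: split out any player whose numeric inputs are missing or NaN so that no NaN coefficient reaches the solver. The field names salary and points and the helper split_missing are hypothetical.

```python
import math

def split_missing(players, fields=("salary", "points")):
    """Separate players whose numeric fields are all finite from those with gaps.

    Field names are assumptions; adjust them to the real column names.
    """
    usable, dropped = [], []
    for p in players:
        values = [p.get(f) for f in fields]
        ok = all(isinstance(v, (int, float)) and math.isfinite(v) for v in values)
        (usable if ok else dropped).append(p)
    return usable, dropped

# Illustrative data only.
players = [
    {"name": "A", "salary": 9000.0, "points": 45.5},
    {"name": "B", "salary": float("nan"), "points": 30.0},
]

usable, dropped = split_missing(players)
print([p["name"] for p in dropped])
```

With a pandas DataFrame, DataFrame.dropna with a subset of the relevant columns does the same filtering; logging the dropped names makes gaps in the salary or player mapping data easier to spot.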
This issue is dependent on two others.