DimaKudosh / pydfs-lineup-optimizer

Daily Fantasy Sports lineup optimizer for all popular daily fantasy sports sites
MIT License
409 stars 155 forks

Enhancement to generating top expected value lineup #29

Open rogerfitz opened 5 years ago

rogerfitz commented 5 years ago

Thanks for creating this! I wanted to share a pruning algorithm that ensures maximum expected value and may be useful for players wanting the top lineup optimized towards one category. This can be an alternative approach to what you mentioned here https://pydfs-lineup-optimizer.readthedocs.io/en/latest/performance-and-optimization.html

The algorithm prunes the player pool while preserving the maximum expected value, as follows:
    if a player has a higher predicted score and a lower salary than other players at the same position, remove those higher-salary, lower-scoring players
Additionally:
    if the position still has open slots, keep as many of those lower-scoring players as there are slots left,
    provided each has a higher predicted point total than the next best option
Here's an example:
    Odell Beckham  salary 8000 predicted 19
    Alshon Jeffrey salary 7900 predicted 20
    Mohammed Sanu  salary 5100 predicted 15

Because there are 3+1 spots for WR (including FLEX), we can't remove Odell Beckham: he could still fill one of those spots and add more to our point total than the next best option. If there were only 1 WR slot, we'd be safe to remove him.
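The rule above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the code from the linked recaps or from pydfs-lineup-optimizer: the `Player` class, the `dominates` and `prune` helpers, and the `slots` mapping (position to number of roster spots a player of that position can fill, FLEX included) are all hypothetical names. A player is dropped only when at least as many same-position players beat him on both salary and projection as there are spots he could occupy.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Player:
    name: str
    position: str
    salary: int
    projection: float


def dominates(a: Player, b: Player) -> bool:
    """True if a plays b's position, costs no more, projects no fewer
    points, and is strictly better on at least one of the two."""
    return (
        a.position == b.position
        and a.salary <= b.salary
        and a.projection >= b.projection
        and (a.salary < b.salary or a.projection > b.projection)
    )


def prune(pool: list[Player], slots: dict[str, int]) -> list[Player]:
    """Keep a player unless enough dominating players exist to fill
    every slot he could occupy (slots[position], assumed to include FLEX)."""
    kept = []
    for p in pool:
        n_dominators = sum(1 for q in pool if dominates(q, p))
        if n_dominators < slots[p.position]:
            kept.append(p)
    return kept


POOL = [
    Player("Odell Beckham", "WR", 8000, 19.0),
    Player("Alshon Jeffrey", "WR", 7900, 20.0),
    Player("Mohammed Sanu", "WR", 5100, 15.0),
]

# 3 WR spots + FLEX: only Jeffrey dominates Beckham, so Beckham stays.
print([p.name for p in prune(POOL, {"WR": 4})])
# A single WR spot: one dominator is enough, so Beckham is removed.
print([p.name for p in prune(POOL, {"WR": 1})])
```

With four WR-eligible spots all three receivers survive; with one spot only Jeffrey and Sanu remain, which matches the example.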

I can create a PR some day if you're interested. I've used this over at https://www.blog.sportsdatadirect.com/2018/08/22/draftkings-2018-millionaire-maker-recaps/ and have consistently been able to find the top possible lineups using my pruned player pool. Using your PuLP solver version, I got matching results with my pruning algorithm in 6 out of the 6 weeks I tested.

DimaKudosh commented 5 years ago

It would be cool if you created a PR with your algorithm. Honestly, I don't understand the second part of your issue description, the part about Beckham.

edoublin commented 5 years ago

Are there specific pros you follow on DraftKings or study lineups from? That website is really great, I had never seen it before. I'm wondering how he determines who counts as a professional player in his data.

sansbacon commented 5 years ago

Pruning is useful to speed up the optimizer by narrowing the search space. The fewer choices the optimizer has to consider, the faster it will run. We know from experience that a QB with a projection of 15 is not going to end up in an optimal lineup, but the optimizer does not know that and will waste time considering him.

I don't think pruning has any impact on the quality of the optimal solution, as determined by mixed integer linear programming (MILP) techniques. The positional constraints in the model already ensure that the chosen lineup is optimal given the available projections, salaries, and lineup slots. If Jeffrey is the better play given salary, position, and projection, the optimizer will detect that and put him in your lineup.

What the original poster described as a pruning algorithm sounds like a heuristic for narrowing your own player pool. I think it is useful if you are manually building lineups but you don't need to tell the optimizer who the best plays are. It will tell you based on your projections, the salaries, and any position constraints. Using the example above, if Jeffrey is a better play, then he will end up in the optimal lineup regardless of whether you use a pruning algorithm.
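A toy check of that claim, as a rough sketch: exhaustive search over every combination stands in here for the MILP solver (it is not how the library solves the problem), and the pool tuples, the two-receiver lineup size, and the 13,100 salary cap are made-up values chosen so that Beckham and Jeffrey cannot both fit.

```python
from itertools import combinations

# (name, salary, projection); values taken from the example in this thread
POOL = [
    ("Odell Beckham", 8000, 19.0),
    ("Alshon Jeffrey", 7900, 20.0),
    ("Mohammed Sanu", 5100, 15.0),
]


def best_lineup(pool, size, cap):
    """Return the highest-projected combination of `size` players
    whose total salary fits under `cap`, or None if none fits."""
    feasible = (
        combo for combo in combinations(pool, size)
        if sum(p[1] for p in combo) <= cap
    )
    return max(feasible, key=lambda combo: sum(p[2] for p in combo), default=None)


# Hypothetical cap: Beckham + Jeffrey (15,900) is over it.
full = best_lineup(POOL, size=2, cap=13100)
without_beckham = best_lineup(
    [p for p in POOL if p[0] != "Odell Beckham"], size=2, cap=13100
)

print([p[0] for p in full])      # ['Alshon Jeffrey', 'Mohammed Sanu']
print(full == without_beckham)   # True
```

The search picks Jeffrey on its own, and removing Beckham beforehand changes nothing except the number of combinations examined.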

One reason not to prune the player pool too much is that projections are very fragile and you might eliminate good plays. In the example above, you can't have great confidence that Jeffrey will outscore Beckham, and Beckham could be a very strong play to consider. So, instead of removing players from the pool entirely before you run the optimizer, leave them in and use the randomization feature to see how the margin of error in projections changes the optimal lineup (you can specify a range for the random alteration, although the defaults are sensible).
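To see why fragile projections matter, here is a standalone sketch of the idea. It does not call pydfs-lineup-optimizer and is not a description of how its randomization feature is implemented; `jitter`, `lineup_frequencies`, the `min_dev`/`max_dev` range, and the 13,100 cap are all hypothetical. Each run nudges every projection up or down by a random percentage, re-solves a toy two-receiver lineup by exhaustive search, and counts how often each lineup comes out on top.

```python
import random
from collections import Counter
from itertools import combinations

# (name, salary, projection); values taken from the example in this thread
POOL = [
    ("Odell Beckham", 8000, 19.0),
    ("Alshon Jeffrey", 7900, 20.0),
    ("Mohammed Sanu", 5100, 15.0),
]


def jitter(pool, min_dev, max_dev, rng):
    """Scale each projection up or down by a random fraction in [min_dev, max_dev]."""
    noisy = []
    for name, salary, projection in pool:
        deviation = rng.uniform(min_dev, max_dev) * rng.choice((-1, 1))
        noisy.append((name, salary, projection * (1 + deviation)))
    return noisy


def best_names(pool, size, cap):
    """Names (sorted) of the best lineup under the cap, found by exhaustive search."""
    feasible = [c for c in combinations(pool, size) if sum(p[1] for p in c) <= cap]
    best = max(feasible, key=lambda c: sum(p[2] for p in c))
    return tuple(sorted(p[0] for p in best))


def lineup_frequencies(pool, size, cap, runs, min_dev=0.05, max_dev=0.15, seed=0):
    """Count how often each lineup is optimal across `runs` noisy re-solves."""
    rng = random.Random(seed)
    return Counter(
        best_names(jitter(pool, min_dev, max_dev, rng), size, cap)
        for _ in range(runs)
    )


freq = lineup_frequencies(POOL, size=2, cap=13100, runs=200)
for lineup, count in freq.most_common():
    print(count, lineup)
```

If Beckham had been pruned up front, the lineups in which he comes out ahead of Jeffrey could never appear in this tally.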

rogerfitz commented 5 years ago

Sorry, I thought I would see an email notification but just saw this.

@DimaKudosh Sounds good. I will update you when the PR is ready. I'll try to elaborate on Odell Beckham. If you want to find the highest-scoring possible lineup, you can't always remove a player with fewer points at a higher salary, because of the number of slots in each position group. You need to keep the top 3 plus a flex. So if enough players to fill the 3 WR slots and the 1 FLEX slot have a salary lower than or equal to Odell's and higher projected points, then you can safely exclude him from the pruned player pool and still guarantee you will find the top lineup. Most weeks this takes the ~300 players in the NFL pool down to about 50.

@edoublin Yes. I use rotogrinder's top 50 football rankings and then track past winners. Right now I'm working on creating my own rankings because RG's are too heavily skewed to high volume players.

@sansbacon Yes, exactly. The goal is just to speed up runtime in cases where it doesn't affect the desired output (the optimal lineup). Personally, I rarely prune the lineups I build to play because of the fragility of projections, but I do prune when looking for the top lineup for my recaps.

mywebpower commented 5 years ago

Off topic, great library

fivehorizons commented 5 years ago

Did a PR ever get done? @rogerfitz I would love to see your pruning in action.

justreallygood commented 5 years ago

@rogerfitz any luck with the pruning algorithm?

rogerfitz commented 5 years ago

@fivehorizons @justreallygood I just started working on this PR, but I don't think I'll be able to get it done until the end of January. In the meantime, here's the algorithm: https://nbviewer.jupyter.org/github/sportsdatadirect/python_tutorials/blob/master/Lineup%20Optimizer.ipynb (scroll to "prune"). I need to use the site constraints in pydfs-lineup-optimizer to compute the number of slots per position group. I created a preliminary branch here: https://github.com/SportsDataDirect/pydfs-lineup-optimizer/tree/feature/pruning_alg