wcrasta / ESPN-Fantasy-Basketball

Free Fantasy Basketball tools for ESPN Leagues
https://espnfantasy.warrencrasta.com
MIT License
32 stars, 15 forks

added season strength of schedule rankings #12

Closed richiehu17 closed 4 years ago

richiehu17 commented 4 years ago

For each player, this averages their weekly opponent's win/loss/draw record against the other teams over the course of the season to produce a strength of schedule comparison. It has to get the scoreboard and calculate win/loss/draw for each player every week, so it takes a while to run.

To find the specific player h2h matchups, it relies on the fact that the scoreboard teams matrix already has opponents paired up in order.

The numbers are probably off for leagues with an odd number of teams, but oh well.
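A minimal sketch of that pairing idea, assuming the scoreboard yields a flat list of team names in which entries 0 and 1 play each other, 2 and 3 play each other, and so on. The function name `pair_opponents` and the team names are made up for illustration and are not taken from the PR. With an odd number of entries, the last team is simply left without an opponent here, which is one way the odd-league numbers could end up off.

```python
def pair_opponents(teams):
    """Map each team to its opponent, assuming adjacent entries are paired."""
    opponents = {}
    # Step through the list two at a time: indices (0, 1), (2, 3), ...
    for i in range(0, len(teams) - 1, 2):
        first, second = teams[i], teams[i + 1]
        opponents[first] = second
        opponents[second] = first
    return opponents


# Hypothetical scoreboard order with an odd number of teams.
scoreboard = ["Team A", "Team B", "Team C", "Team D", "Team E"]
print(pair_opponents(scoreboard))  # "Team E" has no entry
```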

wcrasta commented 4 years ago

Hey @richiehu17,

Thanks for your PR! It's exciting that others are contributing to this project 😄 . I will review your PR this weekend when I have more free time.

wcrasta commented 4 years ago

Can you help me understand how this works? There's quite a bit of code, and I want to make sure I understand the concept before I dive into the code.

Let's use this league as an example: https://fantasy.espn.com/basketball/league/scoreboard?leagueId=633975 Here's the results of the SOS calculation: https://i.imgur.com/0hFydQs.png

If we look at Fantastic Mr. Fox, I'm not understanding how the "average" opponent wins can be 12 when no team in the league has more than 7 wins. I would've thought that:

Average Opponent Wins = (Week 1 Opponent Wins + Week 2 Opponent Wins + ...)/(# of weeks). For example, Fantastic Mr. Fox played Rising Stars week 1. Rising Stars has 4 wins = Week 1 Opponent Wins.

richiehu17 commented 4 years ago

It doesn't use the actual team win-loss records for its calculations. It uses the hypothetical results from the weekly matchups to calculate the values. So for Fantastic Mr. Fox, it means that on average, his weekly opponent beats 12 out of however many other teams there are. I thought it would give the most accurate assessment of strength of schedule, akin to a points-against statistic.

More specifically, it's calculated as:

((Week 1 opponent's win/loss/draw against all other teams from weekly matchups) + (Week 2 opponent's win/loss/draw against all other teams from weekly matchups) + ...) / (number of weeks)
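As a rough sketch of that formula (not the PR's actual code), suppose each week is already reduced to two dictionaries: one mapping every team to its hypothetical (wins, losses, draws) record against all other teams that week, and one mapping every team to its actual opponent. The function name `season_sos`, the data layout, and the sample numbers are all assumptions made for illustration.

```python
from collections import defaultdict


def season_sos(weekly_records, weekly_opponents):
    """Average each team's weekly opponent's (W, L, D) record over the season.

    weekly_records:   one dict per week, team -> (wins, losses, draws)
                      against all other teams that week (hypothetical layout).
    weekly_opponents: one dict per week, team -> that week's actual opponent.
    """
    totals = defaultdict(lambda: [0, 0, 0])
    weeks_played = defaultdict(int)

    for records, opponents in zip(weekly_records, weekly_opponents):
        for team, opponent in opponents.items():
            # Add the opponent's record for this week to the team's running total.
            for i, value in enumerate(records[opponent]):
                totals[team][i] += value
            weeks_played[team] += 1

    # Divide each running total by the number of weeks that team played.
    return {
        team: tuple(value / weeks_played[team] for value in totals[team])
        for team in totals
    }


# Made-up two-week, four-team example.
records = [
    {"A": (3, 0, 0), "B": (2, 1, 0), "C": (1, 2, 0), "D": (0, 3, 0)},
    {"A": (1, 2, 0), "B": (3, 0, 0), "C": (0, 3, 0), "D": (2, 1, 0)},
]
opponents = [
    {"A": "B", "B": "A", "C": "D", "D": "C"},
    {"A": "C", "C": "A", "B": "D", "D": "B"},
]
sos = season_sos(records, opponents)
print(sos["A"])  # A faced B (2-1-0), then C (0-3-0)
```

Counting weeks per team, rather than dividing by a single global week count, keeps the average sensible if a team has no listed opponent in some week.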

wcrasta commented 4 years ago

I think I understand the concept now, thanks for clearing it up. If it's what I think it is... that's AWESOME. However, the numbers that I computed by hand are slightly different from the ones the page outputs.

Calculating SOS for Fantastic Mr. Fox:
Rising Stars: 13-6-0
D Breezy: 17-1-1
Team Carlson: 14-4-1
Toon Squad: 13-6-0
Kobe Wan Kenobi: 16-3-0
Shovelface 3000: 3-16-0
Hennything is Possible: 1-18-0
Ohio NuSxnce: 18-0-1
phi slama jama: 12-7-0

I double-checked the records of the opponents. The total # of wins is 107. The # of weeks is 9. 107/9 = 11.88888..., not 12. Wondering where the discrepancy is.
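The hand calculation can be reproduced with a few lines of Python. The records are the ones listed above; the variable names are only illustrative.

```python
# Weekly opponents of Fantastic Mr. Fox and their (wins, losses, draws)
# against all other teams in that week, as listed in the comment.
opponent_records = {
    "Rising Stars": (13, 6, 0),
    "D Breezy": (17, 1, 1),
    "Team Carlson": (14, 4, 1),
    "Toon Squad": (13, 6, 0),
    "Kobe Wan Kenobi": (16, 3, 0),
    "Shovelface 3000": (3, 16, 0),
    "Hennything is Possible": (1, 18, 0),
    "Ohio NuSxnce": (18, 0, 1),
    "phi slama jama": (12, 7, 0),
}

total_wins = sum(wins for wins, _, _ in opponent_records.values())
weeks = len(opponent_records)
average_wins = total_wins / weeks

print(total_wins)              # 107
print(round(average_wins, 2))  # 11.89
```

Every record sums to 19, which is consistent with each opponent being compared against 19 other teams per week.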

wcrasta commented 4 years ago

While it's an awesome concept, one concern I have with this implementation is the # of Selenium web drivers it's gonna spin up. Based on my understanding, if we're on Week 18, the page is gonna take 2x longer to load than if we're on Week 9. Also, Selenium tends to be pretty memory-heavy.

The application runs on a T2 Micro EC2 instance on AWS. I have a feeling that if even 2 users open the SOS page at around the same time, the whole application will crash. If my intuition is correct, the page will always time out. We can always give it a try though. This is the downside of web scraping on demand: it's simply not scalable.

richiehu17 commented 4 years ago

For the numbers, it seemed to be right the few times I went through and checked everything, but my best guess for the discrepancy would be the 5PM/6PM EST games today changing some of this week's stats from your initial request. If you ran the program and still had an average of 12 at the same time you did it manually, then I'll take a closer look later today or tomorrow.

I'm not too familiar with Selenium, but the hope was that it would just be the equivalent of running some number of weekly matchups back to back. You're right about the load times as well. Didn't do any memory analysis either. If you're willing to try it out live, I'm all for it too. After all it's your website 😂

wcrasta commented 4 years ago

I did the calculation again this morning at a time when no games were going on. I came up with the same result as the SOS page, which is great.

Later on, I'll look into what would happen to memory usage as this page runs and let you know. I really like this feature and respect the effort you put into developing it, so I would love to deploy it if it doesn't cause the whole application to become unresponsive.

wcrasta commented 4 years ago

I've deployed the code at: http://fantasy.warrencrasta.com/season_sos?leagueId=633975 (you have to go to that link directly, I removed it from the navigation). I notice that:

Headless ChromeDriver is known to be unpredictable with memory and CPU usage. This is all running on a T2 Micro instance, but even scaling out horizontally or up vertically with more or bigger instances would not solve the issue. Caching the weekly matchup results might help, but it's simply too much effort for this small project. As I mentioned before, something that scrapes on demand with Chrome WebDriver is not scalable.
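For what the caching idea could look like, here is a minimal in-memory sketch. It assumes results for weeks that have already finished never change, so only the in-progress week is re-fetched. `WeeklyResultsCache` and the `fetch` callable are hypothetical names, not part of the project; `fetch` stands in for whatever function scrapes one week's matchups.

```python
class WeeklyResultsCache:
    """Cache results for completed weeks so each is scraped at most once."""

    def __init__(self, fetch):
        # fetch(league_id, week) is a placeholder for the expensive scrape.
        self._fetch = fetch
        self._store = {}

    def get(self, league_id, week, current_week):
        key = (league_id, week)
        completed = week < current_week
        if completed and key in self._store:
            return self._store[key]
        result = self._fetch(league_id, week)
        if completed:
            # Only finished weeks are stored; the current week can still change.
            self._store[key] = result
        return result


# Usage with a stand-in fetch function that records how often it is called.
fetch_log = []

def fake_fetch(league_id, week):
    fetch_log.append((league_id, week))
    return {"league": league_id, "week": week}

cache = WeeklyResultsCache(fake_fetch)
cache.get(633975, 1, current_week=3)
cache.get(633975, 1, current_week=3)  # served from the cache
cache.get(633975, 3, current_week=3)  # current week, always fetched
print(len(fetch_log))  # 2
```

An in-memory dictionary like this is lost on restart and is not shared between worker processes, so it would only reduce, not remove, the number of scrapes.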

All that being said, unless there's a way to do this without invoking the Weekly Matchups several times, I think we can both agree that this should not be deployed on the http://fantasy.warrencrasta.com/ server. However, I do think this is a really awesome feature and one that I am definitely going to use locally for seasons to come. For that reason, I think the best approach might be for me to put your code in another branch and to call attention to it in the README file. That way, those who are technically proficient enough to run this project locally can take advantage of this feature.

Again, I thank you for the time and effort that you spent in developing this feature. It was unexpected, but it truly makes me happy to see others contribute! Let me know your thoughts. I'll remove that link once you've had a chance to test it out yourself.

richiehu17 commented 4 years ago

I got the timeout as well. It sucks that it can't run live. I was really hoping that it would just act like multiple consecutive fetches.

Putting it in another branch with a brief shout-out in the README sounds good and is probably the best option. I'm glad my fantasy-salt-induced coding gave you some happiness though. Some pics of my 12-man league. Unfortunately I'm Double 🅱️s https://imgur.com/a/uuLRtzH

wcrasta commented 4 years ago

Check out my latest commit -- feel free to change the attribution in the credits and make a PR if you would like to be recognized a different way.

Also, it looks like you have code for "Overall Performance". You are more than welcome to add a PR into the more-features branch if you'd like. It would be interesting to compare the Overall Performance to the Standings!

wcrasta commented 4 years ago

@richiehu17 Just thought you'd like to know: I recently posted about this project and about your contributions on Reddit: https://www.reddit.com/r/fantasybball/comments/eimnj6/updated_my_application_for_analyzing_your_fantasy/.

wcrasta commented 4 years ago

All the attention the website received after I posted it on Reddit reminded me of how unscalable the current code is and how I hate sysadmin/devops stuff lol. It was unusable yesterday and I spent a lot of time configuring things in AWS. About a year ago, I started rewriting all the code to use the ESPN (secret) APIs instead of web scraping through Selenium, which would make the code thousands of times more efficient, but I stopped and can't remember where I put the code 😩. Oh well... I probably won't do too much work on this since my league is now on Yahoo. I also prefer Java Spring Boot to Python Flask these days.

Anyway, you might be better at deploying code and scaling it cost-effectively than I am. The code is all there; you are welcome to try and deploy it yourself with your Strength of Schedule implementation if you'd like. I'd gladly redirect my domain to your server if you could get it running. Just a thought, I don't expect you to do any of this lol.