BurntSushi / nflgame

An API to retrieve and read NFL Game Center JSON data. It can work with real-time data, which can be used for fantasy football.
http://pdoc.burntsushi.net/nflgame
The Unlicense

Data Integrity Issues #352

Closed derek-adair closed 4 years ago

derek-adair commented 6 years ago

I'm filing this issue because I see no concerted effort to alleviate the data pulling issues, in particular around how update_schedule.py works: the schedule data is not automatically up to date upon install.

Just glancing at the issues, here are the ones filed because the data isn't automatically updated:

And a WHOOOLE bunch of closed issues

What I would like to see:

I'm sure this list will grow. I'm just not sure what else could be done to improve this, or at least to prevent the initial confusion for people installing this for the first time.

Any thoughts?

BurntSushi commented 6 years ago

because I see no concerted effort to alleviate the data pulling issues

There is no concerted effort to do anything. This project is basically unmaintained at this point because I don't have the time or interest to continue maintaining it. And yes, I should add a note to the README but haven't done so.

derek-adair commented 6 years ago

It's all good @BurntSushi, you should know we all adore your work. This project has become a cornerstone for an app I've built, so I'd like to help.

derek-adair commented 6 years ago

Before the next season I will need some way of updating schedule data automatically, so if you have any tips on how I might achieve this, that'd be awesome.

BurntSushi commented 6 years ago

I don't have any specific ideas. I think it's just a matter of figuring out how to manage internal state ("what I know about the world at some particular time") with external state ("what I could know about the world at some particular time"). And in particular, making sure you don't expend gratuitous resources in managing that state. For example, the player updating script goes through numerous heuristics to avoid sending a lot of additional HTTP requests to NFL.com. The initial crawl will send a lot, but future updates are incremental.

As a clear example, "updating the schedule" should not involve "sending dozens of HTTP requests to download the entire schedule for the past several years." That might seem obvious, but it's just an example. :)
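
To make the internal/external state idea concrete, here is a minimal sketch of an incremental schedule update. It is not code from nflgame or from update_schedule.py. The JSON cache layout and the names week_key, weeks_to_fetch, update_schedule and fetch_week are all assumptions made for illustration, and fetch_week stands in for whatever actually performs the HTTP request. The rule it assumes is that a past week already in the cache never changes, while the current and future weeks may still change.

```python
import json
from pathlib import Path


def week_key(year, week):
    """Build the cache key for one week, e.g. '2018-03' (hypothetical format)."""
    return f"{year}-{week:02d}"


def weeks_to_fetch(cache, all_weeks, current):
    """Compare internal state (the cache) with what could have changed.

    A week is fetched only if it was never stored, or if it is the current
    week or later and so might still change. Past weeks that are already
    cached are assumed to be final and are skipped.
    """
    needed = []
    for year, week in all_weeks:
        known = week_key(year, week) in cache
        if not known or (year, week) >= current:
            needed.append((year, week))
    return needed


def update_schedule(cache_path, all_weeks, current, fetch_week):
    """Update the on-disk cache and return how many weeks were fetched.

    fetch_week(year, week) is a caller-supplied function (an assumption of
    this sketch) that returns a JSON-serializable list of games.
    """
    path = Path(cache_path)
    cache = json.loads(path.read_text()) if path.exists() else {}
    todo = weeks_to_fetch(cache, all_weeks, current)
    for year, week in todo:
        cache[week_key(year, week)] = fetch_week(year, week)
    path.write_text(json.dumps(cache, indent=2, sort_keys=True))
    return len(todo)
```

With this shape, the first run on an empty cache fetches every week, and later runs only fetch the current week onward, so the number of requests stays small no matter how many past seasons are stored.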