BurntSushi / nfldb

A library to manage and update NFL data in a relational database.
The Unlicense

New NFL gamebooks code #210

Open andr3w321 opened 7 years ago

andr3w321 commented 7 years ago

This is not so much an open issue as an announcement/enhancement. I decided to open source my NFL gamebooks code, which should work well with this nfldb project. You can find it at https://github.com/andr3w321/nflgamebooks and I figured I would post here so it would get coverage. The main thing I use it for is stadium data and starting QB data, but it has plenty more info not included in this project.

BurntSushi commented 7 years ago

Cool! If you submit a PR with a link to your project in nfldb's README then I'd be fine with that!

andr3w321 commented 7 years ago

I can't seem to find the XML URLs for week 9. Has the NFL stopped posting them? Anyone have a clue? Week 8 URLs work fine: http://www.nflgsis.com/2016/Reg/08/57020/Gamebook.xml

However, even the Thursday night game isn't up: http://www.nflgsis.com/2016/Reg/09/57021/Gamebook.xml

The PDF is there though: http://www.nflgsis.com/2016/Reg/09/57021/Gamebook.pdf
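
For anyone checking other games, these links appear to follow the pattern season / season type / zero-padded week / game number. Here is a minimal Python sketch that builds both links from those parts. The function name and parameters are made up for illustration, and the pattern is inferred only from the links posted in this thread, so it may not hold for other seasons or season types:

```python
def gamebook_url(season, season_type, week, game_number, ext="xml"):
    """Build an nflgsis.com Gamebook URL (pattern inferred from this thread)."""
    # The week is zero-padded to two digits, e.g. 9 -> "09".
    return "http://www.nflgsis.com/{0}/{1}/{2:02d}/{3}/Gamebook.{4}".format(
        season, season_type, week, game_number, ext)


xml_url = gamebook_url(2016, "Reg", 9, 57021)         # the XML that is missing
pdf_url = gamebook_url(2016, "Reg", 9, 57021, "pdf")  # the PDF that is still posted
print(xml_url)
print(pdf_url)
```

Fetching each URL and comparing the HTTP status codes would be one way to confirm which formats are actually being served for a given week.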

tsteussie commented 7 years ago

@andr3w321 - I am also seeing the same issue. I tried signing into nflgsis under multiple usernames, thinking there may be a permissions issue, and continue to have the same problem. Wondering if this may be due to a licensing agreement the NFL signed with sportradar (exclusive access for a certain time period after the game), but it does not make sense that it starts happening week 9 of the season. I have noticed that nflgsis just posted the detailed lineup information (see attached screenshot) for the Rams and a few other teams. Possibly an error within a larger update?

[attached screenshot: capture]

Sajorin commented 7 years ago

Hello, I have a project very similar to yours. I am experiencing the same issue as of today. Do you have any idea if it has happened before?

andr3w321 commented 7 years ago

Not that I'm aware of. It's clearly affecting a lot of people's code though. Even Football Outsiders is reporting technical difficulties this week, which I believe are due to this issue.

iliketowel commented 7 years ago

For what it's worth, Football Outsiders posted the following note on their page:

Update: Unfortunately, the technical issues with NFL feeds mean we will not be able to update drive stats or pace stats for an unknown period of time. We are working on a solution to the problem, but until then, those pages will not be updated.

I would think they are paying for the feed, and it would be odd for the NFL to stop in the middle of the season without warning.

Sajorin commented 7 years ago

I agree, it is so strange that this is happening mid-season.

tsteussie commented 7 years ago

I have spent a fair amount of time searching the web and there is no mention of this issue. Are you aware if the NFL JSON feed is still functioning?

tsteussie commented 7 years ago

Not since I began my project in 2013. There have been a few broken links that have been resolved within 24-48 hours. I have a few connections at the NFL. If this issue is not resolved by next week, I will make some calls to get more information.

Todd

ochawkeye commented 7 years ago

"Are you aware if the NFL JSON feed is still functioning?"

If you're referring to the JSON from the Game Center that powers nfldb, then yep, that is still working through all games to date.

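import nfldb

# Connect using the local nfldb configuration and query week 9 of 2016.
db = nfldb.connect()
q = nfldb.Query(db)
q.game(season_year=2016, week=9, season_type='Regular')

# Print one summary line per game found in the database.
for game in q.as_games():
    print(game)

Output:
Regular 2016 week 9 on 11/03 at 07:25PM, ATL (43) at TB (28)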
Regular 2016 week 9 on 11/06 at 07:30PM, DEN (20) at OAK (30)
Regular 2016 week 9 on 11/06 at 12:00PM, PIT (14) at BAL (21)
Regular 2016 week 9 on 11/06 at 12:00PM, DAL (35) at CLE (10)
Regular 2016 week 9 on 11/06 at 12:00PM, JAC (14) at KC (19)
Regular 2016 week 9 on 11/06 at 12:00PM, NYJ (23) at MIA (27)
Regular 2016 week 9 on 11/06 at 12:00PM, DET (22) at MIN (16)
Regular 2016 week 9 on 11/06 at 12:00PM, PHI (23) at NYG (28)
Regular 2016 week 9 on 11/06 at 03:05PM, CAR (13) at LA (10)
Regular 2016 week 9 on 11/06 at 03:05PM, NO (41) at SF (23)
Regular 2016 week 9 on 11/06 at 03:25PM, IND (31) at GB (26)
Regular 2016 week 9 on 11/06 at 03:25PM, TEN (35) at SD (43)
Regular 2016 week 9 on 11/07 at 07:30PM, BUF (25) at SEA (31)

Obligatory "SKOL Vikings"

tsteussie commented 7 years ago

Thanks! SKOL Vikings

Sajorin commented 7 years ago

Thank you so much Todd

andr3w321 commented 7 years ago

I still don't see any XML gamebooks for week 10. All the new PDFs are posted like last week though.
http://www.nflgsis.com/2016/Reg/10/57035/Gamebook.xml
http://www.nflgsis.com/2016/Reg/10/57035/Gamebook.pdf

tsteussie commented 7 years ago

I see the same thing. I will reach out this week to see when they will be restoring the links.

Todd

andr3w321 commented 7 years ago

Any news Todd? It would be nice to know if they stopped posting on purpose or if it's a temporary bug that may one day be fixed.

tsteussie commented 7 years ago

Sorry, I have not had a chance to inquire. I will reach out to a couple of contacts next week.

Todd

danabrey commented 6 years ago

Did you ever get any insight on this @tsteussie ? I've been trying to parse Gamebook PDFs with (as you'd expect) limited results. That XML would be a godsend!

tsteussie commented 6 years ago

Yes, I did. The commercial license with Sportradar restricts the data that can be provided on the NFLGSIS site. Computer-readable files will no longer be available on this site.

Depending on how determined you are, I have created a template for extracting the play-by-play data and stats from the PSR PDF files. It still needs quite a bit of data parsing, but all the information is there. I use the program PDF2XL to convert to XLS, then do the rest of the processing in Python, but Excel can be used as well. The toughest part is matching the text descriptions of players (within the play-by-play data) with the player names listed in the JSON feed (team rosters).
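
A rough Python sketch of that name-matching step, assuming the play-by-play text refers to players by first initial plus last name (e.g. "J.Brown") and that the roster supplies full names per team. The roster rows, play text, and function names are made up for illustration; this is not the template described above:

```python
import re


def short_name(full_name):
    """Turn 'Larry Fitzgerald' into 'L.Fitzgerald' (first initial + rest)."""
    first, _, last = full_name.strip().partition(" ")
    return "{0}.{1}".format(first[0], last)


def build_index(roster):
    """Map (team, short name) -> list of full names on the roster."""
    index = {}
    for team, full_name in roster:
        index.setdefault((team, short_name(full_name)), []).append(full_name)
    return index


# Matches tokens such as "L.Fitzgerald" inside a play description.
NAME_RE = re.compile(r"\b[A-Z]\.[A-Z][A-Za-z'\-]+")


def players_in_play(team, description, index):
    """Return the roster names for every short name found in the text."""
    found = []
    for token in NAME_RE.findall(description):
        found.extend(index.get((team, token), []))
    return found


# Illustrative roster rows and play text, not taken from a real feed.
roster = [("ARI", "Carson Palmer"), ("ARI", "Larry Fitzgerald")]
index = build_index(roster)
play = "(Shotgun) C.Palmer pass short left to L.Fitzgerald for 8 yards"
print(players_in_play("ARI", play, index))
```

Keeping a list per key, rather than a single name, means a short name shared by two teammates returns both candidates instead of silently picking one.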

Best of luck to you!

Todd

danabrey commented 6 years ago

Thanks a lot for that, @tsteussie. I'll investigate the possibility of getting what I need from parsing the PDFs.

I'm able to get the play-by-play data and stats that I need from a combination of the Game Center JSON files (e.g. http://www.nfl.com/liveupdate/game-center/2017091701/2017091701_gtd.json) and another fantasy API, but what I'm really interested in getting hold of is the snap count data from the final section of the PDFs.

I note that earlier in this issue, it was mentioned that Football Outsiders had hit problems in updating their snap count data due to the disappearance of the XML version of the Gamebook. I wonder how they are still able to update their data every Tuesday (http://www.footballoutsiders.com/stats/snapcounts)! It surely can't be a manual process; perhaps they're parsing the PDFs too, or they have an alternative private API source.

tsteussie commented 6 years ago

The most difficult part of working with the snap count data is the small number of games (mainly preseason games) that list two players with the same text description and position. For example, in 2016 the Arizona Cardinals had multiple games with two players listed under the same name and position.

I don't think the only way of handling these is manually.

tsteussie commented 6 years ago

I believe that information is available for purchase, but I’m not positive.

JimHewitt commented 6 years ago

@tsteussie Regarding your comment about the "J.Brown" issue: the JSON play-by-play data includes the playerid of each player involved in every play. (There are actually two playerids that NFL.com uses, but that is a different story.) For example, the two ARI J.Browns below come from a SQL Server database I created by scraping NFL.com by playerid. (I also have all of the JSON play-by-play data since 2009 in a SQL Server database.)

00-0030300 ARI J.Brown wr Jaron Brown
00-0031051 ARI J.Brown wr John Brown
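
To illustrate why the player IDs matter, here is a small Python sketch that groups the two players above by team, short name, and position, then flags any key that maps to more than one ID. The tuple layout and function name are assumptions for illustration, not the schema of the database mentioned above:

```python
from collections import defaultdict

# (player id, team, short name, position, full name), copied from the rows above.
players = [
    ("00-0030300", "ARI", "J.Brown", "wr", "Jaron Brown"),
    ("00-0031051", "ARI", "J.Brown", "wr", "John Brown"),
]


def ambiguous_keys(rows):
    """Return {(team, short name, position): [player ids]} for keys shared by 2+ players."""
    groups = defaultdict(list)
    for player_id, team, short, position, _full in rows:
        groups[(team, short, position)].append(player_id)
    return {key: ids for key, ids in groups.items() if len(ids) > 1}


for key, ids in ambiguous_keys(players).items():
    print(key, ids)
```

Any key flagged this way needs information beyond team, name, and position to resolve to a single player.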

danabrey commented 6 years ago

@JimHewitt I believe that's in reference to the snap count 'data' available in the Gamebook PDFs, which only list players by name.

tsteussie commented 6 years ago

Dan's comment is correct.

JimHewitt commented 6 years ago

@danabrey OK, I see the issue. In this case you would need to know that John Brown played special teams and Jaron didn't. However, that doesn't really solve the larger issue.