Closed msussman closed 6 years ago
@msussman I'll poke around and see what data MLB provides.
I'm unable to find anything in the XML files around attendance, though the venue name is widely available.
I'm digging around in their new stats API statsapi.mlb.com/docs but the lack of documentation isn't getting me very far.
I did find attendance in their live game endpoint, buried in an info property. Here's a Dodgers @ Marlins game from 2016:
"info": [
{
"label": "Game Scores",
"value": "Kershaw 41; Fernandez 84."
},
{
"label": "HBP",
"value": "Suzuki, I (by Fields, J)."
},
{
"label": "Pitches-strikes",
"value": "Kershaw 66-46; Coleman 10-6; Norris, B 15-7; Howell 1-1; Fields, J 9-3; Avilan 20-12; Ravin 17-11; Liberatore 7-5; Fernandez 102-66; Barraclough 14-11; Rodney 16-11; Ramos 11-8."
},
{
"label": "Groundouts-flyouts",
"value": "Kershaw 3-1; Coleman 1-1; Norris, B 1-1; Howell 0-0; Fields, J 0-0; Avilan 3-0; Ravin 1-0; Liberatore 0-2; Fernandez 4-1; Barraclough 2-0; Rodney 0-0; Ramos 0-1."
},
{
"label": "Batters faced",
"value": "Kershaw 14; Coleman 3; Norris, B 4; Howell 1; Fields, J 2; Avilan 5; Ravin 3; Liberatore 3; Fernandez 27; Barraclough 4; Rodney 3; Ramos 3."
},
{
"label": "Inherited runners-scored",
"value": "Howell 2-1; Fields, J 2-1; Avilan 3-0; Ramos 2-0."
},
{
"label": "Umpires",
"value": "HP: Brian Knight. 1B: Tony Randazzo. 2B: Bill Miller. 3B: Tom Woodring."
},
{
"label": "Weather",
"value": "77 degrees, roof closed."
},
{
"label": "Wind",
"value": "0 mph, None."
},
{
"label": "First pitch",
"value": "7:11 PM."
},
{
"label": "T",
"value": "3:08."
},
{
"label": "Att",
"value": "22,940."
},
{
"label": "Venue",
"value": "Marlins Park"
},
{
"label": "September 9, 2016"
}
],
It might be available through some smaller endpoints as this one is ENORMOUS, but would require some more research. As it is, I'm thinking this would be a good feature for mlbgame v3 (Stats API usage) since this info doesn't appear to be available in their XML files.
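As a rough sketch of how the attendance could be pulled out of that info list, assuming it has already been decoded into Python dicts shaped like the JSON above (the function name attendance_from_info is made up for illustration and is not part of mlbgame or the Stats API):

```python
def attendance_from_info(info):
    """Return attendance as an int from a list of label/value dicts, or None."""
    for entry in info:
        if entry.get("label") == "Att":
            # Values look like "22,940." so keep only the digits.
            digits = "".join(ch for ch in entry.get("value", "") if ch.isdigit())
            return int(digits) if digits else None
    return None


# Trimmed copy of the "info" list shown above.
info = [
    {"label": "T", "value": "3:08."},
    {"label": "Att", "value": "22,940."},
    {"label": "Venue", "value": "Marlins Park"},
    {"label": "September 9, 2016"},  # the last entry has no "value" key
]

print(attendance_from_info(info))
```

Matching on the "Att" label rather than on list position seems safer here, since the number of entries in the list varies from game to game.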
It's available through rawboxscore.xml. Here's a random example: http://gd2.mlb.com/components/game/mlb/year_2017/month_04/day_16/gid_2017_04_16_milmlb_cinmlb_1/rawboxscore.xml
<boxscore wind="16 mph, R to L" game_type="R" venue_name="Great American Ball Park" attendance="12,625" home_sport_code="mlb" official_scorer="Mike Cameron" game_pk="490277" date="April 16, 2017" status_ind="F" home_league_id="104" elapsed_time="2:56" game_id="2017/04/16/milmlb-cinmlb-1" venue_id="2602" start_time="1:10 PM" weather="73 degrees, overcast" gameday_sw="P">
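For reference, here is a minimal sketch of reading those top-level attributes with Python's standard library. The XML string is a trimmed copy of the element above with a closing tag added so it parses on its own; boxscore_attributes is a hypothetical helper name, not an mlbgame function.

```python
import xml.etree.ElementTree as ET

# Trimmed copy of the <boxscore> element above, closed so it is well-formed.
RAW = (
    '<boxscore wind="16 mph, R to L" venue_name="Great American Ball Park" '
    'attendance="12,625" elapsed_time="2:56" start_time="1:10 PM" '
    'weather="73 degrees, overcast"></boxscore>'
)


def boxscore_attributes(xml_text):
    """Return the root element's attributes as a plain dict of strings."""
    root = ET.fromstring(xml_text)
    return dict(root.attrib)


attrs = boxscore_attributes(RAW)
print(attrs["attendance"])  # still a string, comma included
print(attrs["venue_name"])
```

Note that every value comes back as a string, so attendance would still need converting if a number is wanted.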
@trevor-viljoen If you have time, could you go ahead and add that in a PR. I don't really have much time but I could get to it if you can't.
@panzarino I'll try to find some time to do it this week. I'll also take a look at rawboxscore vs boxscore and see how different they are from each other. The fix might be as simple as using rawboxscore instead of boxscore.
Is anyone working on this? I've done some work with MLB's rawboxscore.xml and boxscore.xml in another project. I'd be interested in doing this.
Have at it, I could use another release soon.
Which class should the attendance attribute go in? Each of the classes is associated with a corresponding XML file in the MLB API: Overview goes with linescore.xml, GameBoxScore goes with boxscore.xml, and GameScoreboard goes with scoreboard.xml. I think attendance data would fit best in the Overview class. Where were you expecting it to go?
@Pertempto I think that it would fit well with the other stats provided by the Overview class.
Great! I'm working on it now, and I'll probably have a pull request in the next few hours.
I've implemented the attendance feature, but I was wondering if I should add all the top-level attributes from rawboxscore.xml to the game overviews. Here is an example. This would add useful data like the weather, wind, elapsed time, and exact start time. The only problem I found is that rawboxscore.xml includes a venue_name attribute, while the Overview class already has a venue attribute. Is it bad to have two attributes with the same value? Maybe I should add all the attributes from rawboxscore.xml and remove the venue_name attribute.
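To make the duplicate-attribute question concrete, here is a small sketch of one way the two sources could be combined. It uses plain dicts rather than mlbgame's actual Overview class, and merge_overview plus the sample "status" key are purely hypothetical. Existing overview values are kept when a key appears in both, and venue_name is simply carried along next to venue.

```python
def merge_overview(overview, raw_attrs):
    """Return a new dict with raw_attrs added; existing overview keys win."""
    merged = dict(raw_attrs)   # start from the rawboxscore attributes
    merged.update(overview)    # overview values take precedence on conflicts
    return merged


# Hypothetical existing overview data (keys chosen for illustration only).
overview = {"venue": "Great American Ball Park", "status": "Final"}

# Attributes taken from the rawboxscore.xml example earlier in the thread.
raw_attrs = {
    "venue_name": "Great American Ball Park",
    "attendance": "12,625",
    "weather": "73 degrees, overcast",
}

merged = merge_overview(overview, raw_attrs)
print(sorted(merged))
```

Keeping both venue and venue_name costs little and avoids special-casing a single attribute.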
@Pertempto It would be great if you could add that. It is fine to have duplicate attributes, just report everything.
@panzarino Where am I supposed to "report everything"? Is it as simple as adding the new attributes to the Overview docstring or do I need to document these new attributes somewhere else as well?
I've created a pull request with all the new attributes. #71
Merged
Thanks all. When are you planning on releasing a new version?
I could definitely use a 2.5. Only missing piece from the LED board before the regular season starts is the probable starters which is just waiting for a new tag.
I'll work on updating a few things and hopefully pushing out a new release in the coming days.
Sounds great! Looking forward to checking out the update.
@msussman Sorry to disappoint, but I just looked at my schedule and I have almost no time this week so I'll have to push it back to sometime next week.
No worries. I won't need it in earnest until next month.
@panzarino, I just updated to the new release and am finding some issues with the attendance implementation.
Mixed data types: When attendance data is missing due to a doubleheader, the attendance attribute is set to 0 (e.g. game id 2011/07/02/pitmlb-wasmlb-1, 'attendance': 0), but when it's populated it's a string (e.g. '2011/07/02/pitmlb-wasmlb-2', 'attendance': '39,636').
Game IDs not found with the Game_Overview method: I had previously pulled all games for the Nationals 2010-2018 without issue, but with the new release I'm getting quite a few games where the Game_Overview method returns this error: "ValueError: Could not find a game with that id." Example: 2010_04_03_bosmlb_wasmlb_1
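Until the mixed types are sorted out in the library, a caller-side workaround could look something like the sketch below. normalize_attendance is a hypothetical helper, not part of mlbgame, and it assumes that a value of 0 means "missing" rather than a real attendance of zero.

```python
def normalize_attendance(value):
    """Coerce 0, None, '' or a string like '39,636' to an int, or None if unknown."""
    if isinstance(value, int):
        count = value
    else:
        # Keep only the digits so thousands separators are dropped.
        digits = "".join(ch for ch in str(value or "") if ch.isdigit())
        count = int(digits) if digits else 0
    # Assumption: 0 is the placeholder used when attendance is missing.
    return count if count > 0 else None


# The two values reported above for the 2011/07/02 doubleheader.
print(normalize_attendance(0))
print(normalize_attendance("39,636"))
```

Returning None for the missing case keeps it distinguishable from a real count when averaging or summing later.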
@Pertempto could you look into this?
@panzarino I just updated my local repo. Hopefully I'll be able to look at it next week.
Sorry, but I was busy last week and wasn't able to work on this. Someone else might want to do this because I can't promise that I'll have the time.
Wanted to reach out to see if attendance data was available through the API