keberwein / mlbgameday

Multi-core processing of 'Gameday' data from Major League Baseball Advanced Media. Additional tools to parallelize large data sets and write them to a database.

No Team ID in get_payload function #15

Closed atroiano closed 5 years ago

atroiano commented 5 years ago

Can't seem to find any information about the teams playing in the get_payload function call. Would be nice to get some ID of which player is on what team.

atroiano commented 5 years ago

I made a PR that will do this.

keberwein commented 5 years ago

By the way, this is possible with a simple join. I'm reviewing the PR now. I'm a little concerned about removing the large data set warning, so I'm going to do some testing today on one of my smaller machines.

Thanks a lot BTW!

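library(mlbgameday)
library(dplyr)

# Inning-level data for the date range, including the atbat table used below.
innings <- get_payload("2019-04-09", "2019-04-09")

# Linescore data, whose game table carries the team columns for each game.
team <- get_payload("2019-04-09", "2019-04-09", dataset = "linescore")

# Attach the team columns to every at-bat by joining on the game ID.
final <- team$game %>%
  select(gameday_link, away_team_city, away_team_name, away_division, away_league_id,
         home_team_city, home_team_name, home_division, home_league_id) %>%
  left_join(innings$atbat, by = "gameday_link")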
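The join attaches both teams to every at-bat, but it does not say which of the two the batter plays for. One way to derive that, sketched here on toy data, relies on the visiting team batting in the top half of each inning. The `inning_side` and `batter_name` columns and all values below are assumptions for illustration, not confirmed mlbgameday output:

```r
library(dplyr)

# Toy stand-in for the joined result. Only gameday_link, away_team_name and
# home_team_name appear in the join above; the other columns are assumed.
final_toy <- data.frame(
  gameday_link   = "gid_example",
  away_team_name = "Away Club",
  home_team_name = "Home Club",
  inning_side    = c("top", "bottom"),       # assumed column name and values
  batter_name    = c("Batter A", "Batter B"), # placeholder names
  stringsAsFactors = FALSE
)

# The away team bats in the top half, so the home team is pitching then.
with_teams <- final_toy %>%
  mutate(
    batting_team  = if_else(inning_side == "top", away_team_name, home_team_name),
    pitching_team = if_else(inning_side == "top", home_team_name, away_team_name)
  )

print(with_teams)
```

If the real atbat table labels the half-inning differently, only the `inning_side == "top"` condition needs to change.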

atroiano commented 5 years ago

Ok cool, I didn't realize that. Maybe we add a flag to disable the message? I have an AWS environment that has plenty of RAM to load into memory and I can't figure out how to scrape without having to stay around and press 1.
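A rough sketch of what such a flag might look like. The function `confirm_large_download()`, its `threshold` and the `ask` argument are all hypothetical names made up for illustration; this is not how mlbgameday implements its warning:

```r
# Hypothetical helper: none of these names come from mlbgameday itself.
confirm_large_download <- function(n_games, threshold = 100, ask = interactive()) {
  # Small requests, or callers that opted out of prompting, proceed directly.
  if (n_games <= threshold || !ask) {
    return(TRUE)
  }
  # utils::menu() returns the index of the chosen item (0 if the user exits).
  choice <- utils::menu(
    c("Yes", "No"),
    title = sprintf("About to download %d games. Continue?", n_games)
  )
  choice == 1
}

# On an unattended server, skip the prompt explicitly.
proceed <- confirm_large_download(2430, ask = FALSE)
```

Defaulting `ask` to `interactive()` would keep the warning for people at a console while letting scripts run under Rscript or cron go through without anyone pressing 1.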

Didn't realize about the join.

I also need to get probable pitchers for future games.

In the mlbgame Python package, I can do the following (this was run on the morning of the 11th).



In [2]: games = mlb.games(years=[2019], months=[4], days=[11])

In [3]: game = games[0][0]

In [4]: game.game_id
Out[4]: '2019_04_11_miamlb_cinmlb_1'

In [5]: game.p_pitcher_away
Out[5]: 'Sonny Gray'

In [6]: game.p_pitcher_home
Out[6]: 'Pablo Lopez'

keberwein commented 5 years ago

@atroiano I added your name to the contributors in the DESCRIPTION. Feel free to add an email address if you want and do a pull req. Or, if you don't want to, that's OK too.

https://github.com/keberwein/mlbgameday/blob/master/DESCRIPTION

I also updated the version to 0.2.1 and the NEWS file, so it might be a good idea to do a new pull. If you want to get around the large data set warning for now, just install from the stable development branch with devtools::install_github("keberwein/mlbgameday").