keberwein / mlbgameday

Multi-core processing of 'Gameday' data from Major League Baseball Advanced Media. Additional tools to parallelize large data sets and write them to a database.

Umpire IDs #8

Closed aaronbaggett closed 5 years ago

aaronbaggett commented 6 years ago

Hi @keberwein, in which table are the umpire IDs stored? In other words, is the home plate umpire who made the call recorded in df$pitch$des not available in the data? I understand you have a script for updating umpire IDs, but I do not see an umpire ID in any of the tables returned by the get_payload() call.

Hope it's clear what I'm asking.

keberwein commented 6 years ago

If I understand you correctly, you're looking to match a home plate ump to a specific pitch thrown? If so, the short answer is no.

The long answer is, you stumbled onto a half-finished feature. The umpire ids come from a table called players.xml. My idea was/is to gather the home plate umpire for each game_id and then do a join somewhere in the transform process. I'm just not quite there yet.
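
To illustrate the join described above, here is a minimal sketch in base R using toy data. The data frames, their column names (game_id, umpire_id, umpire_name) and their values are all hypothetical and are not the package's actual schema; the pitch table is assumed to carry a game_id column.

```r
# Hypothetical lookup: one row per game with its home plate umpire.
# Column names and values are made up for illustration.
game_umps <- data.frame(
  game_id     = c("gid_A", "gid_B"),
  umpire_id   = c(1001L, 1002L),
  umpire_name = c("Ump One", "Ump Two"),
  stringsAsFactors = FALSE
)

# Hypothetical pitch-level data; `des` stands in for the pitch description.
# Assumes each pitch row carries the game_id it belongs to.
pitches <- data.frame(
  game_id = c("gid_A", "gid_A", "gid_B"),
  des     = c("Called Strike", "Ball", "Called Strike"),
  stringsAsFactors = FALSE
)

# Left join: keep every pitch and attach the umpire for its game.
pitches_with_ump <- merge(pitches, game_umps, by = "game_id", all.x = TRUE)
```

Using all.x = TRUE keeps pitches from games with no umpire record, leaving NA in the umpire columns instead of dropping those rows.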

The package doesn't have an "official" method for gathering these data yet, but if you're looking to get a match on umpire_id and game_id, the code I use to scrape ump_ids could probably be altered to do so. The code for that is here. Sorry if it's a bit messy--it's written as an internal utility ;)

https://github.com/keberwein/mlbgameday/blob/master/data-raw/update_umpires.R

I'm going to keep this open until I can get that feature done. Honestly, it sort of fell off the radar a bit.

keberwein commented 5 years ago

My apologies, this enhancement went in a few months ago and I forgot to post. In the current version 0.2.0, you can get umps with the following:

umps <- mlbgameday::umpire_ids