jldbc / pybaseball

Pull current and historical baseball statistics using Python (Statcast, Baseball Reference, FanGraphs)
MIT License

Identifying team of batter/pitcher in statcast data #286

Closed jrussek closed 1 year ago

jrussek commented 1 year ago

Hi, I'm trying to figure out how to get accurate team data for the batter in a pitch recorded by Statcast. I understand I can use playerid_reverse_lookup to fetch the name of the player, but I can't find a reliable way to get the player's team on the day of the game. Is this missing or am I just not finding it?

tjburch commented 1 year ago

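All Statcast methods return home_team and away_team columns. You can combine these with inning_topbot to create the column you want. For a pitcher, whose team is in the field, that means the home team in the top of an inning and the away team in the bottom. An example: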

In [1]: from pybaseball import statcast_pitcher
   ...: import numpy as np
   ...: import pandas as pd

In [2]: data = statcast_pitcher("2016-04-01", "2017-07-15", player_id=519242)  # Chris Sale
Gathering Player Data

In [3]: data["player_team"] = np.where(data.inning_topbot == "Top", data.home_team, data.away_team)

In [4]: data.groupby("game_year")['player_team'].agg(pd.Series.mode)
Out[4]:
game_year
2016    CWS
2017    BOS
Name: player_team, dtype: object
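
Since the original question was about the batter, the same idea can be applied in the other direction: the batter's team is the away team in the top of an inning and the home team in the bottom. The following is a minimal sketch, not part of pybaseball. The helper name add_team_columns is hypothetical, and the small DataFrame is made-up toy data that only mimics the three Statcast columns used above, so it runs without fetching anything.

```python
import numpy as np
import pandas as pd


def add_team_columns(df: pd.DataFrame) -> pd.DataFrame:
    """Return a copy of df with batter_team and pitcher_team columns.

    Hypothetical helper; assumes df has the Statcast columns
    inning_topbot ("Top"/"Bot"), home_team and away_team.
    """
    out = df.copy()
    top = out["inning_topbot"] == "Top"
    # Top of the inning: the away team bats and the home team pitches.
    out["batter_team"] = np.where(top, out["away_team"], out["home_team"])
    out["pitcher_team"] = np.where(top, out["home_team"], out["away_team"])
    return out


# Toy rows standing in for Statcast output (made up for illustration).
pitches = pd.DataFrame(
    {
        "inning_topbot": ["Top", "Bot"],
        "home_team": ["BOS", "BOS"],
        "away_team": ["NYY", "NYY"],
    }
)

result = add_team_columns(pitches)
print(result[["inning_topbot", "batter_team", "pitcher_team"]])
```

Because the team is derived per pitch from the game itself, it should reflect the club the player was with on that day, including after a mid-season trade.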

tjburch commented 1 year ago

@jrussek - is that sufficient for your use?

jrussek commented 1 year ago

Hi @tjburch, apologies for the silence. Yes, that works wonderfully, thank you!