jldbc / pybaseball

Pull current and historical baseball statistics using Python (Statcast, Baseball Reference, FanGraphs)
MIT License
1.23k stars 330 forks

Pitching / Batting Stats Mix-up #221

Open zach-lewis opened 3 years ago

zach-lewis commented 3 years ago

I've been having issues with batting_stats and pitching_stats returning unexpected data based on date parameters. Originally, pybaseball.pitching_stats(2021) was returning what would be expected from pybaseball.batting_stats(2021) (batting stats). This could originally be resolved by instead utilizing pybaseball.pitching_stats(2020, 2021) and filtering out 2020.

However, now pitching_stats is behaving normally, but batting_stats is returning pitching stats for any years beyond 2018. Any ideas why this is happening?
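The earlier workaround (requesting 2020 through 2021, then dropping 2020) could be sketched roughly as below. This is only an illustration on plain dictionaries, not a real pybaseball result: `keep_season` is a hypothetical helper, and the rows are placeholders standing in for whatever `pitching_stats(2020, 2021)` returns.

```python
def keep_season(rows, season):
    """Return only the rows whose "Season" value equals `season`."""
    return [row for row in rows if row.get("Season") == season]


# Placeholder rows (not real data) standing in for a two-season result.
rows = [
    {"Season": 2020, "Name": "Pitcher A"},
    {"Season": 2021, "Name": "Pitcher A"},
    {"Season": 2021, "Name": "Pitcher B"},
]

# Keep only the season that was actually wanted.
only_2021 = keep_season(rows, 2021)
print(only_2021)
```

With a pandas DataFrame `df`, the equivalent filter would be `df[df["Season"] == 2021]`, assuming the frame carries a `Season` column.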

Doddy-codes commented 2 years ago

Bump... I am having a similar issue...

Batting stats and pitching stats are both returning the same frame

from pybaseball import batting_stats, pitching_stats

df = batting_stats(2017, ind=1)  # get batting stats for season
pitch = pitching_stats(2017, ind=1)
pitch.equals(df)  # outputs True
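
When both calls return the same frame, it may help to report which kind of table actually came back. The sketch below guesses that from the column names alone. The marker sets are an assumption based on the batting and pitching columns that appear in the `head()` outputs elsewhere in this thread, and `classify_stats_columns` is a hypothetical helper, not part of pybaseball.

```python
# Assumed marker columns: names that appear in only one of the two
# head() outputs posted in this thread.
BATTING_MARKERS = {"AB", "PA", "RBI", "xBA", "xSLG", "xwOBA"}
PITCHING_MARKERS = {"ERA", "IP", "TBF", "GS", "SV", "xERA"}


def classify_stats_columns(columns):
    """Guess whether a set of column names looks like batting or pitching."""
    cols = set(columns)
    batting_hits = len(cols & BATTING_MARKERS)
    pitching_hits = len(cols & PITCHING_MARKERS)
    if batting_hits > pitching_hits:
        return "batting"
    if pitching_hits > batting_hits:
        return "pitching"
    return "unknown"


print(classify_stats_columns(["IDfg", "Season", "Name", "AB", "PA", "RBI"]))
print(classify_stats_columns(["IDfg", "Season", "Name", "ERA", "IP", "TBF"]))
```

With real results this could be called as `classify_stats_columns(df.columns)` and `classify_stats_columns(pitch.columns)`; if both report the same kind, that narrows down which call is returning the wrong table.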

tjburch commented 2 years ago

Which version of pybaseball are you using? I wasn't able to reproduce your example with 2.2.1 which is what pip grabs for me.

In [1]: from importlib.metadata import version

In [2]: import pybaseball

In [3]: version('pybaseball')
Out[3]: '2.2.1'

In [4]: batting = pybaseball.batting_stats(2017, ind=1)

In [5]: pitching = pybaseball.pitching_stats(2017, ind=1)

In [6]: batting.equals(pitching)
Out[6]: False

In [7]: batting.head()
Out[7]:
     IDfg  Season               Name Team  Age    G   AB   PA    H   1B  2B  3B  HR    R  RBI   BB  ...  Soft%+  Med%+  Hard%+    EV    LA  Barrels  Barrel%  maxEV  HardHit  HardHit%  Events  CStr%   CSW%    xBA   xSLG  xwOBA
1   15640    2017        Aaron Judge  NYY   25  155  542  678  154   75  24   3  52  128  114  127  ...      61     88     140  94.9  15.8       84    0.249  121.1      186     0.550     338  0.157  0.290  0.281  0.651  0.441
6    5417    2017        Jose Altuve  HOU   27  153  590  662  204  137  39   4  24  112   81   58  ...     103    107      87  86.1   9.7       29    0.057  109.1      143     0.280     511  0.161  0.233  0.271  0.445  0.340
9   15429    2017        Kris Bryant  CHC   25  151  549  665  162   91  38   4  29  111   73   95  ...      80    106     102  87.0  17.0       38    0.089  113.0      158     0.370     427  0.138  0.239  0.248  0.478  0.364
4    4949    2017  Giancarlo Stanton  MIA   27  159  597  692  168   77  32   0  59  123  132   85  ...     112     82     121  91.9  11.1       74    0.169  122.2      199     0.455     437  0.155  0.282  0.272  0.592  0.396
11  13510    2017       Jose Ramirez  CLE   24  152  585  645  186   95  56   6  29  107   83   52  ...      90    100     105  88.2  15.0       24    0.046  108.3      180     0.345     521  0.177  0.231  0.294  0.478  0.353

[5 rows x 319 columns]

In [8]: pitching.head()
Out[8]:
    IDfg  Season               Name Team  Age   W  L  WAR   ERA   G  GS  CG  ShO  SV  BS     IP  TBF  ...  Pull%+  Cent%+  Oppo%+  Soft%+  Med%+  Hard%+    EV    LA  Barrels  Barrel%  maxEV  HardHit  HardHit%  Events  CStr%   CSW%  xERA
5  10603    2017         Chris Sale  BOS   28  17  8  7.6  2.90  32  32   1    0   0   0  214.1  851  ...      95     102     105     100    105      92  86.8  15.0       27    0.055  112.7      149     0.303     492  0.183  0.332  2.49
0   2429    2017       Corey Kluber  CLE   31  18  4  7.2  2.25  29  29   5    3   0   0  203.2  777  ...      96     104     100     132     95      90  85.3  10.5       25    0.053  111.6      137     0.291     471  0.190  0.346  2.47
2   3137    2017       Max Scherzer  WSN   32  16  6  6.4  2.51  31  31   2    0   0   0  200.2  780  ...     108      93      97     100    110      84  86.5  19.1       23    0.052  112.0      130     0.291     446  0.159  0.314  2.43
3  10131    2017  Stephen Strasburg  WSN   28  15  4  5.9  2.52  28  28   1    1   0   0  175.1  701  ...      89     108     105     110    104      87  87.3  10.8       24    0.054  113.5      138     0.312     443  0.175  0.305  2.66
7  15890    2017      Luis Severino  NYY   23  14  6  5.6  2.98  31  31   0    0   0   0  193.1  783  ...     101     105      91     105    106      89  87.2   8.0       30    0.060  111.4      165     0.333     496  0.174  0.304  2.96

[5 rows x 334 columns]

Doddy-codes commented 2 years ago

Hi Tyler,

I have version 2.2.1 as well… perhaps my local environment is messed up (although I can’t see how). Even after reinstalling, I get the following:

input:

from importlib.metadata import version
from pybaseball import batting_stats, pitching_stats

df = batting_stats(2017, ind=1)
pitch = pitching_stats(2017, ind=1)

print(version('pybaseball'))
print(pitch.equals(df))

output:

2.2.1
True

Any more information I can provide?

BrayanMnz commented 1 year ago

I think this can be closed now that the two PRs have been merged. @tjburch