jldbc / pybaseball

Pull current and historical baseball statistics using Python (Statcast, Baseball Reference, FanGraphs)
MIT License

pitching_stats and batting_stats not returning all data #428

Open quansta opened 3 months ago

quansta commented 3 months ago

When I call pitching_stats and batting_stats, the full set of players is not returned.

bdilday commented 3 months ago

You may need to set qual=0. See https://github.com/jldbc/pybaseball/issues/410#issuecomment-2067572318

quansta commented 3 months ago

I'm trying to get all 2023 stats for batters with the call below, which explicitly sets some arguments that should already be the defaults, including qual. It only returns about 150 rows.

pb.batting_stats(2023, end_season=None, league='all', qual=None)

bdilday commented 3 months ago

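I think None is not the same thing as 0 in this context. With qual=None you only seem to get qualified batters (note the minimum PA of 502 in the first output), so you have to set qual=0 to get everyone. For example: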

In [10]: batting_stats(2023, end_season=None, league='all', qual=None).loc[:, ["PA"]].describe()
Out[10]: 
               PA
count  134.000000
mean   609.395522
std     64.924829
min    502.000000
25%    554.000000
50%    611.000000
75%    660.750000
max    753.000000

In [11]: batting_stats(2023, end_season=None, league='all', qual=0).loc[:, ["PA"]].describe()
Out[11]: 
                PA
count  1457.000000
mean    126.358270
std     201.427445
min       0.000000
25%       0.000000
50%       0.000000
75%     197.000000
max     753.000000
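
If you want "everyone" to be the default in your own code, one option is a small wrapper that turns a missing or None qual into 0 before calling the stats function. The sketch below is only an illustration: fetch_all_players is a hypothetical helper, not part of pybaseball, and fake_batting_stats is a stand-in with made-up rows so the example runs without network access. Its 502 PA cutoff simply mimics the minimum seen in the qual=None output above and is an assumption about how the filter behaves.

```python
from typing import Any, Callable


def fetch_all_players(stats_fn: Callable[..., Any], season: int, **kwargs: Any) -> Any:
    """Call a pybaseball-style stats function with no playing-time filter.

    Hypothetical helper, not part of pybaseball. If qual is missing or None,
    it is replaced with 0; an explicit integer threshold is passed through.
    """
    if kwargs.get("qual") is None:
        kwargs["qual"] = 0
    return stats_fn(season, **kwargs)


def fake_batting_stats(season, end_season=None, league="all", qual=None):
    """Offline stand-in for batting_stats, returning (player, PA) tuples.

    Assumption for the demo: qual=None behaves like a 502 PA minimum,
    matching the smallest PA value in the thread's qual=None output.
    """
    rows = [("A", 650), ("B", 510), ("C", 120), ("D", 0)]  # made-up data
    threshold = 502 if qual is None else qual
    return [row for row in rows if row[1] >= threshold]


if __name__ == "__main__":
    # Default call: only the "qualified" rows come back.
    print(len(fake_batting_stats(2023)))
    # Through the wrapper: every row comes back.
    print(len(fetch_all_players(fake_batting_stats, 2023)))
```

With pybaseball installed, the same wrapper could presumably be used as fetch_all_players(batting_stats, 2023) or fetch_all_players(pitching_stats, 2023), since both accept a qual keyword according to this thread.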