panzarino / mlbgame

A Python API to retrieve and read MLB GameDay data
http://panz.io/mlbgame/
MIT License
529 stars 112 forks

What does the 'additional' player stats show that the regular ones don't? #89

Open bbennett36 opened 6 years ago

bbennett36 commented 6 years ago

The player_stats module has the following stats available: home_batting, home_additional_batting, home_pitching, home_additional_pitching, away_batting, away_additional_batting, away_pitching, away_additional_pitching.

It looks like some of the stats returned by the 'additional' methods are the same ones returned by the regular stats methods. I need to know which ones are the same and which ones are different.

Example -

away_pitching_bb                                     2
away_pitching_bf                                    37
away_pitching_bs                                     0
away_pitching_er                                     0
away_pitching_era                                 10.8
away_pitching_game_score                           245
away_pitching_h                                      6
away_pitching_hld                                    1
away_pitching_hr                                     0
away_pitching_id                               2589025
away_pitching_l                                      0
away_pitching_np                                   155
away_pitching_out                                   27
away_pitching_r                                      0
away_pitching_s                                    103
away_pitching_s_bb                                   3
away_pitching_s_er                                   2
away_pitching_s_h                                    8
away_pitching_s_ip                                10.5
away_pitching_s_r                                    2
away_pitching_s_so                                   7
away_pitching_so                                     7
away_pitching_sv                                     0
away_pitching_w                                      1
game_id                     2018_04_02_clemlb_anamlb_1
away_addtl_pitching_ao                                      5
away_addtl_pitching_bam_bs                                  0
away_addtl_pitching_bam_era                              10.8
away_addtl_pitching_bam_hld                                 1
away_addtl_pitching_bam_l                                   0
away_addtl_pitching_bam_s_bb                                3
away_addtl_pitching_bam_s_er                                2
away_addtl_pitching_bam_s_h                                 8
away_addtl_pitching_bam_s_ip                             10.5
away_addtl_pitching_bam_s_r                                 2
away_addtl_pitching_bam_s_so                                7
away_addtl_pitching_bam_sv                                  0
away_addtl_pitching_bam_w                                   1
away_addtl_pitching_bb                                      2
away_addtl_pitching_bf                                     37
away_addtl_pitching_bis_bs                                  0
away_addtl_pitching_bis_era                              10.8
away_addtl_pitching_bis_hld                                 1
away_addtl_pitching_bis_l                                   0
away_addtl_pitching_bis_s_bb                                3
away_addtl_pitching_bis_s_er                                2
away_addtl_pitching_bis_s_h                                 8
away_addtl_pitching_bis_s_ip                             10.5
away_addtl_pitching_bis_s_r                                 2
away_addtl_pitching_bis_s_so                                7
away_addtl_pitching_bis_sv                                  0
away_addtl_pitching_bis_w                                   1
away_addtl_pitching_bk                                      0
away_addtl_pitching_er                                      0
away_addtl_pitching_game_score                            245
away_addtl_pitching_go                                      8
away_addtl_pitching_h                                       6
away_addtl_pitching_hr                                      0
away_addtl_pitching_id                                2589025
away_addtl_pitching_ir                                      1
away_addtl_pitching_ira                                     0
away_addtl_pitching_np                                    155
away_addtl_pitching_out                                    27
away_addtl_pitching_pitch_order                           510
away_addtl_pitching_r                                       0
away_addtl_pitching_s                                     103
away_addtl_pitching_so                                      7
game_id                            2018_04_02_clemlb_anamlb_1

For example, in the pitching stats, I'm not sure what the 'bam' and 'bis' prefixes indicate, because those fields hold the same values as the unprefixed fields.

Where can I find definitions for the fields that come from the additional methods? I'm looking to create my own YTD stats for each player and need to make sure they are as accurate as possible. I don't see any distinct pattern that would let me make a safe decision.

Thanks!

trevor-viljoen commented 6 years ago

As part of my #85 PR, I'm cleaning up some Code Climate issues. One of those issues is redundant code. As a fix for that, I'm going to combine the "additional" stats with the standard stats object. As for which stats are available for each object:

  1. basic stats come from boxscore.xml

  2. additional stats come from rawboxscore.xml

I've also got a PR in the works for pulling season stats without having to traverse every game of the season. Compiling them game by game puts a high load on the system because of the hundreds of sequential HTTP requests needed to pull that data.

bbennett36 commented 6 years ago

@trevor-viljoen

So if a stat is in both objects (for example, away_pitching_s_bb and away_addtl_pitching_bam_s_bb), do you know if they will ALWAYS contain the same value for that statistic?

If that's the case it will be easy for me to just merge and de-dupe columns. I was just worried that this will not always be the case.
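One way to check this before merging is sketched below. It is only an illustration: `compare_stats` is a hypothetical helper, not part of mlbgame, and it assumes each stats object has already been flattened into a plain dict keyed by the stat name with the `away_pitching_` / `away_addtl_pitching_` part removed. The sample values are copied from the example output above.

```python
# Hypothetical helper, not part of mlbgame: compare regular stats with the
# 'additional' stats after stripping the bam_/bis_ prefixes.

def compare_stats(regular, additional, source_prefixes=("bam_", "bis_")):
    """Return (mismatches, only_in_additional).

    mismatches maps an additional-stat key to (regular value, additional value).
    only_in_additional lists keys with no counterpart in the regular stats.
    """
    mismatches = {}
    only_in_additional = []
    for key, value in additional.items():
        # Map e.g. 'bam_s_bb' and 'bis_s_bb' back to the regular name 's_bb'.
        base = key
        for prefix in source_prefixes:
            if key.startswith(prefix):
                base = key[len(prefix):]
                break
        if base not in regular:
            only_in_additional.append(key)
        elif regular[base] != value:
            mismatches[key] = (regular[base], value)
    return mismatches, sorted(only_in_additional)


# A few values copied from the example output in this issue.
regular = {"bb": 2, "er": 0, "era": 10.8, "s_bb": 3, "so": 7, "w": 1}
additional = {
    "ao": 5, "go": 8, "bb": 2, "er": 0, "so": 7,
    "bam_era": 10.8, "bis_era": 10.8,
    "bam_s_bb": 3, "bis_s_bb": 3,
    "bam_w": 1, "bis_w": 1,
}

mismatches, extra = compare_stats(regular, additional)
print(mismatches)  # fields whose values disagree
print(extra)       # fields only the additional stats provide
```

For this sample, no field disagrees and only `ao` and `go` are unique to the additional stats. Running the same check over many games would show whether the duplicates can be dropped safely.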

trevor-viljoen commented 6 years ago

@bbennett36 I believe they will always be the same. I'm not sure what MLB's reasoning is for having a boxscore and a rawboxscore containing lots of duplicate stats. One note of caution with the s_* stats: they are "season" stats, but "season" may be a given round of the playoffs. For instance, I believe in the World Series example above the s_* stats are only for the WS and do not include other rounds of the playoffs. They definitely don't include the regular season.

bbennett36 commented 6 years ago

@trevor-viljoen Sounds good, maybe I'll run some tests to double-check. Yeah, I did see the 's_' stats in other closed issues regarding season-to-date stats, but I don't really trust them, which is why I was going to calculate them myself.

I'll keep you updated on my findings since you are pretty much working on the same problems. I might be able to help you with the PR updates.
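A minimal sketch of calculating season-to-date pitching stats from per-game lines might look like the following. The stat names (`id`, `out`, `er`, `h`, `bb`, `so`) match the fields in the example output above, but the list-of-dicts input, the helper names, and the sample numbers are assumptions made for illustration, not mlbgame's API or real data. Rate stats such as ERA are derived from the summed counting stats rather than averaged per game.

```python
from collections import defaultdict

# Counting stats that can simply be summed across games.
COUNTING_STATS = ("out", "er", "h", "bb", "so")


def season_totals(game_lines):
    """Sum per-game pitching lines into one totals dict per player id."""
    totals = defaultdict(lambda: dict.fromkeys(COUNTING_STATS, 0))
    for line in game_lines:
        player = totals[line["id"]]
        for stat in COUNTING_STATS:
            player[stat] += line[stat]
    return dict(totals)


def era(totals):
    """ERA = 9 * earned runs / innings pitched, with innings = outs / 3."""
    if totals["out"] == 0:
        return None  # undefined until the pitcher records an out
    return 27 * totals["er"] / totals["out"]


# Hypothetical per-game lines for one pitcher (made-up id and numbers).
game_lines = [
    {"id": 1, "out": 18, "er": 2, "h": 5, "bb": 1, "so": 6},
    {"id": 1, "out": 21, "er": 1, "h": 4, "bb": 2, "so": 8},
]

totals = season_totals(game_lines)
print(totals[1])
print(era(totals[1]))
```

Working from outs instead of the displayed innings value avoids the ambiguity of baseball's ".1" and ".2" innings notation when summing across games.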