BillPetti / baseballr

A package written for R focused on baseball analysis. Currently in development.
billpetti.github.io/baseballr

statcast_search has broken due to batspeed and swing length being added #337

Open mlascaleia opened 1 month ago

mlascaleia commented 1 month ago

The tibbles exported from statcast used to have 92 columns; now they have 94!

I foresee this being a continuous error as more and more stats are added. Here is my suggested fix to turn this into something that throws a warning instead of breaking the package:

# (somewhere within the statcast_search function before the payload is searched for)
colos <- c("pitch_type", "game_date", 
            "release_speed", "release_pos_x", "release_pos_z", 
            "player_name", "batter", "pitcher", 
            "events", "description", "spin_dir", 
            "spin_rate_deprecated", "break_angle_deprecated", 
            "break_length_deprecated", "zone", "des", 
            "game_type", "stand", "p_throws", 
            "home_team", "away_team", "type", 
            "hit_location", "bb_type", "balls", 
            "strikes", "game_year", "pfx_x", 
            "pfx_z", "plate_x", "plate_z", 
            "on_3b", "on_2b", "on_1b", "outs_when_up", 
            "inning", "inning_topbot", "hc_x", 
            "hc_y", "tfs_deprecated", "tfs_zulu_deprecated", 
            "fielder_2", "umpire", "sv_id", 
            "vx0", "vy0", "vz0", "ax", 
            "ay", "az", "sz_top", "sz_bot", 
            "hit_distance_sc", "launch_speed", "launch_angle", 
            "effective_speed", "release_spin_rate", 
            "release_extension", "game_pk", "pitcher_1", 
            "fielder_2_1", "fielder_3", "fielder_4", 
            "fielder_5", "fielder_6", "fielder_7", 
            "fielder_8", "fielder_9", "release_pos_y", 
            "estimated_ba_using_speedangle", "estimated_woba_using_speedangle", 
            "woba_value", "woba_denom", "babip_value", 
            "iso_value", "launch_speed_angle", "at_bat_number", 
            "pitch_number", "pitch_name", "home_score", 
            "away_score", "bat_score", "fld_score", 
            "post_away_score", "post_home_score", 
            "post_bat_score", "post_fld_score", "if_fielding_alignment", 
            "of_fielding_alignment", "spin_axis", 
            "delta_home_win_exp", "delta_run_exp")
# payload is acquired somewhere in here
colNumber <- ncol(payload)
if (colNumber > length(colos)) {
  # pad with placeholder names for the columns beyond the known ones;
  # seq_len() avoids the backwards sequence that 1:(negative number) would produce
  newCols <- paste0("newStat", seq_len(colNumber - length(colos)))
  colos <- c(colos, newCols)
  warning("New stats detected! baseballr will be updated soon to properly identify these stats")
}
# when the payload columns need to be named:
names(payload) <- colos

This way the function will still work when new stats are added, and their names can be filled in whenever you update the package.
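The padding idea above can be sketched as a standalone helper and tried on a toy data frame. This is only an illustration: `name_payload_columns` is a hypothetical name, not a baseballr function, and it assumes any new columns arrive after the known ones.

```r
# Hypothetical helper (not part of baseballr): name a payload's columns,
# padding with placeholder names when more columns arrive than are known.
# Assumption: unknown columns are appended after the known ones.
name_payload_columns <- function(payload, known_names) {
  n_cols <- ncol(payload)
  n_known <- length(known_names)
  if (n_cols > n_known) {
    extra <- paste0("newStat", seq_len(n_cols - n_known))
    warning(length(extra), " unrecognized column(s) found; using placeholder names")
    known_names <- c(known_names, extra)
  } else if (n_cols < n_known) {
    # Fewer columns than expected cannot be padded, so fail loudly
    stop("payload has fewer columns (", n_cols, ") than expected (", n_known, ")")
  }
  names(payload) <- known_names
  payload
}

# Toy example: three known names, five columns returned
known <- c("pitch_type", "game_date", "release_speed")
toy <- as.data.frame(matrix(0, nrow = 2, ncol = 5))
named <- name_payload_columns(toy, known)
names(named)
# "pitch_type" "game_date" "release_speed" "newStat1" "newStat2"
```

The call emits a warning rather than an error, so downstream code keeps running while the placeholder names flag which columns still need proper labels.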

mattm14 commented 1 month ago

It also fails on this function: scrape_statcast_savant_pitcher. Is there a workaround that can be applied, similar to the above?

camdenk commented 1 month ago

Download the dev version with devtools::install_github("BillPetti/baseballr") and this should be fixed!

mlascaleia commented 1 month ago

Thanks for updating! I do want to note that, with the fix that was implemented, the code will still break in the same way if the statcast tibbles ever stop having exactly 94 columns.

camdenk commented 1 month ago

Yep! We're going to add a more permanent fix, but wanted to get the hotfix out asap once the switch was made.

Thanks!

mattm14 commented 1 month ago

thanks for the update!

aglobos49 commented 1 month ago

I reinstalled and am still getting the same column number error. I even did force = TRUE to make sure I got the newest version. Anything else I can try?

kylemccarthy2 commented 1 month ago

> I reinstalled and am still getting the same column number error. I even did force = TRUE to make sure I got the newest version. Anything else I can try?

I am having the same issue. Would appreciate any possible help!

camdenk commented 1 month ago

Did you install with install.packages("baseballr") or devtools::install_github("BillPetti/baseballr")?
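One way to check which install is actually being loaded is to read the package's DESCRIPTION metadata. The sketch below rests on an assumption: installs made through devtools/remotes usually record a `RemoteType` field (for example "github"), while plain `install.packages()` installs usually do not. `install_source` is a hypothetical helper name, not something baseballr provides.

```r
# Hypothetical helper: report where an installed package appears to come from.
install_source <- function(pkg) {
  # packageDescription() returns NA (with a warning) when pkg is not installed
  desc <- suppressWarnings(utils::packageDescription(pkg))
  if (!is.list(desc)) {
    return("not installed")
  }
  remote <- desc$RemoteType
  if (is.null(remote)) "no remote metadata (likely install.packages)" else remote
}

install_source("baseballr")
# The version of the installed copy is also worth comparing after restarting R:
# utils::packageVersion("baseballr")
```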

kylemccarthy2 commented 1 month ago

I used devtools::install_github("BillPetti/baseballr"). Then, to load the library, it is library(baseballr), correct?

camdenk commented 1 month ago

Yep! Did you restart your R session between installing and then using the package?

kylemccarthy2 commented 3 weeks ago

I believe I got it, thank you so much!

SFall34 commented 6 days ago

Thanks so much for sharing this. Can you please explain how I would work this fix into the following line of code: Season_Data <- scrape_statcast_savant_batter_all(start_date = "2023-09-27", end_date = "2023-10-01")

When I run the next line, colNumber <- ncol(payload), I get the following error: Error in ncol(payload) : object 'payload' not found
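The "object 'payload' not found" error is consistent with payload being a variable that exists only inside the package's function while it runs, as the comments in the suggested fix imply. If so, the snippet is a patch to the function's source, not something to run line by line in the console next to a scrape_statcast_savant_batter_all() call. A toy illustration of that scoping, where `fetch_demo` is a made-up stand-in and not baseballr code:

```r
# Made-up stand-in for a scraping function: payload is created inside it
fetch_demo <- function() {
  payload <- data.frame(a = 1:2, b = 3:4)  # local to this call
  ncol(payload)                            # works here, inside the function
}

n <- fetch_demo()

# At top level, a local variable of fetch_demo() is out of scope; ncol() on it
# would raise "object ... not found", so tryCatch() records the failure here
# (this assumes no object named payload_inside_function exists in the session).
top_level_ok <- tryCatch({
  ncol(payload_inside_function)
  TRUE
}, error = function(e) FALSE)
```

In practice that points back to the advice earlier in the thread: install the development version and restart R, then call the scraping function as before.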