BillPetti / baseballr

A package written for R focused on baseball analysis. Currently in development.
billpetti.github.io/baseballr

Statcast-Baseballr Potential discrepancy in Barrels values between Savantbaseball web SEARCH and Baseballr scraped data #316

Closed t75rsj1 closed 8 months ago

t75rsj1 commented 9 months ago

Hi Baseballr Support -

I am a Statcast/BaseballR newbie, and I recently noticed a discrepancy in BARRELS counts for selected batters in the 2023 regular season. The counts reported by the SavantBaseball online web SEARCH tool differ from the counts in data that I scraped to my own database using the V3.0 'annual_statcast_query' code described at https://billpetti.github.io/2021-04-02-build-statcast-database-rstats-version-3.0/.

In the V3.0 annual_statcast_query R code, the following statement assigns a value of 0 or 1 to the barrel column for each pitch to a batter:

dplyr::mutate(barrel = ifelse(launch_angle <= 50 & launch_speed >= 98 & launch_speed * 1.5 - launch_angle >= 117 & launch_speed + launch_angle >= 124, 1, 0))

However, based on the results given by the SavantBaseball online web SEARCH tool, I very humbly believe this statement should instead use launch_speed >= 97 (rather than 98) and launch_speed + launch_angle >= 123 (rather than 124):

dplyr::mutate(barrel = ifelse(launch_angle <= 50 & launch_speed >= 97 & launch_speed * 1.5 - launch_angle >= 117 & launch_speed + launch_angle >= 123, 1, 0))
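To make the difference between the two statements concrete, here is a minimal base R sketch that applies both rules to a small grid of launch speeds and angles and lists the combinations where they disagree. The function names `barrel_v3` and `barrel_proposed` and the integer grid are hypothetical, chosen only for illustration. Neither rule is claimed here to be the official Statcast barrel definition.

```r
# Rule as quoted from the V3.0 annual_statcast_query code (thresholds 98 and 124).
barrel_v3 <- function(launch_speed, launch_angle) {
  ifelse(launch_angle <= 50 & launch_speed >= 98 &
           launch_speed * 1.5 - launch_angle >= 117 &
           launch_speed + launch_angle >= 124, 1, 0)
}

# Rule proposed in this issue (thresholds 97 and 123).
barrel_proposed <- function(launch_speed, launch_angle) {
  ifelse(launch_angle <= 50 & launch_speed >= 97 &
           launch_speed * 1.5 - launch_angle >= 117 &
           launch_speed + launch_angle >= 123, 1, 0)
}

# Hypothetical grid of batted-ball values near the thresholds.
grid <- expand.grid(launch_speed = 95:100, launch_angle = 20:35)
grid$v3 <- barrel_v3(grid$launch_speed, grid$launch_angle)
grid$proposed <- barrel_proposed(grid$launch_speed, grid$launch_angle)

# Rows where the two rules give different barrel flags.
disagree <- grid[grid$v3 != grid$proposed, ]
print(disagree)
```

Because the proposed thresholds are looser, every batted ball flagged by the V3.0 rule is also flagged by the proposed rule. The rows in `disagree` are those with a launch speed of at least 97 but below 98, or with launch_speed + launch_angle of at least 123 but below 124, that also meet the other conditions.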

Attached to this Issue submission is a PDF report that shows the details of this discrepancy for 3 selected players (not for the entire roster of 2023 players), using data from the SavantBaseball online web SEARCH tool and then query results from data I scraped to a database using the V3.0 annual_statcast_query R code.

New Issue Statcast-Baseballr-potential-discrepency-in-BARRELS-values.pdf
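A per-batter comparison of this kind could be sketched in base R as follows. This is not the query used for the attached report. The `pitches` data frame is made-up toy data, the helper `count_barrels` is hypothetical, and the `batter` column name is an assumption about the scraped table.

```r
# Hypothetical helper: count barrels per batter for a given pair of thresholds.
count_barrels <- function(df, min_speed, min_sum) {
  is_barrel <- df$launch_angle <= 50 & df$launch_speed >= min_speed &
    df$launch_speed * 1.5 - df$launch_angle >= 117 &
    df$launch_speed + df$launch_angle >= min_sum
  # Rows with missing launch data (no batted ball) count as non-barrels.
  is_barrel[is.na(is_barrel)] <- FALSE
  tapply(is_barrel, df$batter, sum)
}

# Toy data for illustration only; these are not real Statcast rows.
pitches <- data.frame(
  batter = c("A", "A", "A", "B", "B"),
  launch_speed = c(98, 105, NA, 97, 90),
  launch_angle = c(25, 28, NA, 27, 30)
)

v3 <- count_barrels(pitches, min_speed = 98, min_sum = 124)
proposed <- count_barrels(pitches, min_speed = 97, min_sum = 123)

# One row per batter, with the barrel count under each rule.
counts <- data.frame(
  batter = names(v3),
  v3 = as.vector(v3),
  proposed = as.vector(proposed)
)
print(counts)
```

Running both counts against the same scraped table and comparing each column with the web SEARCH totals would show which pair of thresholds matches more closely for a given player.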

Thanks. Robert S Johnson